Publications

* indicates equal contribution.

2024

  1. OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs
    Akari Asai, Jacqueline He, Rulin Shao, Weijia Shi, Amanpreet Singh, and 20 more authors
    arXiv, 2024
  2. Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages
    Xiang Yue*, Yueqi Song*, Akari Asai, Seungone Kim, Jean Dieu Nyandwi, and 5 more authors
    Preprint, 2024
  3. Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
    Rulin Shao, Jacqueline He, Akari Asai, Weijia Shi, Tim Dettmers, and 3 more authors
    In Advances in Neural Information Processing Systems (NeurIPS), 2024
  4. CodeRAG-Bench: Can Retrieval Augment Code Generation?
    Zora Zhiruo Wang*, Akari Asai*, Xinyan Velocity Yu, Frank F Xu, Yiqing Xie, and 2 more authors
    Preprint, 2024
  5. CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation
    Tong Chen, Akari Asai*, Niloofar Mireshghallah*, Sewon Min, James Grimmelmann, and 4 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
  6. Fine-grained Hallucination Detection and Editing for Language Models
    Abhika Mishra, Akari Asai, Yizhong Wang, Vidhisha Balachandran, Graham Neubig, and 2 more authors
    In Conference on Language Modeling (COLM), 2024
  7. BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer
    Akari Asai, Sneha Kudugunta, Xinyan Velocity Yu, Terra Blevins, Hila Gonen, and 4 more authors
    In 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL; Oral), 2024
  8. Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
    Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi
    In The Twelfth International Conference on Learning Representations (ICLR; Oral, Top 1%), 2024
  9. Reliable, Adaptable, and Attributable Language Models with Retrieval
    Akari Asai, Zexuan Zhong, Danqi Chen, Pang Wei Koh, Luke Zettlemoyer, and 2 more authors
    arXiv preprint, 2024

2023

  1. RealTime QA: What’s the Answer Right Now?
    Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Ronan Le Bras, Akari Asai, and 5 more authors
    In Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2023
  2. TaskWeb: Selecting Better Source Tasks for Multi-task NLP
    Joongwon Kim, Akari Asai, Gabriel Ilharco, and Hannaneh Hajishirzi
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
  3. How to Train Your Dragon: Diverse Augmentation Towards Generalizable Dense Retrieval
    Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, and 3 more authors
    In Findings of the Association for Computational Linguistics: EMNLP, 2023
  4. Cross-lingual Open-Retrieval Question Answering for African Languages
    Odunayo Ogundepo, Tajuddeen Gwadabe, Clara Rivera, Jonathan Clark, Sebastian Ruder, and 39 more authors
    In Findings of the Association for Computational Linguistics: EMNLP 2023 (Findings Spotlight), 2023
  5. xPQA: Cross-Lingual Product Question Answering in 12 Languages
    Xiaoyu Shen, Akari Asai, Bill Byrne, and Adria De Gispert
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL; Industry Track), 2023
  6. When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
    Alex Mallen*, Akari Asai*, Victor Zhong, Rajarshi Das, Daniel Khashabi, and 1 more author
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL; Oral, Best Video Paper Award – Most Viewed), 2023
  7. Retrieval-based Language Models and Applications
    Akari Asai, Sewon Min, Zexuan Zhong, and Danqi Chen
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Tutorial), 2023
  8. Task-aware Retrieval with Instructions
    Akari Asai, Timo Schick, Patrick Lewis, Xilun Chen, Gautier Izacard, and 3 more authors
    In Findings of the Association for Computational Linguistics: ACL 2023 (Findings Spotlight), 2023

2022

  1. Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks
    Akari Asai, Matt Gardner, and Hannaneh Hajishirzi
    In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL; Oral), 2022
  2. ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
    Akari Asai, Mohammadreza Salehi, Matthew Peters, and Hannaneh Hajishirzi
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
  3. Beyond Counting Datasets: A Survey of Multilingual Dataset Construction and Necessary Resources
    Xinyan Yu*, Akari Asai*, Trina Chatterjee, Junjie Hu, and Eunsol Choi
    In Findings of the Association for Computational Linguistics: EMNLP, 2022
  4. Proceedings of the Workshop on Multilingual Information Access (MIA)
    2022
  5. MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question Answering for 16 Diverse Languages
    Akari Asai, Shayne Longpre, Jungo Kasai, Chia-Hsuan Lee, Rui Zhang, and 4 more authors
    In Proceedings of the Workshop on Multilingual Information Access (MIA), 2022

2021

  1. One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval
    Akari Asai, Xinyan Yu, Jungo Kasai, and Hanna Hajishirzi
    In Advances in Neural Information Processing Systems (NeurIPS), 2021
  2. Challenges in Information-Seeking QA: Unanswerable Questions and Paragraph Retrieval
    Akari Asai and Eunsol Choi
    In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL), 2021
  3. Efficient Passage Retrieval with Hashing for Open-domain Question Answering
    Ikuya Yamada, Akari Asai, and Hannaneh Hajishirzi
    In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL short), 2021
  4. XOR QA: Cross-lingual Open-Retrieval Question Answering
    Akari Asai, Jungo Kasai, Jonathan Clark, Kenton Lee, Eunsol Choi, and 1 more author
    In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL; Oral), 2021

2020

  1. Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia
    Ikuya Yamada, Akari Asai, Jin Sakuma, Hiroyuki Shindo, Hideaki Takeda, and 2 more authors
    In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP System Demonstrations), 2020
  2. Logic-Guided Data Augmentation and Regularization for Consistent Question Answering
    Akari Asai and Hannaneh Hajishirzi
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), 2020
  3. LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
    Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda, and Yuji Matsumoto
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
  4. Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering
    Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, and Caiming Xiong
    In International Conference on Learning Representations (ICLR), 2020

2018

  1. HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
    Akari Asai, Sara Evensen, Behzad Golshan, Alon Halevy, Vivian Li, and 5 more authors
    In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018