Akari Asai

Research Scientist @ Allen Institute for AI
Incoming Assistant Professor @ Carnegie Mellon University


I am an incoming Assistant Professor at Carnegie Mellon University (Fall 2026-), affiliated with the Language Technologies Institute and (by courtesy) the Machine Learning Department, and a Research Scientist at the Allen Institute for AI (2025-2026).

I completed my Ph.D. in NLP at the Paul G. Allen School of Computer Science & Engineering, University of Washington, where I was fortunate to be advised by Prof. Hannaneh Hajishirzi. I also spent time at Meta AI Research as a visiting student researcher, under the supervision of Dr. Wen-tau Yih. Prior to joining UW, I obtained a B.E. in Electrical Engineering and Computer Science from The University of Tokyo, Japan.

My research focuses on natural language processing and machine learning, with a particular emphasis on large language models (LLMs). I investigate the core limitations of LLMs, such as hallucinations, that cannot be overcome by scaling alone. To address these challenges, my Ph.D. pioneered Retrieval-Augmented LMs, a novel class of LLMs that integrate large-scale text data via retrieval during inference. My Ph.D. thesis is available: thesis (PDF), video (YouTube). In summary, my Ph.D. focused on

My work has received multiple paper awards at venues such as ACL and a NeurIPS workshop, and has been featured in major media outlets such as Forbes and MIT Technology Review. I’m honored to have been named to the Forbes 30 Under 30 Asia list in Science, the MIT Technology Review Innovators Under 35 from Japan (2024), EECS Rising Stars (2022), and the IBM Global Ph.D. Fellows (2022-2023). My work is now integrated into major libraries such as Hugging Face, LlamaIndex, and LangChain, and used in multiple real-world systems, such as COVID-19 Research Search. Most recently, we released the Ai2 OpenScholar Public Demo, which has helped more than 30k scientists across disciplines synthesize scientific literature more effectively and efficiently.

Public office hours and application materials:

To help lower barriers to starting research, pursuing a Ph.D. in this field, or searching for a job, I host weekly office hours every Friday, open to all. Feel free to sign up via Google Calendar.

Inspired by many wonderful friends who have shared their own materials to promote equity and access, I’ve also made my past application materials available:

news

Jul 15, 2025 I’ll be joining Carnegie Mellon University as an Assistant Professor in Fall 2026, affiliated with the Language Technologies Institute and (by courtesy) the Machine Learning Department! From July 2025 to August 2026, I’ll be a Research Scientist at the Allen Institute for AI!
Jun 15, 2025 I’ve completed my Ph.D.! My Ph.D. thesis is available here and you can watch the video of my defense on YouTube.
May 30, 2025 I’m organizing the COLM 2025 Workshop on LLMs for Science as well as the NeurIPS 2025 Competition on Retrieval-Augmented Generation in the Real World. Stay tuned for more updates - we’d love to have you involved!
May 15, 2025 Honored to be named to the Forbes 30 Under 30 Asia 2025 in Science!
May 02, 2025 I gave invited talks at the NAACL RepL4NLP workshop and the Foundation Models for Science Workshop at the Flatiron Institute.

selected publications

See the full list on my publications page!

  1. OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs
    Akari Asai, Jacqueline He, Rulin Shao, Weijia Shi, Amanpreet Singh, and 20 more authors
    Preprint, 2024
  2. Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
    Rulin Shao, Jacqueline He, Akari Asai, Weijia Shi, Tim Dettmers, and 3 more authors
    In Advances in Neural Information Processing Systems (NeurIPS), 2024
  3. Fine-grained Hallucination Detection and Editing for Language Models
    Abhika Mishra, Akari Asai, Yizhong Wang, Vidhisha Balachandran, Graham Neubig, and 2 more authors
    In Conference on Language Modeling (COLM), 2024
  4. Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
    Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi
    In The Twelfth International Conference on Learning Representations (ICLR; Oral, Top 1%), 2024
  5. Reliable, Adaptable, and Attributable Language Models with Retrieval
    Akari Asai, Zexuan Zhong, Danqi Chen, Pang Wei Koh, Luke Zettlemoyer, and 2 more authors
    arXiv preprint, 2024
  6. When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
    Alex Mallen*, Akari Asai*, Victor Zhong, Rajarshi Das, Daniel Khashabi, and 1 more author
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL; Oral, Best Video Paper Award – Most Viewed), 2023
  7. Task-aware Retrieval with Instructions
    Akari Asai, Timo Schick, Patrick Lewis, Xilun Chen, Gautier Izacard, and 3 more authors
    In Findings of the Association for Computational Linguistics: ACL 2023 (Findings Spotlight), 2023
  8. Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks
    Akari Asai, Matt Gardner, and Hannaneh Hajishirzi
    In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL; Oral), 2022
  9. One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval
    Akari Asai, Xinyan Yu, Jungo Kasai, and Hanna Hajishirzi
    In Advances in Neural Information Processing Systems (NeurIPS), 2021
  10. XOR QA: Cross-lingual Open-Retrieval Question Answering
    Akari Asai, Jungo Kasai, Jonathan Clark, Kenton Lee, Eunsol Choi, and 1 more author
    In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL; Oral), 2021
  11. LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
    Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda, and Yuji Matsumoto
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
  12. Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering
    Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, and Caiming Xiong
    In International Conference on Learning Representations (ICLR), 2020