Akari Asai
Research Scientist @ Allen Institute for AI
Incoming Assistant Professor @ Carnegie Mellon University
I am an incoming Assistant Professor at Carnegie Mellon University's Language Technologies Institute (starting Fall 2026), with an affiliate appointment in the Machine Learning Department, and a Research Scientist on the OLMo team at the Allen Institute for AI (2025-2026).
I am hiring 2-3 Ph.D. students at CMU in the 2025-2026 application cycle. Please check out the FAQ for more info.
I completed my Ph.D. in NLP at the Paul G. Allen School of Computer Science & Engineering, University of Washington, where I was fortunate to be advised by Prof. Hannaneh Hajishirzi. I also spent time at Meta AI Research as a visiting student researcher under the supervision of Dr. Wen-tau Yih. My Ph.D. pioneered retrieval-augmented LMs, LMs that integrate large-scale text data via retrieval during inference (thesis (PDF), video (YouTube)).
Prior to joining UW, I obtained a B.E. in Electrical Engineering and Computer Science from The University of Tokyo, Japan.
Current Research Focus: My research focuses on natural language processing and machine learning, with a particular emphasis on large language models (LLMs). I’m interested in building LMs and agents that are more reliable, modular, and open, aimed at real-world impact in science, code, and global information access.
- Developing Augmented LMs: We design, train, and deploy augmented LMs and agents that collaborate with complementary modules (retrieval, tool use, multi-LM coordination, and more), moving beyond the limits of scaling a single monolithic model, and we introduce new training and inference algorithms for these methods. Recent work includes advanced retrieval-augmented LMs such as Self-RAG, and DR Tulu, the first end-to-end open deep research agent for open-ended, long-form tasks, trained with reinforcement learning with evolving rubrics. We also tackle system-level challenges in scalability and efficiency (e.g., MassiveDS, BPR) and extend these capabilities to multimodal settings (Pangea, MM-RAG NeurIPS Competition).
- Understanding and Mitigating Failure Modes of LMs: We systematically investigate where and why LMs fail, including hallucinations, copyright infringement, and unreliable reasoning, and design mechanisms to improve their reliability and safety. Projects such as When Not to Trust LMs, the copyright-utility trade-offs of LMs studied in CopyBench, and analyses of capability-hallucination trade-offs in Binary RAR exemplify our efforts to make LMs more trustworthy and robust.
- Deploying Augmented LMs in High-Impact Domains: We apply our methods to real-world challenges that demand factuality, transparency, and accessibility. Examples include AI for science (OpenScholar, used by tens of thousands of scientists for literature synthesis), AI for code (CodeRAGBench), and AI for linguistic equity (XORQA, AfriQA, CORA), broadening global access to reliable information.
Selected recognitions include MIT Technology Review 35 Innovators Under 35 (2025 Global & 2024 Japan), Forbes 30 Under 30 Asia in Science 2025, EECS Rising Stars 2022, and the IBM Global Ph.D. Fellows 2022-202. Our work has been covered by Forbes, Nature News, and MIT Technology Review, and is used in libraries such as Hugging Face, LlamaIndex, and LangChain. Most recently, the Ai2 OpenScholar public demo has supported 50k researchers across scientific disciplines in synthesizing literature.
Public office hours and application materials: To help lower barriers to starting research, pursuing a Ph.D. in this field, or navigating the job search, I host weekly office hours every Friday, open to all. Feel free to sign up (please sign up via Google Calendar!).
Inspired by many wonderful friends who have shared their own materials to promote equity and access, I’ve also made my past application materials available:
- Academic job application (2024): [Research Statement], [Teaching Statement], [Diversity Statement], [Job Talk Slides], [Job Talk Video (defense recording)]
- EECS Rising Stars (2022): [Research Statement]
- PhD application (2018): [SoP draft] (Note: this is a near-final draft, as I no longer have access to the original version! For examples of CS Statements of Purpose, I recommend checking out cs-sop, which collects many SoPs from previous applicants.)
news
| Nov 19, 2025 | Super excited to share DR Tulu - an open, end-to-end trained deep research agent for long-form, real-world research tasks. We introduce a new RL recipe, Reinforcement Learning with Evolving Rubrics (RLER), to tackle the inherently hard-to-verify nature of deep research. Check out our paper and a static demo. A live demo is coming soon so please stay tuned! |
|---|---|
| Oct 02, 2025 | I spoke with the Delta Institute Podcast about my path to CS/NLP, recent progress in augmented LMs & agents, and remaining challenges. |
| Sep 30, 2025 | I gave an invited lecture on retrieval and retrieval-augmented LMs at CMU Advanced NLP and LLMs! The slides and lecture video are publicly available. |
| Sep 09, 2025 | OpenScholar has been highlighted in Nature News - “Can researchers stop AI making up citations?”. |
| Sep 08, 2025 | Honored to be named one of MIT Technology Review's Innovators Under 35! |
selected publications
See my full publication list on the publications page!