{"work":{"id":"a397ddf8-b0b7-4e32-9d59-fb6ea67ac287","openalex_id":null,"doi":null,"arxiv_id":"2002.08909","raw_key":null,"title":"REALM: Retrieval-Augmented Language Model Pre-Training","authors":null,"authors_text":"Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang","year":2020,"venue":"cs.CL","abstract":"Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts.\n  To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents.\n  We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA). We compare against state-of-the-art models for both explicit and implicit knowledge storage on three popular Open-QA benchmarks, and find that we outperform all previous methods by a significant margin (4-16% absolute accuracy), while also providing qualitative benefits such as interpretability and modularity.","external_url":"https://arxiv.org/abs/2002.08909","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-05-25T08:26:48.718591+00:00","pith_arxiv_id":"2002.08909","created_at":"2026-05-10T06:16:20.729067+00:00","updated_at":"2026-06-05T21:23:00.469572+00:00","title_quality_ok":true,"display_title":"REALM: Retrieval-Augmented Language Model Pre-Training","render_title":"REALM: Retrieval-Augmented Language Model Pre-Training"},"hub":{"state":{"work_id":"a397ddf8-b0b7-4e32-9d59-fb6ea67ac287","tier":"hub","tier_reason":"10+ Pith inbound or 1,000+ external citations","pith_inbound_count":29,"external_cited_by_count":null,"distinct_field_count":7,"first_pith_cited_at":"2020-02-10T18:55:58+00:00","last_pith_cited_at":"2026-05-21T09:06:13+00:00","author_build_status":"not_needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"not_needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-06-08T04:42:50.273751+00:00","tier_text":"hub"},"tier":"hub","role_counts":[{"context_role":"background","n":4},{"context_role":"method","n":1}],"polarity_counts":[{"context_polarity":"background","n":4},{"context_polarity":"use_method","n":1}],"runs":{},"summary":{},"graph":{},"authors":[]}}