Dense Passage Retrieval for Open-Domain Question Answering
Pith reviewed 2026-05-15 21:21 UTC · model grok-4.3
The pith
Dense vector embeddings from a dual-encoder model outperform BM25 by 9-19 absolute points in top-20 passage retrieval accuracy for open-domain question answering.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Open-domain question answering relies on efficient passage retrieval, traditionally done with sparse models such as TF-IDF or BM25. We demonstrate that retrieval can instead be implemented using dense representations alone. These embeddings are learned from a small number of questions and passages using a simple dual-encoder framework. When tested on multiple open-domain QA datasets, the dense retriever outperforms a strong Lucene-BM25 system by 9 to 19 percent absolute in top-20 passage retrieval accuracy. This retrieval improvement allows our end-to-end QA system to reach new state-of-the-art performance on the benchmarks.
What carries the argument
A dual-encoder model that independently embeds questions and passages into a shared dense vector space for similarity-based retrieval.
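The retrieval mechanics can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the vectors below are fixed stand-ins for the outputs of the two trained BERT-style encoders, and the passage topics are invented for the example.

```python
import numpy as np

# Toy stand-ins for the outputs of the two encoders E_Q and E_P.
# In DPR these come from separate trained networks; here they are fixed.
passage_vecs = np.array([
    [0.9, 0.1, 0.0],   # passage 0: hypothetically about capitals
    [0.0, 0.8, 0.2],   # passage 1: hypothetically about rivers
    [0.1, 0.1, 0.9],   # passage 2: hypothetically about mountains
])

def retrieve(question_vec, passage_vecs, k=2):
    """Score every passage by inner product with the question embedding
    and return the indices of the k highest-scoring passages."""
    scores = passage_vecs @ question_vec
    return np.argsort(-scores)[:k]

q = np.array([0.85, 0.15, 0.05])   # hypothetical question embedding
print(retrieve(q, passage_vecs))   # → [0 2]
```

Because questions and passages are encoded independently, all passage vectors can be embedded and indexed offline; only the question is encoded at query time.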
If this is right
- Higher top-20 retrieval accuracy leads to more relevant contexts being available for the reader module in QA systems.
- The method can be integrated into existing QA pipelines to boost overall accuracy.
- It establishes new performance records on multiple standard open-domain QA benchmarks.
- Dense retrieval becomes a viable practical alternative to sparse indexing methods.
Where Pith is reading between the lines
- Neural dense retrieval may reduce dependence on exact term overlap, capturing semantic matches instead.
- The approach could be extended by combining dense and sparse signals for hybrid retrieval.
- Generalization from small training sets implies that the model learns robust semantic features applicable to unseen queries.
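The hybrid dense-plus-sparse idea mentioned above is often realized as a weighted sum of the two score lists. A minimal sketch, with min-max normalization and a hypothetical mixing weight `lam` (the score values are invented for illustration):

```python
import numpy as np

def hybrid_scores(dense, sparse, lam=0.5):
    """Combine dense (inner-product) and sparse (BM25) scores for the
    same candidate passages. Each list is min-max normalized first so
    the mixing weight lam is comparable across the two score scales."""
    def norm(x):
        x = np.asarray(x, dtype=float)
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)
    return norm(dense) + lam * norm(sparse)

dense = [0.78, 0.13, 0.145]   # dual-encoder inner products (toy values)
bm25  = [1.2, 7.5, 0.3]       # lexical BM25 scores (toy values)
ranked = np.argsort(-hybrid_scores(dense, bm25))
```

The weight `lam` would be tuned on a development set; the paper itself reports results for a simple combination of this kind.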
Load-bearing premise
Embeddings trained on a limited set of questions and passages will generalize well to the broader range of queries and documents seen during testing.
What would settle it
Observing no improvement or a decrease in top-20 passage retrieval accuracy for the dense model compared to BM25 on a standard open-domain QA test set would falsify the performance claim.
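The metric this test turns on is top-k retrieval accuracy: the fraction of questions for which at least one of the top k retrieved passages contains the answer. A minimal sketch (passage ids and gold sets below are invented):

```python
def top_k_accuracy(retrieved, gold, k=20):
    """retrieved: one ranked list of passage ids per question.
    gold: one set of answer-bearing passage ids per question.
    Returns the fraction of questions with a hit in the top k."""
    hits = sum(
        1 for ranked, answers in zip(retrieved, gold)
        if any(pid in answers for pid in ranked[:k])
    )
    return hits / len(retrieved)

# Two questions: the first has a gold passage at rank 2, the second
# has no gold passage among the retrieved candidates.
acc = top_k_accuracy([[5, 9, 2], [4, 8, 1]], [{9}, {7}], k=3)  # 0.5
```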
Original abstract
Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets, our dense retriever outperforms a strong Lucene-BM25 system largely by 9%-19% absolute in terms of top-20 passage retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA benchmarks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a dense passage retrieval method for open-domain QA based on a dual-encoder framework that learns embeddings from a modest number of question-passage pairs. It reports that the resulting retriever outperforms a strong Lucene-BM25 baseline by 9-19% absolute in top-20 accuracy across several QA datasets and, when plugged into an end-to-end reader, yields new state-of-the-art results on multiple open-domain QA benchmarks.
Significance. If the empirical gains hold, the work provides a practical demonstration that supervised dense retrieval can substantially surpass classical sparse methods without requiring hand-crafted features or inverted indexes, thereby shifting the default retrieval component in open-domain QA pipelines toward learned embeddings.
major comments (1)
- [Section 3 (Training) and experimental setup] The training procedure (negative sampling strategy and construction of the training set) is not ablated; without these controls it remains possible that the reported 9-19% gains partly reflect dataset-specific selection effects rather than the dual-encoder architecture itself.
minor comments (2)
- [Abstract] The abstract states gains of '9%-19%' but does not report per-dataset numbers, standard deviations, or confidence intervals; a table with these statistics would make the strength of the improvement clearer.
- [Section 2] Notation for the dual-encoder scoring function and the contrastive loss should be introduced once in a single equation block rather than scattered across prose.
Simulated Author's Rebuttal
We thank the referee for the positive summary and recommendation for minor revision. We address the major comment below.
Point-by-point responses
Referee: [Section 3 (Training) and experimental setup] The training procedure (negative sampling strategy and construction of the training set) is not ablated; without these controls it remains possible that the reported 9-19% gains partly reflect dataset-specific selection effects rather than the dual-encoder architecture itself.
Authors: We agree that an explicit ablation of negative sampling and training-set construction would strengthen the claims. Our main experiments compare the trained dual-encoder against a strong unsupervised BM25 baseline on the same corpora, which already isolates the benefit of learned dense representations. Nevertheless, to directly address the concern, we will add a new ablation subsection in the revised manuscript that reports retrieval accuracy when training with (i) random negatives, (ii) BM25-retrieved hard negatives, and (iii) varying numbers of negatives per question. These additional controls will clarify how much of the observed 9-19% improvement is attributable to the dual-encoder architecture versus the particular negative-sampling procedure.
Revision: yes
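The training objective at issue in this exchange is a negative log-likelihood over inner-product similarities: the positive passage competes against sampled negatives in a softmax. A minimal sketch with plain lists standing in for encoder outputs (the vectors are toy values):

```python
import math

def nll_loss(q_vec, pos_vec, neg_vecs):
    """Negative log-likelihood of the positive passage under a softmax
    over inner-product similarities, as in DPR-style training. Vectors
    here are plain lists standing in for encoder outputs."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    pos = math.exp(dot(q_vec, pos_vec))
    negs = sum(math.exp(dot(q_vec, n)) for n in neg_vecs)
    return -math.log(pos / (pos + negs))

# A "hard" negative (closer to the question) produces a larger loss
# than an "easy" one, which is why the choice of negatives matters.
easy = nll_loss([1.0, 0.0], [0.9, 0.1], [[-0.9, 0.1]])
hard = nll_loss([1.0, 0.0], [0.9, 0.1], [[0.8, 0.2]])
```

This is what makes the referee's point substantive: with the same architecture, swapping random negatives for BM25-retrieved hard negatives changes the gradient signal and can shift retrieval accuracy noticeably.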
Circularity Check
No significant circularity identified
Full rationale
The paper trains a dual-encoder model via standard contrastive loss on QA pairs to produce dense passage embeddings, then measures top-k retrieval accuracy on held-out test portions of standard benchmarks (Natural Questions, TriviaQA, etc.). No equation or claim reduces the reported 9-19% gains to a fitted parameter by construction, nor does any load-bearing step rely on a self-citation chain that is itself unverified. The evaluation is ordinary supervised held-out testing; the derivation chain (indexing, retrieval, end-to-end QA) remains externally falsifiable and does not collapse into its inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Embeddings learned via contrastive loss on QA pairs will place relevant passages near their questions in vector space.
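For reference, this assumption can be stated with the paper's own similarity function and training loss, where $E_Q$ and $E_P$ are the question and passage encoders:

```latex
% Similarity: inner product of independently encoded question and passage
\mathrm{sim}(q, p) = E_Q(q)^{\top} E_P(p)

% Loss for question q_i with positive p_i^+ and negatives p_{i,1}^-, ..., p_{i,n}^-
L\bigl(q_i, p_i^+, p_{i,1}^-, \dots, p_{i,n}^-\bigr)
  = -\log \frac{e^{\mathrm{sim}(q_i,\, p_i^+)}}
               {e^{\mathrm{sim}(q_i,\, p_i^+)} + \sum_{j=1}^{n} e^{\mathrm{sim}(q_i,\, p_{i,j}^-)}}
```

The domain assumption is exactly that minimizing this loss on training pairs yields a geometry in which unseen relevant passages also score highly.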
Forward citations
Cited by 19 Pith papers
- BOOKMARKS: Efficient Active Storyline Memory for Role-playing
  BOOKMARKS introduces searchable bookmarks as reusable answers to storyline questions, enabling active initialization and passive synchronization for more consistent role-playing agent memory than recurrent summarization.
- Do We Still Need GraphRAG? Benchmarking RAG and GraphRAG for Agentic Search Systems
  Agentic search narrows the gap between dense RAG and GraphRAG but does not remove GraphRAG's advantage on complex multi-hop reasoning.
- Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
  A panel of smaller diverse LLMs outperforms a single large model as an evaluator of generations, showing less intra-model bias and over 7x lower cost.
- C-Pack: Packed Resources For General Chinese Embeddings
  C-Pack releases a new Chinese embedding benchmark, large training dataset, and optimized models that outperform priors by up to 10% on C-MTEB while also delivering English SOTA results.
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  RAG models set new state-of-the-art results on open-domain QA by retrieving Wikipedia passages and conditioning a generative model on them, while also producing more factual text than parametric baselines.
- Task-Adaptive Embedding Refinement via Test-time LLM Guidance
  Test-time LLM feedback refines query embeddings to deliver up to 25% relative gains on zero-shot literature search, intent detection, and related benchmarks.
- From Unstructured Recall to Schema-Grounded Memory: Reliable AI Memory via Iterative, Schema-Aware Extraction
  Schema-aware iterative extraction turns AI memory into a verified system of record, reaching 90-97% accuracy on extraction and end-to-end memory benchmarks where retrieval baselines score 80-87%.
- EHRAG: Bridging Semantic Gaps in Lightweight GraphRAG via Hybrid Hypergraph Construction and Retrieval
  EHRAG constructs structural hyperedges from sentence co-occurrence and semantic hyperedges from entity embedding clusters, then applies hybrid diffusion plus topic-aware PPR to retrieve top-k documents, outperforming ...
- Knowledge Is Not Static: Order-Aware Hypergraph RAG for Language Models
  OKH-RAG represents knowledge as ordered hyperedges and retrieves coherent interaction sequences via a learned transition model, outperforming permutation-invariant RAG baselines on order-sensitive QA tasks.
- NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
  NV-Embed achieves first place on the MTEB leaderboard across 56 tasks by combining a latent attention layer, causal-mask removal, two-stage contrastive training, and data curation for LLM-based embedding models.
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Self-RAG trains LLMs to adaptively retrieve passages on demand and self-critique using reflection tokens, outperforming ChatGPT and retrieval-augmented Llama2 on QA, reasoning, and fact verification.
- MemGPT: Towards LLMs as Operating Systems
  MemGPT uses OS-inspired virtual context management to extend LLM context windows for large document analysis and long-term multi-session chat.
- LaMDA: Language Models for Dialog Applications
  LaMDA shows that fine-tuning on human-value annotations and consulting external knowledge sources significantly improves safety and factual grounding in large dialog models beyond what scaling alone achieves.
- Unsupervised Dense Information Retrieval with Contrastive Learning
  Contrastive learning trains unsupervised dense retrievers that beat BM25 on most BEIR datasets and support cross-lingual retrieval across scripts.
- How Much Knowledge Can You Pack Into the Parameters of a Language Model?
  Fine-tuned language models store knowledge in parameters to answer questions competitively with retrieval-based open-domain QA systems.
- Not All RAGs Are Created Equal: A Component-Wise Empirical Study for Software Engineering Tasks
  Retriever-side choices, particularly the retrieval algorithm, exert more influence on RAG performance than generator selection across code generation, summarization, and repair tasks.
- Learning from AVA: Early Lessons from a Curated and Trustworthy Generative AI for Policy and Development Research
  AVA is a specialized GenAI platform for development policy research that provides verifiable syntheses from World Bank reports and is associated with 2.4-3.9 hours of weekly time savings in a large-scale user evaluation.
- Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering
  Entity-based chunk filtering reduces RAG vector index size by 25-36% with retrieval quality near baseline levels.
- Unified Supervision for Walmart's Sponsored Search Retrieval via Joint Semantic Relevance and Behavioral Engagement Modeling
  A hybrid supervision method for bi-encoder retrievers combines graded relevance from teacher models, production retrieval priors, and selective engagement to improve relevance and NDCG over Walmart's current sponsored...