arXiv preprint arXiv:1704.00051

Reading Wikipedia to Answer Open-Domain Questions · 2017 · arXiv 1704.00051

5 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Online Learning-to-Defer with Varying Experts

stat.ML · 2026-05-12 · unverdicted · novelty 8.0

Presents the first online learning-to-defer algorithm with regret bounds O((n + n_e) T^{2/3}) generally and O((n + n_e) sqrt(T)) under low noise for multiclass classification with varying experts.

Passage Re-ranking with BERT

cs.IR · 2019-01-13 · unverdicted · novelty 8.0

Fine-tuning BERT for query-passage relevance classification achieves state-of-the-art results on TREC-CAR and MS MARCO, with a 27% relative gain in MRR@10 over prior methods.

Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts

cs.CL · 2026-04-09 · conditional · novelty 6.0

Loss-based pruning of training data, which limits the number of facts and flattens their frequency distribution, enables a 110M-parameter GPT-2 model to memorize 1.3 times more entity facts than standard training, matching a 1.3B-parameter model trained on the full dataset.

TIDE: Every Layer Knows the Token Beneath the Context

cs.CL · 2026-05-07 · unverdicted · novelty 5.0

TIDE augments standard transformers with per-layer token embedding injection via an ensemble of memory blocks and a depth-conditioned router to mitigate rare-token undertraining and contextual collapse.

Plasma GraphRAG: Physics-Grounded Parameter Selection for Gyrokinetic Simulations

physics.plasm-ph · 2026-04-07 · unverdicted · novelty 5.0

Plasma GraphRAG automates physics-grounded parameter selection for gyrokinetic simulations via a domain-specific knowledge graph and LLMs, reporting over 10% better quality and up to 25% fewer hallucinations than standard RAG.

citing papers explorer

Online Learning-to-Defer with Varying Experts · stat.ML · 2026-05-12 · unverdicted · none · ref 79
Passage Re-ranking with BERT · cs.IR · 2019-01-13 · unverdicted · none · ref 1
Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts · cs.CL · 2026-04-09 · conditional · none · ref 16
TIDE: Every Layer Knows the Token Beneath the Context · cs.CL · 2026-05-07 · unverdicted · none · ref 101
Plasma GraphRAG: Physics-Grounded Parameter Selection for Gyrokinetic Simulations · physics.plasm-ph · 2026-04-07 · unverdicted · none · ref 16
