Under review

Seq vs Seq: An Open Suite of Paired Encoders and Decoders · 2025 · arXiv 2507.11412

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

A Causal Language Modeling Detour Improves Encoder Continued Pretraining

cs.CL · 2026-05-12 · conditional · novelty 7.0

A temporary CLM phase followed by MLM decay during encoder continued pretraining outperforms standard MLM on biomedical tasks by 0.3-2.8pp across languages and model sizes.

Locating acts of mechanistic reasoning in student team conversations with mechanistic machine learning

physics.ed-ph · 2026-04-23 · unverdicted · novelty 7.0

A probabilistic model with domain-aligned inductive bias detects acts of mechanistic reasoning in student conversations and shows improved generalization to unseen students and novel contexts.

MADE: A Living Benchmark for Multi-Label Text Classification with Uncertainty Quantification of Medical Device Adverse Events

cs.CL · 2026-04-16 · unverdicted · novelty 7.0

MADE creates a contamination-resistant living benchmark for multi-label classification of medical device adverse events, with evaluations revealing model-specific trade-offs in accuracy and uncertainty quantification.

Scaling Laws for Cross-Encoder Reranking

cs.IR · 2026-03-05 · unverdicted · novelty 7.0

Cross-encoder reranker performance scales predictably via power laws with model size and training exposure, allowing accurate forecasts for 400M and 1B models and data-heavy compute allocation.

Language Models Learn Constructional Semantics, Not To Mention Syntax: Investigating LM Understanding of Paired-Focus Constructions

cs.CL · 2026-05-29 · unverdicted · novelty 6.0

Modestly sized language models acquire sensitivity to the meanings of rare Paired-Focus constructions later than their syntactic forms, with semantic learning correlating to gains in selected world-knowledge domains.

MATCH: Modulating Attention via In-Context Retrieval for Long-Context Transformers

cs.CL · 2026-06-29 · unverdicted · novelty 4.0

MATCH augments sparsified attention with an efficient in-context retrieval system to boost performance on long-range recall tasks in transformers.

citing papers explorer

A Causal Language Modeling Detour Improves Encoder Continued Pretraining · cs.CL · 2026-05-12 · conditional · none · ref 14
A temporary CLM phase followed by MLM decay during encoder continued pretraining outperforms standard MLM on biomedical tasks by 0.3-2.8pp across languages and model sizes.
MADE: A Living Benchmark for Multi-Label Text Classification with Uncertainty Quantification of Medical Device Adverse Events · cs.CL · 2026-04-16 · unverdicted · none · ref 7
MADE creates a contamination-resistant living benchmark for multi-label classification of medical device adverse events, with evaluations revealing model-specific trade-offs in accuracy and uncertainty quantification.
Language Models Learn Constructional Semantics, Not To Mention Syntax: Investigating LM Understanding of Paired-Focus Constructions · cs.CL · 2026-05-29 · unverdicted · none · ref 4
Modestly sized language models acquire sensitivity to the meanings of rare Paired-Focus constructions later than their syntactic forms, with semantic learning correlating to gains in selected world-knowledge domains.
MATCH: Modulating Attention via In-Context Retrieval for Long-Context Transformers · cs.CL · 2026-06-29 · unverdicted · none · ref 88
MATCH augments sparsified attention with an efficient in-context retrieval system to boost performance on long-range recall tasks in transformers.
