Late chunking: Contextual chunk embeddings using long-context embedding models

2025 · arXiv 2409.04701

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

IdioLink: Retrieving Meaning Beyond Words Across Idiomatic and Literal Expressions

cs.CL · 2026-05-21 · unverdicted · novelty 7.0

IdioLink introduces a benchmark dataset and evaluation showing that strong embedding models struggle to retrieve equivalent meanings across idiomatic and literal forms, relying on shallow cues instead.

Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval

cs.CV · 2026-04-11 · unverdicted · novelty 7.0

ColChunk adaptively chunks visual document patches into contextual multi-vectors via clustering, cutting storage by over 90% while raising average nDCG@5 by 9 points.

SPIRE: Structure-Preserving Interpretable Retrieval of Evidence

cs.IR · 2026-02-12 · unverdicted · novelty 7.0

SPIRE presents a tree-structured retrieval method using subdocuments, paths, and dual contextualization that produces higher-quality and more diverse citations than passage-based baselines on HTML QA benchmarks.

Should We Still Pretrain Encoders with Masked Language Modeling?

cs.CL · 2025-07-01 · accept · novelty 6.0

Controlled ablations of 38 models find MLM superior to CLM on representation benchmarks while CLM offers better data efficiency and stability; a biphasic CLM-then-MLM schedule is optimal under fixed compute and improves when initialized from pretrained CLM models.

EASE-TTT: Evidence-Aligned Selective Test-Time Training for Long-Context Question Answering

cs.CL · 2026-06-05 · unverdicted · novelty 5.0

EASE-TTT creates a soft attention target from evidence chunks to guide query-side test-time adaptation, yielding higher macro-average scores than full-context, retrieval-only, and standard qTTT baselines on six LongBench QA tasks.

Chunking Methods on Retrieval-Augmented Generation - Effectiveness Evaluation Against Computational Cost and Limitations

cs.CL · 2026-05-30 · unverdicted · novelty 5.0

This empirical study claims to be the first broad comparison of chunking methods in RAG and highlights effectiveness, cost, and generalization limitations across scenarios.

Efficient RAG with Intent-Aware Retrieval and Semantics-Preserving Chunking

cs.CL · 2026-05-31 · unverdicted · novelty 4.0

InSemRAG combines dynamic intent-aware hybrid retrieval and semantics-preserving chunk repair in an iterative loop, yielding a 2.65 F1 gain on HotPotQA and a 1.5 accuracy gain on FEVER with 4.32x lower latency than Multi-Hop RAG via SLMs.

Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering

cs.CL · 2026-04-27 · unverdicted · novelty 4.0

Entity-based chunk filtering reduces RAG vector index size by 25-36% with retrieval quality near baseline levels.

Evaluating Chunking Strategies for Retrieval-Augmented Generation on Academic Texts

cs.IR · 2026-07-02 · unverdicted · novelty 3.0 · 2 refs

Cluster-based semantic chunking does not outperform fixed-size or recursive chunking for RAG on academic theses, and RAGAs faithfulness shows limited reliability in this setup.

Qwen Goes Brrr: Off-the-Shelf RAG for Ukrainian Multi-Domain Document Understanding

cs.CL · 2026-05-11 · unverdicted · novelty 3.0

A RAG pipeline with contextual PDF chunking, question-and-answer-aware retrieval and reranking using Qwen3 models reaches 0.96 accuracy on a Ukrainian multi-domain document QA shared task.

citing papers explorer

Showing 10 of 10 citing papers.

IdioLink: Retrieving Meaning Beyond Words Across Idiomatic and Literal Expressions

cs.CL · 2026-05-21 · unverdicted · none · ref 15

Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval

cs.CV · 2026-04-11 · unverdicted · none · ref 8

SPIRE: Structure-Preserving Interpretable Retrieval of Evidence

cs.IR · 2026-02-12 · unverdicted · none · ref 10

Should We Still Pretrain Encoders with Masked Language Modeling?

cs.CL · 2025-07-01 · accept · none · ref 14

EASE-TTT: Evidence-Aligned Selective Test-Time Training for Long-Context Question Answering

cs.CL · 2026-06-05 · unverdicted · none · ref 42

Chunking Methods on Retrieval-Augmented Generation - Effectiveness Evaluation Against Computational Cost and Limitations

cs.CL · 2026-05-30 · unverdicted · none · ref 8

Efficient RAG with Intent-Aware Retrieval and Semantics-Preserving Chunking

cs.CL · 2026-05-31 · unverdicted · none · ref 21

Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering

cs.CL · 2026-04-27 · unverdicted · none · ref 17

Evaluating Chunking Strategies for Retrieval-Augmented Generation on Academic Texts

cs.IR · 2026-07-02 · unverdicted · none · ref 27 · 2 links

Qwen Goes Brrr: Off-the-Shelf RAG for Ukrainian Multi-Domain Document Understanding

cs.CL · 2026-05-11 · unverdicted · none · ref 42
