Late chunking: Contextual chunk embeddings using long-context embedding models

Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models · 2024 · arXiv 2409.04701

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Lost in a Single Vector: Improving Long-Document Retrieval with Chunk Evidence Aggregation

cs.CL · 2026-06-17 · unverdicted · novelty 7.0

DICE aggregates independently encoded document chunks into a single vector to reduce evidence dilution in long-document dense retrieval, reporting gains on LongEmbed especially beyond 4k tokens.

IdioLink: Retrieving Meaning Beyond Words Across Idiomatic and Literal Expressions

cs.CL · 2026-05-21 · unverdicted · novelty 7.0

IdioLink introduces a benchmark dataset and evaluation showing that strong embedding models struggle to retrieve equivalent meanings across idiomatic and literal forms, relying on shallow cues instead.

Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval

cs.CV · 2026-04-11 · unverdicted · novelty 7.0

ColChunk adaptively chunks visual document patches into contextual multi-vectors via clustering, cutting storage by over 90% while raising average nDCG@5 by 9 points.

SPIRE: Structure-Preserving Interpretable Retrieval of Evidence

cs.IR · 2026-02-12 · unverdicted · novelty 7.0

SPIRE presents a tree-structured retrieval method using subdocuments, paths, and dual contextualization that produces higher-quality and more diverse citations than passage-based baselines on HTML QA benchmarks.

Should We Still Pretrain Encoders with Masked Language Modeling?

cs.CL · 2025-07-01 · accept · novelty 6.0

Controlled ablations of 38 models find MLM superior to CLM on representation benchmarks while CLM offers better data efficiency and stability; a biphasic CLM-then-MLM schedule is optimal under fixed compute and improves when initialized from pretrained CLM models.

Improving Long-Context Retrieval with Multi-Prefix Embedding

cs.IR · 2026-06-22 · unverdicted · novelty 5.0

Multi-Prefix Embedding extracts per-chunk embeddings from a single forward pass over EOS-separated document chunks and matches via MaxSim while training only on document-level labels.

EASE-TTT: Evidence-Aligned Selective Test-Time Training for Long-Context Question Answering

cs.CL · 2026-06-05 · unverdicted · novelty 5.0

EASE-TTT creates a soft attention target from evidence chunks to guide query-side test-time adaptation, yielding higher macro-average scores than full-context, retrieval-only, and standard qTTT baselines on six LongBench QA tasks.

Chunking Methods on Retrieval-Augmented Generation - Effectiveness Evaluation Against Computational Cost and Limitations

cs.CL · 2026-05-30 · unverdicted · novelty 5.0

Empirical study claiming to be the first broad comparison of chunking methods in RAG, highlighting effectiveness, cost, and generalization limitations across scenarios.

Efficient RAG with Intent-Aware Retrieval and Semantics-Preserving Chunking

cs.CL · 2026-05-31 · unverdicted · novelty 4.0

InSemRAG combines dynamic intent-aware hybrid retrieval and semantics-preserving chunk repair in an iterative loop, yielding a 2.65 F1 gain on HotPotQA and a 1.5 accuracy gain on FEVER with 4.32x lower latency than Multi-Hop RAG via SLMs.

Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering

cs.CL · 2026-04-27 · unverdicted · novelty 4.0

Entity-based chunk filtering reduces RAG vector index size by 25-36% with retrieval quality near baseline levels.

Evaluating Chunking Strategies for Retrieval-Augmented Generation on Academic Texts

cs.IR · 2026-07-02 · unverdicted · novelty 3.0 · 2 refs

Cluster-based semantic chunking does not outperform fixed-size or recursive chunking for RAG on academic theses, and RAGAs faithfulness shows limited reliability in this setup.

Qwen Goes Brrr: Off-the-Shelf RAG for Ukrainian Multi-Domain Document Understanding

cs.CL · 2026-05-11 · unverdicted · novelty 3.0

A RAG pipeline with contextual PDF chunking, question-and-answer-aware retrieval and reranking using Qwen3 models reaches 0.96 accuracy on a Ukrainian multi-domain document QA shared task.
