Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

Nils Reimers, Iryna Gurevych · 2019 · Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) · DOI 10.18653/v1/d19-1410

63 Pith papers cite this work, alongside 7,637 external citations (Crossref). Polarity classification is still indexing.

representative citing papers

Much of Geospatial Web Search Is Beyond Traditional GIS

cs.IR · 2026-05-11 · unverdicted · novelty 7.0

Analysis of 1.01 million unfiltered Bing queries identifies 18% as geospatial, dominated by transactional categories like costs (15.3%) that exceed traditional GIS scope.

Matching Meaning at Scale: Evaluating Semantic Search for 18th-Century Intellectual History through the Case of Locke

cs.CL · 2026-05-10 · unverdicted · novelty 7.0 · 2 refs

Semantic search retrieves substantially more implicit receptions of Locke's work than lexical baselines in 18th-century corpora, yet remains constrained by lexical gatekeeping.

Test-Time Personalization: A Diagnostic Framework and Probabilistic Fix for Scaling Failures

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

Test-time scaling for personalized LLMs follows a logarithmic utility curve under oracle selection but standard reward models suffer user-level collapse and query-level hacking; a probabilistic reward model with learned variance enables consistent scaling.

Beyond Bag-of-Patches: Learning Global Layout via Textual Supervision for Late-Interaction Visual Document Retrieval

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

A text-supervised global layout embedding augments local patch representations in late-interaction VDR, yielding +2.4 nDCG@5 and +2.3 MAP@5 gains over ColPali/ColQwen baselines on ViDoRe-v2.

What Software Engineering Looks Like to AI Agents? -- An Empirical Study of AI-Only Technical Discourse on MoltBook

cs.SE · 2026-05-08 · unverdicted · novelty 7.0

AI-only technical discourse on MoltBook is coherent and organized around 12 themes led by security and trust, but it lacks the concrete code, runtime failures, and reproduction steps common in human GitHub discussions.

CircuitFormer: A Circuit Language Model for Analog Topology Design from Natural Language Prompt

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

CircuitFormer is a 511M-parameter encoder-decoder model that generates analog circuit topologies from text prompts at 100% syntactic correctness and 83% functional success using a new subcircuit-mining tokenizer that keeps vocabulary size fixed at 512.

SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents

cs.CR · 2026-05-05 · unverdicted · novelty 7.0

SkCC compiles LLM skills via SkIR to achieve portability across agent frameworks, reduce adaptation effort from O(m×n) to O(m+n), and enforce security with reported gains in task success rates and token efficiency.

Led to Mislead: Adversarial Content Injection for Attacks on Neural Ranking Models

cs.IR · 2026-05-02 · unverdicted · novelty 7.0

CRAFT is a supervised LLM framework using retrieval-augmented generation, self-refinement, fine-tuning, and preference optimization to create fluent adversarial content that boosts target ranks in neural ranking models, outperforming baselines on MS MARCO and TREC benchmarks with cross-architecture

A Multi-View Media Profiling Suite: Resources, Evaluation, and Analysis

cs.CL · 2026-05-02 · unverdicted · novelty 7.0

Presents MBFC-2025 dataset and multi-view embeddings with fusion methods for media bias and factuality, reporting SOTA results on ACL-2020 and new benchmarks on MBFC-2025.

InvEvolve: Evolving White-Box Inventory Policies via Large Language Models with Performance Guarantees

cs.LG · 2026-05-01 · unverdicted · novelty 7.0 · 2 refs

InvEvolve evolves white-box inventory policies from LLMs with statistical safety guarantees and outperforms classical and deep learning methods on synthetic and real retail data.

From Chatbots to Confidants: A Cross-Cultural Study of LLM Adoption for Emotional Support

cs.CL · 2026-04-28 · unverdicted · novelty 7.0

A cross-cultural survey finds LLM emotional support adoption ranges from 20% to 59% by country, with positive perceptions strongest among higher-SES, religious, married adults aged 25-44 and in English-speaking nations.

Reverse Constitutional AI: A Framework for Controllable Toxic Data Generation via Probability-Clamped RLAIF

cs.CL · 2026-04-20 · unverdicted · novelty 7.0

R-CAI inverts constitutional AI to automatically generate diverse toxic data for LLM red teaming, with probability clamping improving output coherence by 15% while preserving adversarial strength.

Code-Switching Information Retrieval: Benchmarks, Analysis, and the Limits of Current Retrievers

cs.IR · 2026-04-19 · unverdicted · novelty 7.0

Code-switching creates a fundamental performance bottleneck for multilingual retrievers, causing drops of up to 27% on new benchmarks CSR-L and CS-MTEB, with embedding divergence as the key cause and vocabulary expansion insufficient to fix it.

Bounded Autonomy: Controlling LLM Characters in Live Multiplayer Games

cs.HC · 2026-04-06 · unverdicted · novelty 7.0

Bounded autonomy is a new control architecture that makes LLM characters workable in live multiplayer games by combining interaction stability techniques, action grounding, and lightweight player steering, validated through deployment and analysis.

Retrieval Augmented Conversational Recommendation with Reinforcement Learning

cs.IR · 2026-04-06 · unverdicted · novelty 7.0

RAR retrieves candidate items from a 300k-movie corpus then uses LLM generation with RL feedback to produce context-aware recommendations that outperform baselines on benchmarks.

M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation

cs.CL · 2024-02-05 · unverdicted · novelty 7.0

M3-Embedding is a single model for multi-lingual, multi-functional, and multi-granular text embeddings trained via self-knowledge distillation that achieves new state-of-the-art results on multilingual, cross-lingual, and long-document retrieval benchmarks.

LIFT: Last-Mile Fine-Tuning for Table Explicitation

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

LIFT pairs a pre-trained LLM for initial table extraction with a fine-tuned SLM for error repair, matching end-to-end SLM fine-tuning on TEDS while needing only 1,000 examples and gaining robustness.

Task-Adaptive Embedding Refinement via Test-time LLM Guidance

cs.CL · 2026-05-12 · unverdicted · novelty 6.0

Test-time LLM feedback refines query embeddings to deliver up to 25% relative gains on zero-shot literature search, intent detection, and related benchmarks.

Safety Context Injection: Inference-Time Safety Alignment via Static Filtering and Agentic Analysis

cs.CR · 2026-05-12 · unverdicted · novelty 6.0

Safety Context Injection prepends structured external risk reports via static or agentic analysis to lower attack success rates and toxicity in reasoning models on AdvBench and GPTFuzz benchmarks.

Adversarial SQL Injection Generation with LLM-Based Architectures

cs.CR · 2026-05-11 · unverdicted · novelty 6.0

RADAGAS-GPT4o achieves a 22.73% bypass rate against 10 WAFs, succeeding more against AI/ML-based firewalls than rule-based ones.

SkillRAE: Agent Skill-Based Context Compilation for Retrieval-Augmented Execution

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

SkillRAE organizes skills into a graph and compiles compact, grounded contexts for LLM agents, yielding 11.7% gains on SkillsBench over prior RAE methods.

AgentCollabBench: Diagnosing When Good Agents Make Bad Collaborators

cs.CL · 2026-05-09 · unverdicted · novelty 6.0

AgentCollabBench shows that multi-agent reliability is limited by communication topology, with converging-DAG nodes causing synthesis bottlenecks that discard constraints and explain 7-40% of information loss variance.

CAR: Query-Guided Confidence-Aware Reranking for Retrieval-Augmented Generation

cs.CL · 2026-05-06 · unverdicted · novelty 6.0

CAR reranks documents in RAG by promoting those that increase generator confidence (via answer consistency sampling) and demoting those that decrease it, yielding NDCG@5 gains on BEIR datasets that correlate with F1 improvements.

The Infinite Mutation Engine? Measuring Polymorphism in LLM-Generated Offensive Code

cs.CR · 2026-05-05 · unverdicted · novelty 6.0 · 2 refs

A single commercial LLM can cheaply generate large populations of behaviorally equivalent yet structurally diverse malware payloads.

citing papers explorer

Showing 50 of 63 citing papers.

Much of Geospatial Web Search Is Beyond Traditional GIS cs.IR · 2026-05-11 · unverdicted · none · ref 22
Analysis of 1.01 million unfiltered Bing queries identifies 18% as geospatial, dominated by transactional categories like costs (15.3%) that exceed traditional GIS scope.
Matching Meaning at Scale: Evaluating Semantic Search for 18th-Century Intellectual History through the Case of Locke cs.CL · 2026-05-10 · unverdicted · none · ref 35 · 2 links
Semantic search retrieves substantially more implicit receptions of Locke's work than lexical baselines in 18th-century corpora, yet remains constrained by lexical gatekeeping.
Test-Time Personalization: A Diagnostic Framework and Probabilistic Fix for Scaling Failures cs.LG · 2026-05-09 · unverdicted · none · ref 24
Test-time scaling for personalized LLMs follows a logarithmic utility curve under oracle selection but standard reward models suffer user-level collapse and query-level hacking; a probabilistic reward model with learned variance enables consistent scaling.
Beyond Bag-of-Patches: Learning Global Layout via Textual Supervision for Late-Interaction Visual Document Retrieval cs.CV · 2026-05-08 · unverdicted · none · ref 51
A text-supervised global layout embedding augments local patch representations in late-interaction VDR, yielding +2.4 nDCG@5 and +2.3 MAP@5 gains over ColPali/ColQwen baselines on ViDoRe-v2.
What Software Engineering Looks Like to AI Agents? -- An Empirical Study of AI-Only Technical Discourse on MoltBook cs.SE · 2026-05-08 · unverdicted · none · ref 35
AI-only technical discourse on MoltBook is coherent and organized around 12 themes led by security and trust, but it lacks the concrete code, runtime failures, and reproduction steps common in human GitHub discussions.
CircuitFormer: A Circuit Language Model for Analog Topology Design from Natural Language Prompt cs.AI · 2026-05-07 · unverdicted · none · ref 5
CircuitFormer is a 511M-parameter encoder-decoder model that generates analog circuit topologies from text prompts at 100% syntactic correctness and 83% functional success using a new subcircuit-mining tokenizer that keeps vocabulary size fixed at 512.
SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents cs.CR · 2026-05-05 · unverdicted · none · ref 34
SkCC compiles LLM skills via SkIR to achieve portability across agent frameworks, reduce adaptation effort from O(m×n) to O(m+n), and enforce security with reported gains in task success rates and token efficiency.
Led to Mislead: Adversarial Content Injection for Attacks on Neural Ranking Models cs.IR · 2026-05-02 · unverdicted · none · ref 42
CRAFT is a supervised LLM framework using retrieval-augmented generation, self-refinement, fine-tuning, and preference optimization to create fluent adversarial content that boosts target ranks in neural ranking models, outperforming baselines on MS MARCO and TREC benchmarks with cross-architecture
A Multi-View Media Profiling Suite: Resources, Evaluation, and Analysis cs.CL · 2026-05-02 · unverdicted · none · ref 212
Presents MBFC-2025 dataset and multi-view embeddings with fusion methods for media bias and factuality, reporting SOTA results on ACL-2020 and new benchmarks on MBFC-2025.
InvEvolve: Evolving White-Box Inventory Policies via Large Language Models with Performance Guarantees cs.LG · 2026-05-01 · unverdicted · none · ref 78 · 2 links
InvEvolve evolves white-box inventory policies from LLMs with statistical safety guarantees and outperforms classical and deep learning methods on synthetic and real retail data.
From Chatbots to Confidants: A Cross-Cultural Study of LLM Adoption for Emotional Support cs.CL · 2026-04-28 · unverdicted · none · ref 36
A cross-cultural survey finds LLM emotional support adoption ranges from 20% to 59% by country, with positive perceptions strongest among higher-SES, religious, married adults aged 25-44 and in English-speaking nations.
Reverse Constitutional AI: A Framework for Controllable Toxic Data Generation via Probability-Clamped RLAIF cs.CL · 2026-04-20 · unverdicted · none · ref 61
R-CAI inverts constitutional AI to automatically generate diverse toxic data for LLM red teaming, with probability clamping improving output coherence by 15% while preserving adversarial strength.
Code-Switching Information Retrieval: Benchmarks, Analysis, and the Limits of Current Retrievers cs.IR · 2026-04-19 · unverdicted · none · ref 46
Code-switching creates a fundamental performance bottleneck for multilingual retrievers, causing drops of up to 27% on new benchmarks CSR-L and CS-MTEB, with embedding divergence as the key cause and vocabulary expansion insufficient to fix it.
Bounded Autonomy: Controlling LLM Characters in Live Multiplayer Games cs.HC · 2026-04-06 · unverdicted · none · ref 16
Bounded autonomy is a new control architecture that makes LLM characters workable in live multiplayer games by combining interaction stability techniques, action grounding, and lightweight player steering, validated through deployment and analysis.
Retrieval Augmented Conversational Recommendation with Reinforcement Learning cs.IR · 2026-04-06 · unverdicted · none · ref 48
RAR retrieves candidate items from a 300k-movie corpus then uses LLM generation with RL feedback to produce context-aware recommendations that outperform baselines on benchmarks.
M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation cs.CL · 2024-02-05 · unverdicted · none · ref 80
M3-Embedding is a single model for multi-lingual, multi-functional, and multi-granular text embeddings trained via self-knowledge distillation that achieves new state-of-the-art results on multilingual, cross-lingual, and long-document retrieval benchmarks.
LIFT: Last-Mile Fine-Tuning for Table Explicitation cs.LG · 2026-05-13 · unverdicted · none · ref 18
LIFT pairs a pre-trained LLM for initial table extraction with a fine-tuned SLM for error repair, matching end-to-end SLM fine-tuning on TEDS while needing only 1,000 examples and gaining robustness.
Task-Adaptive Embedding Refinement via Test-time LLM Guidance cs.CL · 2026-05-12 · unverdicted · none · ref 35
Test-time LLM feedback refines query embeddings to deliver up to 25% relative gains on zero-shot literature search, intent detection, and related benchmarks.
Safety Context Injection: Inference-Time Safety Alignment via Static Filtering and Agentic Analysis cs.CR · 2026-05-12 · unverdicted · none · ref 15
Safety Context Injection prepends structured external risk reports via static or agentic analysis to lower attack success rates and toxicity in reasoning models on AdvBench and GPTFuzz benchmarks.
Adversarial SQL Injection Generation with LLM-Based Architectures cs.CR · 2026-05-11 · unverdicted · none · ref 36
RADAGAS-GPT4o achieves a 22.73% bypass rate against 10 WAFs, succeeding more against AI/ML-based firewalls than rule-based ones.
SkillRAE: Agent Skill-Based Context Compilation for Retrieval-Augmented Execution cs.CL · 2026-05-11 · unverdicted · none · ref 49
SkillRAE organizes skills into a graph and compiles compact, grounded contexts for LLM agents, yielding 11.7% gains on SkillsBench over prior RAE methods.
AgentCollabBench: Diagnosing When Good Agents Make Bad Collaborators cs.CL · 2026-05-09 · unverdicted · none · ref 39
AgentCollabBench shows that multi-agent reliability is limited by communication topology, with converging-DAG nodes causing synthesis bottlenecks that discard constraints and explain 7-40% of information loss variance.
CAR: Query-Guided Confidence-Aware Reranking for Retrieval-Augmented Generation cs.CL · 2026-05-06 · unverdicted · none · ref 10
CAR reranks documents in RAG by promoting those that increase generator confidence (via answer consistency sampling) and demoting those that decrease it, yielding NDCG@5 gains on BEIR datasets that correlate with F1 improvements.
The Infinite Mutation Engine? Measuring Polymorphism in LLM-Generated Offensive Code cs.CR · 2026-05-05 · unverdicted · none · ref 61 · 2 links
A single commercial LLM can cheaply generate large populations of behaviorally equivalent yet structurally diverse malware payloads.
LEAP: Layer-wise Exit-Aware Pretraining for Efficient Transformer Inference cs.LG · 2026-05-01 · unverdicted · none · ref 1
LEAP adds a layer-wise exit-aware constraint to standard distillation, reconciling it with early-exit mechanisms and delivering 1.61x wall-clock speedup on MiniLM at 0.95 threshold with 91.9% early exits by layer 7.
Kernel Affine Hull Machines for Compute-Efficient Query-Side Semantic Encoding cs.LG · 2026-05-01 · unverdicted · none · ref 35
Kernel Affine Hull Machines map lexical features to semantic embeddings via RKHS and least-mean-squares, outperforming adapters in reconstruction and retrieval metrics while reducing latency 8.5-fold on a legal benchmark.
Agentic AI for Substance Use Education: Integrating Regulatory and Scientific Knowledge Sources cs.CL · 2026-05-01 · conditional · none · ref 51
The authors built and expert-evaluated an agentic AI system integrating DEA regulatory data with dynamic scientific literature via RAG to provide accurate, context-sensitive substance use education, with mean Likert ratings of 4.18-4.35 and substantial rater agreement.
Learning to Route Queries to Heads for Attention-based Re-ranking with Large Language Models cs.IR · 2026-04-27 · conditional · none · ref 37
RouteHead trains a lightweight router to dynamically select optimal LLM attention heads per query for improved attention-based document re-ranking.
From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills cs.CL · 2026-04-27 · unverdicted · none · ref 20
SSL representation disentangles skill scheduling, structure, and logic using an LLM normalizer, improving skill discovery MRR@50 from 0.649 to 0.729 and risk assessment macro F1 from 0.409 to 0.509 over text baselines.
Towards Universal Tabular Embeddings: A Benchmark Across Data Tasks cs.LG · 2026-04-23 · unverdicted · none · ref 40
TEmBed benchmark shows that the best tabular embedding model depends on the specific task and the representation level (cell, row, column, or table).
Reducing Maintenance Burden in Behaviour-Driven Development: A Paraphrase-Robust Duplicate-Step Detector with a 1.1M-Step Open Benchmark cs.SE · 2026-04-22 · unverdicted · none · ref 23
A paraphrase-robust duplicate-step detector for Gherkin BDD suites, built on a new 1.1M-step public corpus, reports F1 scores up to 0.906 and estimates 893k eliminable step occurrences corpus-wide.
SCURank: Ranking Multiple Candidate Summaries with Summary Content Units for Enhanced Summarization cs.CL · 2026-04-21 · unverdicted · none · ref 31
SCURank ranks multiple summary candidates with Summary Content Units to outperform ROUGE and LLM-based methods in summarization distillation.
All Public Voices Are Equal, But Are Some More Equal Than Others to LLMs? cs.CY · 2026-04-19 · unverdicted · none · ref 78
LLMs produce lower-fidelity summaries of identical public comments when attributed to lower-status occupations like street vendors versus financial analysts, with inconsistent race effects and no gender effects.
A Case Study on the Impact of Anonymization Along the RAG Pipeline cs.CR · 2026-04-17 · unverdicted · none · ref 32
Anonymization placement in RAG—at the dataset or at the generated answer—creates observable differences in privacy protection versus response utility.
MetFuse: Figurative Fusion between Metonymy and Metaphor cs.CL · 2026-04-14 · unverdicted · none · ref 35
MetFuse provides the first dataset of 1,000 meaning-aligned quadruplets fusing literal, metonymic, metaphoric, and hybrid sentences, which augments training to boost metonymy and metaphor classification performance on benchmarks.
LAG-XAI: A Lie-Inspired Affine Geometric Framework for Interpretable Paraphrasing in Transformer Latent Spaces cs.CL · 2026-04-07 · unverdicted · none · ref 2
LAG-XAI treats paraphrasing as affine flows in semantic manifolds using Lie-inspired approximations, achieving AUC 0.7713 on paraphrase detection and 95.3% hallucination detection on HaluEval.
Does Pass Rate Tell the Whole Story? Evaluating Design Constraint Compliance in LLM-based Issue Resolution cs.SE · 2026-04-07 · unverdicted · none · ref 29
LLM agents resolve fewer than half of issues while satisfying design constraints despite passing tests, as shown by a benchmark of 495 issues and 1787 constraints from six repositories.
SemLink: A Semantic-Aware Automated Test Oracle for Hyperlink Verification using Siamese Sentence-BERT cs.SE · 2026-04-07 · unverdicted · none · ref 10
SemLink applies a Siamese SBERT model to detect semantic drift in hyperlinks, achieving 96% recall at 47.5 times the speed of GPT-5.2 using a new 60k-pair dataset.
DQA: Diagnostic Question Answering for IT Support cs.CL · 2026-04-07 · unverdicted · none · ref 10
DQA maintains persistent diagnostic state and aggregates retrievals at the root-cause level to reach 78.7% success on 150 enterprise IT scenarios versus 41.3% for standard multi-turn RAG while cutting average turns from 8.4 to 3.9.
Align then Train: Efficient Retrieval Adapter Learning cs.IR · 2026-04-03 · unverdicted · none · ref 17
A two-stage adapter method aligns query and document embedding spaces to improve dense retrieval for complex queries using lightweight encoders and few labels.
StarCoder 2 and The Stack v2: The Next Generation cs.SE · 2024-02-29 · accept · none · ref 260
StarCoder2-15B matches or beats CodeLlama-34B on code tasks despite being smaller, and StarCoder2-3B outperforms prior 15B models, with open weights and exact training data identifiers released.
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs cs.LG · 2024-02-22 · conditional · none · ref 104
REINFORCE-style variants outperform PPO, DPO, and RAFT in RLHF for LLMs by removing unnecessary PPO components and adapting the simpler method to LLM alignment characteristics.
Context Convergence Improves Answering Inferential Questions cs.CL · 2026-05-12 · unverdicted · none · ref 30
Passages made from high-convergence sentences improve LLM performance on inferential questions compared to cosine similarity selection.
Measuring Embedding Sensitivity to Authorial Style in French: Comparing Literary Texts with Language Model Rewritings cs.CL · 2026-05-11 · unverdicted · none · ref 18
Embeddings reliably capture authorial stylistic features in French literary texts, and these signals persist after LLM rewriting while showing model-specific patterns.
Exploring the Limits of Pruning: Task-Specific Neurons, Model Collapse, and Recovery in Task-Specific Large Language Models cs.CL · 2026-04-29 · unverdicted · none · ref 12
Selective pruning of low-activation neurons in task-specific LLMs preserves accuracy better than random pruning, but removing roughly 10% of highly selective neurons triggers total collapse, with fine-tuning recovering much of the lost performance.
Self-Awareness before Action: Mitigating Logical Inertia via Proactive Cognitive Awareness cs.AI · 2026-04-22 · unverdicted · none · ref 43
SABA improves LLM performance on detective puzzle benchmarks by recursively fusing information into a base state and using queries to resolve missing premises before concluding.
Cross-Model Consistency of AI-Generated Exercise Prescriptions: A Repeated Generation Study Across Three Large Language Models cs.CL · 2026-04-21 · conditional · none · ref 37
Three LLMs exhibit distinct consistency profiles in repeated exercise prescription generation, with GPT-4.1 producing unique but semantically stable outputs while Gemini 2.5 Flash achieves high similarity through text duplication.
Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps cs.CL · 2026-04-21 · unverdicted · none · ref 23
Four attention metrics enable logistic regression classifiers that detect hallucinations in SpeechLLMs with up to +0.23 PR-AUC gains over baselines on ASR and translation tasks.
Towards Scalable Lifelong Knowledge Editing with Selective Knowledge Suppression cs.AI · 2026-04-21 · unverdicted · none · ref 158
LightEdit enables scalable lifelong knowledge editing in LLMs via selective knowledge retrieval and probability suppression during decoding, outperforming prior methods on ZSRE, Counterfact, and RIPE while reducing training costs.
Semantic Entanglement in Vector-Based Retrieval: A Formal Framework and Context-Conditioned Disentanglement Pipeline for Agentic RAG Systems cs.AI · 2026-04-20 · unverdicted · none · ref 11
The paper introduces a measure of semantic entanglement in embeddings and a pipeline that improves Top-K retrieval precision from 32% to 82% on a healthcare knowledge base.
