How easily do irrelevant inputs skew the responses of large language models?

CoRR abs/2404 · 2024 · arXiv 2404.03302

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

CQC-RAG: Robust Retrieval-Augmented Generation via Cross-Query Consistency

cs.IR · 2026-06-11 · unverdicted · novelty 6.0

CQC-RAG improves RAG factuality by generating diverse equivalent queries, building query-specific contexts, and selecting answers via cross-query confidence stability, with reported gains of +4.76 pp EM on TriviaQA and +9.12 pp EM on MuSiQue.

CLORE: Content-Level Optimization for Reasoning Efficiency

cs.AI · 2026-05-21 · unverdicted · novelty 6.0

CLORE augments correct on-policy rollouts by deleting repetitive and irrelevant segments then optimizes with auxiliary DPO to improve accuracy-efficiency trade-off on math benchmarks.

Seirênes: Adversarial Self-Play with Evolving Distractions for LLM Reasoning

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

Seirênes trains LLMs via adversarial self-play to generate and overcome evolving distractions, producing gains of 7-10 points on math reasoning benchmarks and exposing blind spots in larger models.

Progressive Multimodal Search and Reasoning for Knowledge-Intensive Visual Question Answering

cs.CV · 2025-08-31 · unverdicted · novelty 6.0

PMSR progressively constructs structured reasoning trajectories with dual-scope queries and compositional reasoning to improve knowledge acquisition and answer accuracy in knowledge-intensive VQA.

Exploring a Gamified Personality Assessment Method through Interaction with LLM Agents Embodying Different Personalities

cs.HC · 2025-07-05 · unverdicted · novelty 6.0

A gamified system with multiple LLM agents of varied personalities gathers interaction data to produce more effective and interpretable Big Five personality assessments than single-context methods.

E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning

cs.CL · 2024-09-10 · unverdicted · novelty 5.0

E2LLM uses encoder-based soft prompt compression for long contexts to improve LLM reasoning on tasks like summarization and QA while maintaining efficiency.

Robust Audio-Text Retrieval via Cross-Modal Attention and Hybrid Loss

cs.CL · 2026-04-25 · unverdicted · novelty 4.0

A cross-modal attention refinement module plus hybrid loss improves robustness of audio-text retrieval on noisy and long-form audio.

From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap

cs.SE · 2024-10-28 · unverdicted · novelty 4.0

A semi-structured thematic synthesis identifies core challenges in FM selection, alignment, prompting, orchestration, testing, deployment, and cross-cutting concerns like observability for production-ready FMware.

citing papers explorer

Showing 8 of 8 citing papers.

CQC-RAG: Robust Retrieval-Augmented Generation via Cross-Query Consistency

cs.IR · 2026-06-11 · unverdicted · none · ref 28

CLORE: Content-Level Optimization for Reasoning Efficiency

cs.AI · 2026-05-21 · unverdicted · none · ref 48

Seirênes: Adversarial Self-Play with Evolving Distractions for LLM Reasoning

cs.AI · 2026-05-12 · unverdicted · none · ref 66

Progressive Multimodal Search and Reasoning for Knowledge-Intensive Visual Question Answering

cs.CV · 2025-08-31 · unverdicted · none · ref 48

Exploring a Gamified Personality Assessment Method through Interaction with LLM Agents Embodying Different Personalities

cs.HC · 2025-07-05 · unverdicted · none · ref 137

E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning

cs.CL · 2024-09-10 · unverdicted · none · ref 49

Robust Audio-Text Retrieval via Cross-Modal Attention and Hybrid Loss

cs.CL · 2026-04-25 · unverdicted · none · ref 8

From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap

cs.SE · 2024-10-28 · unverdicted · none · ref 112
