LoFa is a new benchmark and LFR@k metric for measuring LLM resistance to sustained logical fallacy attacks via generated question-argument pairs and debate simulations.
The Earth is Flat because...: Investigating LLM s' Belief towards Misinformation via Persuasive Conversation
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
MACR adaptively assesses LLM confidence via semantic entropy then applies inductive multi-agent reasoning with rule-induction, conflict-analysis, and resolution agents to handle unreliable parametric and contextual knowledge.
Introduces BeliefTrack benchmark diagnosing three CBM failures in LLMs and shows RL with belief-state rewards cuts failure rates by 70.9% while representation steering cuts them by 46.1%.
citing papers explorer
-
Navigating Unreliable Parametric and Contextual Knowledge: Explicit Knowledge Conflict Resolution for LLM Inference
MACR adaptively assesses LLM confidence via semantic entropy then applies inductive multi-agent reasoning with rule-induction, conflict-analysis, and resolution agents to handle unreliable parametric and contextual knowledge.
-
When Should Models Change Their Minds? Contextual Belief Management in Large Language Models
Introduces BeliefTrack benchmark diagnosing three CBM failures in LLMs and shows RL with belief-state rewards cuts failure rates by 70.9% while representation steering cuts them by 46.1%.