Does tone change the answer? Evaluating prompt politeness effects on modern LLMs: GPT, Gemini, LLaMA

Hanyu Cai, Binqi Shen, Lier Jin, Lan Hu, Xiaojing Fan · 2025 · arXiv 2512.12812

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Recursive Self-Evolving Agents via Held-Out Selection

cs.AI · 2026-06-17 · unverdicted · novelty 6.0

RSEA adds a strict held-out keep-better gate to recursive self-evolution of agent artifacts, yielding monotone-safe gains or parity with the base ReAct agent on ALFWorld, GAIA, τ-bench, and WebShop.

When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems

cs.SE · 2026-05-30 · unverdicted · novelty 6.0

About 18.2% of structurally flagged skill pairs represent genuine compositional safety risks in agent skill registries, with exploitation gated by host model behavior.

Bridging Reasoning Trajectories in On-Policy Distillation via Near-Future Guidance

cs.CL · 2026-05-29 · unverdicted · novelty 6.0

TOPD improves on-policy distillation for LLM reasoning by using near-future guidance to identify divergent states, raising average accuracy from 47.8% to 52.2% on math benchmarks including AIME24 and AIME25.

Unbiased Diffusion Variational Inversion via Principled Posterior Matching

cs.CV · 2026-05-24 · unverdicted · novelty 6.0 · 2 refs

PPM derives a tractable gradient for exact KL optimization in diffusion variational inversion to achieve unbiased posterior matching without heuristic approximations.

ConsisVLA-4D: Advancing Spatiotemporal Consistency in Efficient 3D-Perception and 4D-Reasoning for Robotic Manipulation

cs.RO · 2026-05-06 · unverdicted · novelty 6.0

ConsisVLA-4D adds cross-view semantic alignment, cross-object geometric fusion, and cross-scene dynamic reasoning to VLA models, delivering 21.6% and 41.5% gains plus 2.3x and 2.4x speedups on LIBERO and real-world tasks.

Toxic HallucinAItions: Perturbing Prompts and Tracing LLM Circuits

cs.CL · 2026-05-29 · unverdicted · novelty 5.0

Toxic prompt perturbations reduce LLM factual accuracy on three benchmarks and selectively amplify perturbation-sensitive nodes in attribution graphs.

Reflect-Guard: Enhancing LLM Safeguards against Adversarial Prompts via Logical Self-Reflection

cs.CR · 2026-05-24 · unverdicted · novelty 5.0

Reflect-Guard fine-tunes Llama-Guard-3-8B with distilled self-reflections to raise F1 on WildGuardTest from 0.770 to 0.842 and cut JailbreakBench attack success from 10.3% to 1.8%.

citing papers explorer

Showing 7 of 7 citing papers.

Recursive Self-Evolving Agents via Held-Out Selection · cs.AI · 2026-06-17 · unverdicted · none · ref 2
When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems · cs.SE · 2026-05-30 · unverdicted · none · ref 40
Bridging Reasoning Trajectories in On-Policy Distillation via Near-Future Guidance · cs.CL · 2026-05-29 · unverdicted · none · ref 110
Unbiased Diffusion Variational Inversion via Principled Posterior Matching · cs.CV · 2026-05-24 · unverdicted · none · ref 4 · 2 links
ConsisVLA-4D: Advancing Spatiotemporal Consistency in Efficient 3D-Perception and 4D-Reasoning for Robotic Manipulation · cs.RO · 2026-05-06 · unverdicted · none · ref 8
Toxic HallucinAItions: Perturbing Prompts and Tracing LLM Circuits · cs.CL · 2026-05-29 · unverdicted · none · ref 5
Reflect-Guard: Enhancing LLM Safeguards against Adversarial Prompts via Logical Self-Reflection · cs.CR · 2026-05-24 · unverdicted · none · ref 14
