A survey on multi-turn interaction capabilities of large language models

Chen Zhang, Xinyi Dai, Yaxiong Wu, Qu Yang, Yasheng Wang, Ruiming Tang, Yong Liu · 2025 · arXiv 2501.09959

7 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary: background 1, other 1

citation-polarity summary: background 1, unclear 1

representative citing papers

CRAB-Bench: Evaluating LLM Agents under Complex Task Dependencies and Human-aligned User Simulation

cs.CL · 2026-06-01 · unverdicted · novelty 7.0

CRAB-Bench and RUSE create a new evaluation framework for LLM agents on constraint-graph tasks with realistic human-like user behaviors, reporting 61% pass@1 for the best model and up to 57% further drops under RUSE.

ReCrit: Transition-Aware Reinforcement Learning for Scientific Critic Reasoning

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

ReCrit frames critic interaction as a correctness-transition problem and uses quadrant-based RL rewards to improve LLM performance on scientific reasoning benchmarks by rewarding corrections and robustness while penalizing sycophancy.

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

cs.CR · 2026-05-10 · unverdicted · novelty 6.0

MT-JailBench is a modular benchmark that standardizes evaluation of multi-turn jailbreaks to identify key success drivers and enable stronger combined attacks.

Online Adaptive Probabilistic Safety Certificate with Language Guidance

eess.SY · 2025-11-16 · unverdicted · novelty 5.0

A framework integrates user language and probabilistic environment estimates into adaptive safety certificates that guarantee long-term safety for stochastic systems via probabilistic invariance.

From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs

cs.IR · 2025-04-22 · unverdicted · novelty 5.0

The paper surveys human memory categories, maps them to LLM memory, and proposes a new three-dimension (object, form, time) categorization into eight quadrants to organize existing work and highlight open problems.

A Framework for Longitudinal Health AI Agents

cs.AI · 2026-04-13 · unverdicted · novelty 4.0

Proposes a multi-layer framework and agent architecture that operationalizes adaptation, coherence, continuity, and agency for longitudinal health AI agents.

MT-OSC: Path for LLMs that Get Lost in Multi-Turn Conversation

cs.CL · 2026-04-09

