pith. sign in

One thousand and one pairs: A" novel" challenge for long-context language models.arXiv preprint arXiv:2406.16264

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

citation-role summary

background 1 dataset 1

citation-polarity summary

fields

cs.CL 5 cs.LG 1

representative citing papers

Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions

cs.CL · 2025-07-07 · unverdicted · novelty 7.0

MemoryAgentBench is a new multi-turn benchmark assessing four memory competencies in LLM agents—accurate retrieval, test-time learning, long-range understanding, and selective forgetting—showing that existing methods fall short.

LLMs Get Lost In Multi-Turn Conversation

cs.CL · 2025-05-09 · unverdicted · novelty 6.0

LLMs drop 39% in performance during multi-turn conversations due to premature assumptions and inability to recover from early errors.

citing papers explorer

Showing 6 of 6 citing papers.