International Conference on Learning Representations

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

NARRA-Gym for Evaluating Interactive Narrative Agents

cs.CL · 2026-05-08 · unverdicted · novelty 7.0

NARRA-Gym is an executable benchmark that generates complete interactive narrative episodes from emotional seeds and logs full model trajectories to expose gaps in coherence, adaptation, and personalization that static story tests miss.

Unsolvability Ceiling in Multi-LLM Routing: An Empirical Study of Evaluation Artifacts

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

Evaluation artifacts substantially inflate the measured unsolvability ceiling in multi-LLM routing, leading to distorted router training and overstated headroom.

PRISM: Preference-Aware Influence Function Based Data Selection Method for Efficient Fine-Tuning

cs.LG · 2026-05-20 · unverdicted · novelty 6.0

PRISM weights target examples by model preference to build an improved direction for influence-based data selection in LLM fine-tuning.

NeuroMAS: Multi-Agent Systems as Neural Networks with Joint Reinforcement Learning

cs.AI · 2026-05-16 · unverdicted · novelty 6.0

NeuroMAS reframes multi-agent language systems as neural architectures where LLM agents learn coordination via reinforcement learning rather than predefined roles.

The Generalized Turing Test: A Foundation for Comparing Intelligence

cs.AI · 2026-05-11 · unverdicted · novelty 6.0

The Generalized Turing Test defines relative intelligence as the inability of one agent to distinguish an imitator from the original through interaction.

Language models fail at extended rule following

cs.CL · 2026-05-03 · unverdicted · novelty 5.0

LLMs fail at extended counting of repeated characters due to finite internal states, with abrupt errors persisting across model scales and inference methods.

XekRung Technical Report

cs.CR · 2026-04-30 · unverdicted · novelty 3.0

XekRung achieves state-of-the-art performance on cybersecurity benchmarks among same-scale models via tailored data synthesis and multi-stage training while retaining strong general capabilities.

citing papers explorer

Showing 2 of 2 citing papers after filters.

NARRA-Gym for Evaluating Interactive Narrative Agents

cs.CL · 2026-05-08 · unverdicted · none · ref 1

Language models fail at extended rule following

cs.CL · 2026-05-03 · unverdicted · none · ref 2
