TasTe: Teaching Large Language Models to Translate through Self-Reflection

Wang, Yutong, Zeng, Jiali, Liu, Xuebo, Meng, Fandong, Zhou, Jie, Zhang, Min · 2024 · DOI 10.18653/v1/2024.acl-long.333

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Wait, am I Being Fair? Characterizing Deductive Stereotyping and Mitigating It with Fair-GCG

cs.CL · 2026-06-30 · unverdicted · novelty 6.0

The paper characterizes deductive stereotyping in LLMs and introduces Fair-GCG to discover injection phrases that improve fairness across benchmarks, reasoning, and real-world tasks.

Customer-Agent: Overcoming Context Limitations in Ultra-Long Shopping Trajectories via Tool-Augmented Agents and RLVR

cs.CL · 2026-06-06 · unverdicted · novelty 6.0

Introduces ShopTrajQA long-context benchmark and an RLVR-trained tool-augmented agent that bypasses LLM context limits by external file storage and code-based retrieval for shopping trajectories.

MADE: Beyond Scoring via a Multilingual Agentic Diagnosing Engine for Fine-Grained Evaluation Insights

cs.CL · 2026-06-05 · unverdicted · novelty 6.0

MADE is a new multilingual agentic diagnosing engine that produces higher-quality diagnostic reports (47% better than baseline) on a large-scale evaluation substrate covering 33 model families and 26 languages.

citing papers explorer

Wait, am I Being Fair? Characterizing Deductive Stereotyping and Mitigating It with Fair-GCG
cs.CL · 2026-06-30 · unverdicted · none · ref 52

Customer-Agent: Overcoming Context Limitations in Ultra-Long Shopping Trajectories via Tool-Augmented Agents and RLVR
cs.CL · 2026-06-06 · unverdicted · none · ref 76

MADE: Beyond Scoring via a Multilingual Agentic Diagnosing Engine for Fine-Grained Evaluation Insights
cs.CL · 2026-06-05 · unverdicted · none · ref 11
