Xi Su

Identifiers

  • name variant Xi Su 0.60 · backfill

Papers (4)

  1. LoHoSearch: Benchmarking Long-Horizon Search Agents Beyond the Human Difficulty Ceiling · cs.CL · 2026 · author #6
  2. MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft · cs.CL · 2026 · author #6
  3. VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions · cs.AI · 2026 · author #9
  4. AJ-Bench: Benchmarking Agent-as-a-Judge for Environment-Aware Evaluation · cs.AI · 2026 · author #7

Mentions

  • 2606.12837 · #6 · arxiv_oai · confidence 0.70 · Xi Su
  • 2605.30931 · #6 · arxiv_oai · confidence 0.70 · Xi Su
  • 2605.27141 · #9 · arxiv_oai · confidence 0.70 · Xi Su

Frequent Coauthors