Time: A multi-level benchmark for temporal reasoning of LLMs in real-world scenarios

Shaohang Wei, Wei Li, Feifan Song, Wen Luo, Tianyi Zhuang, Haochen Tan, Zhijiang Guo, Houfeng Wang · 2025 · arXiv 2505.12891

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

TIME: Temporally Intelligent Meta-reasoning Engine for Context-Triggered Explicit Reasoning

cs.LG · 2026-01-08 · unverdicted · novelty 7.0

TIME trains LLMs to trigger compact, context-triggered reasoning via time tags and tick events, improving TIMEBench scores while cutting explicit reasoning tokens by an order of magnitude.

CryptoBench: A Dynamic Benchmark for Expert-Level Evaluation of LLM Agents in Cryptocurrency

cs.CL · 2025-11-29 · unverdicted · novelty 6.0

CryptoBench is a new dynamic benchmark for LLM agents in cryptocurrency that reveals a retrieval-prediction imbalance in model performance.

citing papers explorer

Showing 2 of 2 citing papers.

TIME: Temporally Intelligent Meta-reasoning Engine for Context-Triggered Explicit Reasoning

cs.LG · 2026-01-08 · unverdicted · none · ref 15

TIME trains LLMs to trigger compact, context-triggered reasoning via time tags and tick events, improving TIMEBench scores while cutting explicit reasoning tokens by an order of magnitude.

CryptoBench: A Dynamic Benchmark for Expert-Level Evaluation of LLM Agents in Cryptocurrency

cs.CL · 2025-11-29 · unverdicted · none · ref 18

CryptoBench is a new dynamic benchmark for LLM agents in cryptocurrency that reveals a retrieval-prediction imbalance in model performance.
