TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios
2 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: 2 unverdicted
Citing papers
- TIME: Temporally Intelligent Meta-reasoning Engine for Context-Triggered Explicit Reasoning
  TIME trains LLMs to trigger compact, context-triggered reasoning via time tags and tick events, improving TIMEBench scores while cutting explicit reasoning tokens by an order of magnitude.
- CryptoBench: A Dynamic Benchmark for Expert-Level Evaluation of LLM Agents in Cryptocurrency
  CryptoBench is a new dynamic benchmark for LLM agents in cryptocurrency that reveals a retrieval-prediction imbalance in model performance.