JECS aggregates per-model conformal p-values via their maximum and reconstructs a conservative envelope of the max-p null distribution to select benchmarks with global contamination rate control.
ACM transactions on intelligent systems and technology , volume=
8 Pith papers cite this work. Polarity classification is still indexing.
years
2026 8representative citing papers
ClawForge is a generator framework that creates reproducible executable benchmarks for command-line agents under state conflict, with ClawForge-Bench showing frontier models reach at most 45.3% strict accuracy and that state inspection drives most performance gaps.
Graphlets mined as structural tokens improve zero-shot inductive and transductive link prediction in knowledge graph foundation models across 51 diverse graphs.
TAPE applies temporal-aware token pruning with smoothing, reselection, and timestep scheduling to speed up video diffusion models while preserving visual fidelity and coherence.
STK-Adapter adds Spatial-Temporal MoE, Event-Aware MoE, and Cross-Modality Alignment MoE to integrate evolving TKG graphs and event chains into LLMs, reducing information loss and improving extrapolation performance over prior methods.
citing papers explorer
-
Provable Joint Decontamination for Benchmarking Multiple Large Language Models
JECS aggregates per-model conformal p-values via their maximum and reconstructs a conservative envelope of the max-p null distribution to select benchmarks with global contamination rate control.
-
ClawForge: Generating Executable Interactive Benchmarks for Command-Line Agents
ClawForge is a generator framework that creates reproducible executable benchmarks for command-line agents under state conflict, with ClawForge-Bench showing frontier models reach at most 45.3% strict accuracy and that state inspection drives most performance gaps.
-
Graphlets as Building Blocks for Structural Vocabulary in Knowledge Graph Foundation Models
Graphlets mined as structural tokens improve zero-shot inductive and transductive link prediction in knowledge graph foundation models across 51 diverse graphs.
-
Temporal Aware Pruning for Efficient Diffusion-based Video Generation
TAPE applies temporal-aware token pruning with smoothing, reselection, and timestep scheduling to speed up video diffusion models while preserving visual fidelity and coherence.
-
STK-Adapter: Incorporating Evolving Graph and Event Chain for Temporal Knowledge Graph Extrapolation
STK-Adapter adds Spatial-Temporal MoE, Event-Aware MoE, and Cross-Modality Alignment MoE to integrate evolving TKG graphs and event chains into LLMs, reducing information loss and improving extrapolation performance over prior methods.
- When AI Takes Sides on Questions of Faith: Persistent Asymmetries in AI-Mediated Faith Guidance
- VIDA: A dataset for Visually Dependent Ambiguity in Multimodal Machine Translation
- Dynamics of Cognitive Heterogeneity: Investigating Behavioral Biases in Multi-Stage Supply Chains with LLM-Based Simulation