AssayBench is a new gene-ranking benchmark for phenotypic CRISPR screens that shows zero-shot generalist LLMs outperform both biology-specific LLMs and trainable baselines on adjusted nDCG.
Educational and psychological measurement , volume=
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
Introduces RevCI benchmark and IMPACT multi-agent framework for evidence-level contradiction detection and graded intensity scoring in peer reviews, distilled into efficient TIDE model.
PULSE stabilizes mmWave human pose estimation by screening Doppler motion prompts before injecting them into spatial magnitude reasoning.
citing papers explorer
-
AssayBench: An Assay-Level Virtual Cell Benchmark for LLMs and Agents
AssayBench is a new gene-ranking benchmark for phenotypic CRISPR screens that shows zero-shot generalist LLMs outperform both biology-specific LLMs and trainable baselines on adjusted nDCG.
-
When Reviews Disagree: Fine-Grained Contradiction Analysis in Scientific Peer Reviews
Introduces RevCI benchmark and IMPACT multi-agent framework for evidence-level contradiction detection and graded intensity scoring in peer reviews, distilled into efficient TIDE model.
-
Doppler Prompting for Stable mmWave-based Human Pose Estimation
PULSE stabilizes mmWave human pose estimation by screening Doppler motion prompts before injecting them into spatial magnitude reasoning.