
Advances in Neural Information Processing Systems

7 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary: background 1

citation-polarity summary: unclear 1

years: 2026 (7)

representative citing papers

Action Emergence from Streaming Intent

cs.RO · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

A new VLA model called SI uses a four-step chain-of-thought to derive driving intent and applies it via classifier-free guidance to a flow-matching trajectory generator, showing competitive Waymo scores and intent-controllable plans.

Targeted Tests for LLM Reasoning: An Audit-Constrained Protocol

cs.LG · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

Presents an audit-constrained protocol for targeted LLM reasoning evaluation using component grammar prompt variants and shows that Component-Adaptive Prompt Sampling does not outperform uniform sampling in audited yield.

Language models fail at extended rule following

cs.CL · 2026-05-03 · unverdicted · novelty 5.0

LLMs fail at extended counting of repeated characters due to finite internal states, with abrupt errors persisting across model scales and inference methods.

citing papers explorer

Showing 6 of 6 citing papers after filters.

  • Action Emergence from Streaming Intent cs.RO · 2026-05-12 · unverdicted · none · ref 1 · 2 links

    A new VLA model called SI uses a four-step chain-of-thought to derive driving intent and applies it via classifier-free guidance to a flow-matching trajectory generator, showing competitive Waymo scores and intent-controllable plans.

  • Stress-Testing the Reasoning Competence of LLMs With Proofs Under Minimal Formalism cs.LO · 2026-04-07 · unverdicted · none · ref 92

    ProofGrid is a new benchmark for LLM reasoning that uses machine-checkable proofs in minimal formal notation, revealing progress on basic tasks but major gaps in complex combinatorial and synthesis reasoning.

  • Targeted Tests for LLM Reasoning: An Audit-Constrained Protocol cs.LG · 2026-05-12 · unverdicted · none · ref 3 · 2 links

    Presents an audit-constrained protocol for targeted LLM reasoning evaluation using component grammar prompt variants and shows that Component-Adaptive Prompt Sampling does not outperform uniform sampling in audited yield.

  • Tool Calling is Linearly Readable and Steerable in Language Models cs.CL · 2026-05-08 · unverdicted · none · ref 18

    Tool identity is linearly readable and steerable in LLMs via mean activation differences, with 77-100% switch accuracy and error prediction from activation gaps.

  • Language models fail at extended rule following cs.CL · 2026-05-03 · unverdicted · none · ref 18

    LLMs fail at extended counting of repeated characters due to finite internal states, with abrupt errors persisting across model scales and inference methods.

  • A Case-Driven Multi-Agent Framework for E-Commerce Search Relevance cs.IR · 2026-05-07 · unverdicted · none · ref 9

    A case-driven multi-agent system automates the full pipeline of bad-case detection, annotation, and resolution for e-commerce search relevance using Annotator, Optimizer, and User agents plus supporting components.