
Future Lens: Anticipating Subsequent Tokens from a Single Hidden State

6 Pith papers cite this work. Polarity classification is still indexing.


Years: 2026 (6)

Verdicts: unverdicted (6)


representative citing papers

PRISM: Recovering Instruction Sets from Language Model Activations

cs.AI · 2026-06-08 · unverdicted · novelty 7.0

PRISM is a new activation-conditioned model that recovers full sets of simultaneous instructions from LLM hidden states via judge-guided GRPO training and outperforms prior activation-to-language methods on security-relevant tasks.

A framework for analyzing concept representations in neural models

cs.CL · 2026-05-02 · unverdicted · novelty 7.0

A new framework shows that concept subspaces are not unique and that estimator choice affects containment and disentanglement. LEACE works well but generalizes poorly. HuBERT encodes phone information as contained and disentangled from speaker information, whereas speaker information resists compact containment.

The State-Prediction Separation Hypothesis

cs.CL · 2026-07-01 · unverdicted · novelty 6.0

A two-stream Transformer variant that separates state storage from next-token prediction improves validation loss and downstream task performance by 2-3 points over standard Transformers.

citing papers explorer

Showing 3 of 3 citing papers after filters.

  • A framework for analyzing concept representations in neural models cs.CL · 2026-05-02 · unverdicted · none · ref 182


  • The State-Prediction Separation Hypothesis cs.CL · 2026-07-01 · unverdicted · none · ref 21


  • AERIC: Anticipatory Hidden-State Monitoring for Implicit Harmful Dialogue cs.CL · 2026-05-13 · unverdicted · none · ref 19

    AERIC uses a 387-parameter head on LLM hidden states for same-pass anticipatory detection of implicit harm, reporting AUROC gains on DiaSafety and Harmful Advice plus low-latency trigger rates on HarmBench and SocialHarmBench.