pith. machine review for the scientific record.

What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

verdicts

unverdicted 9

roles

background 1

polarities

unclear 1

representative citing papers

Deep Minds and Shallow Probes

cs.LG · 2026-05-12 · unverdicted · novelty 7.0

Symmetry under affine reparameterizations of hidden coordinates selects a unique hierarchy of shallow coordinate-stable probes and a probe-visible quotient for cross-model transfer.

What Do EEG Foundation Models Capture from Human Brain Signals?

cs.AI · 2026-05-12 · unverdicted · novelty 7.0

EEG foundation models encode many traditional hand-crafted features like frequency power, recovering on average 79% of their advantage over random baselines on clinical tasks while leaving residuals on harder ones.

A framework for analyzing concept representations in neural models

cs.CL · 2026-05-02 · unverdicted · novelty 7.0

A new framework shows that concept subspaces are not unique and that estimator choice affects both containment and disentanglement; LEACE erases concepts well but generalizes poorly, and HuBERT encodes phone information as contained in a subspace and disentangled from speaker information, while speaker information resists compact containment.

Instructions Shape Production of Language, not Processing

cs.CL · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

Instructions trigger a production-centered mechanism in language models, with task-specific information stable in input tokens but varying strongly in output tokens and correlating with behavior.

Conceptors for Semantic Steering

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

Conceptors as soft projection matrices from bipolar activations offer a multidimensional, compositional, and geometrically principled method for semantic steering in LLMs that outperforms single-vector baselines in multi-dimensional subspaces.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

cs.CL · 2022-11-09 · unverdicted · novelty 6.0

BLOOM is a 176B-parameter open-access multilingual language model trained on the ROOTS corpus that achieves competitive performance on benchmarks, with improved results after multitask prompted finetuning.

Probing Classifiers: Promises, Shortcomings, and Advances

cs.CL · 2021-02-24 · unverdicted · novelty 3.0

Probing classifiers are a common but limited method for analyzing linguistic knowledge in neural NLP models, and this review outlines their promises, methodological shortcomings, and recent advances.
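The probing methodology running through these citing papers can be stated in a few lines: train a simple classifier on frozen embeddings and read off how decodable a property is from held-out accuracy. A minimal sketch with synthetic stand-in data (the embeddings, labels, and property here are illustrative, not drawn from any of the papers above):

```python
# Minimal probing-classifier sketch. Assumes frozen sentence embeddings
# are available as an (n_samples, dim) array; here they are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for frozen model embeddings: 200 vectors of dimension 32.
embeddings = rng.normal(size=(200, 32))

# Stand-in linguistic property (e.g. a sentence-length bucket), made
# linearly recoverable from one coordinate so the probe has signal.
labels = (embeddings[:, 0] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.25, random_state=0
)

# The probe itself: a linear classifier trained on top of the frozen
# representations. High held-out accuracy suggests the property is
# linearly decodable from the embeddings; it does not by itself show
# that the model *uses* the property (a standard caveat in this area).
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = probe.score(X_test, y_test)
print(f"probe accuracy: {accuracy:.2f}")
```

In practice the probe is deliberately kept simple (linear or a small MLP) and compared against random-embedding and majority-class baselines, since a sufficiently powerful probe can decode almost anything.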

citing papers explorer

Showing 9 of 9 citing papers.

  • Deep Minds and Shallow Probes cs.LG · 2026-05-12 · unverdicted · none · ref 2

  • What Do EEG Foundation Models Capture from Human Brain Signals? cs.AI · 2026-05-12 · unverdicted · none · ref 31

  • A framework for analyzing concept representations in neural models cs.CL · 2026-05-02 · unverdicted · none · ref 57

  • Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling cs.LG · 2026-05-12 · unverdicted · none · ref 20

    A tabular foundation model with LLM-as-Observer features predicts AI agent decisions in controlled games, outperforming baselines by 4 AUC points and achieving 14% lower error at K=16 interactions.

  • Instructions Shape Production of Language, not Processing cs.CL · 2026-05-11 · unverdicted · none · ref 11 · 2 links

  • Conceptors for Semantic Steering cs.LG · 2026-05-06 · unverdicted · none · ref 4

  • Beyond Decodability: Reconstructing Language Model Representations with an Encoding Probe cs.CL · 2026-05-01 · unverdicted · none · ref 11

    An encoding probe reconstructs transformer representations from acoustic, phonetic, syntactic, lexical and speaker features, showing independent syntactic/lexical contributions and training-dependent speaker effects.

  • BLOOM: A 176B-Parameter Open-Access Multilingual Language Model cs.CL · 2022-11-09 · unverdicted · none · ref 221

  • Probing Classifiers: Promises, Shortcomings, and Advances cs.CL · 2021-02-24 · unverdicted · none · ref 19
