Pith Number

pith:I452AMLS

pith:2023:I452AMLSZVTA4RBYR335OPRJRE
not attested · not anchored · not stored · refs resolved

Linear Representations of Sentiment in Large Language Models

Atticus Geiger, Curt Tigges, Neel Nanda, Oskar John Hollinsworth

Sentiment in large language models is captured by one direction in activation space, with positive and negative at opposite poles.

arxiv:2310.15154 v1 · 2023-10-23 · cs.LG · cs.AI · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{I452AMLSZVTA4RBYR335OPRJRE}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
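The page does not specify the merge algorithm, so the following is only a minimal sketch of what "deterministic" could mean here: any mirror that holds the same set of events derives the same state, regardless of the order in which it received them. The event fields (`ts`, `field`, `value`), the content-hash tie-breaker, and the last-writer-wins rule are all hypothetical, and signature checking is omitted.

```python
import hashlib
import json


def event_id(event):
    """Content hash of an event (hypothetical scheme): dedup key and tie-breaker."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def merge(*event_lists):
    """Fold events into a state dict that does not depend on input order."""
    # Deduplicate by content hash so overlapping mirrors do not double-apply events.
    unique = {event_id(e): e for events in event_lists for e in events}
    # Total order: timestamp first, then content hash to break ties.
    # Same-format ISO 8601 UTC strings sort chronologically as plain strings.
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["ts"], kv[0]))
    state = {}
    for _, event in ordered:
        state[event["field"]] = event["value"]  # last writer wins (assumption)
    return state


# Made-up events held by two mirrors with partial overlap.
e1 = {"ts": "2026-01-01T00:00:00Z", "field": "k", "value": "v1"}
e2 = {"ts": "2026-01-02T00:00:00Z", "field": "k", "value": "v2"}
e3 = {"ts": "2026-01-03T00:00:00Z", "field": "other", "value": "x"}
mirror_a = [e1, e2]
mirror_b = [e3, e2]

state_ab = merge(mirror_a, mirror_b)
state_ba = merge(mirror_b, mirror_a)
print(state_ab == state_ba)
```

Sorting on a total order before folding is what makes the result independent of arrival order; a real bundle would also need to verify each event's signature before applying it.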

Claims

C1 strongest claim

Sentiment is represented linearly: a single direction in activation space mostly captures the feature across a range of tasks, with one extreme for positive and the other for negative.

C2 weakest assumption

That the identified direction is the primary and stable representation of sentiment rather than one of several correlated directions that happen to align on the chosen datasets and models.

C3 one-line summary

Sentiment is represented as a single linear direction in LLM activation space that is causally relevant across tasks and is summarized at punctuation and names, not only at charged words.
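To make the claims above concrete, here is a minimal sketch of what "a single direction with positive and negative at opposite poles" can look like. It uses a difference of class means on synthetic vectors, which is one common way to estimate such a direction and is not necessarily the method used in the paper. The data, the dimension, and every function name are assumptions made for illustration; no model activations are involved.

```python
import random


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    norm = dot(v, v) ** 0.5
    return [a / norm for a in v]


def sentiment_direction(pos_acts, neg_acts):
    """Unit vector from the negative-class mean to the positive-class mean."""
    mu_pos, mu_neg = mean_vector(pos_acts), mean_vector(neg_acts)
    return unit([p - q for p, q in zip(mu_pos, mu_neg)])


def ablate(act, direction, center):
    """Remove the component of (act - center) that lies along direction."""
    coeff = dot([a - c for a, c in zip(act, center)], direction)
    return [a - coeff * d for a, d in zip(act, direction)]


# Synthetic stand-in for activations: one hidden axis plus small Gaussian noise.
rng = random.Random(0)
DIM = 8
true_dir = unit([rng.gauss(0, 1) for _ in range(DIM)])


def fake_activation(label):
    noise = [rng.gauss(0, 0.1) for _ in range(DIM)]
    return [label * 2.0 * t + e for t, e in zip(true_dir, noise)]


pos = [fake_activation(+1) for _ in range(50)]
neg = [fake_activation(-1) for _ in range(50)]

direction = sentiment_direction(pos, neg)
center = mean_vector(pos + neg)


def score(act):
    """Signed position of an activation along the estimated direction."""
    return dot([a - c for a, c in zip(act, center)], direction)


pos_scores = [score(a) for a in pos]
neg_scores = [score(a) for a in neg]
ablated_scores = [score(ablate(a, direction, center)) for a in pos + neg]

print(min(pos_scores) > 0 > max(neg_scores))  # classes land on opposite sides
print(max(abs(s) for s in ablated_scores))    # remaining signal along the direction
```

Projecting onto the direction gives a signed score per example, and ablating the direction removes that score. A causal test on a real model would go further and measure how the model's outputs change after the ablation, which this toy example does not attempt.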

References

122 extracted · 122 resolved · 1 Pith anchor


Formal links

2 machine-checked theorem links

Cited by

23 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:52.529178Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

473ba03172cd660e44388ef7d73e29892ff73fd0832dd623106b5271a5755b36

Aliases

arxiv: 2310.15154 · arxiv_version: 2310.15154v1 · doi: 10.48550/arxiv.2310.15154 · pith_short_12: I452AMLSZVTA · pith_short_16: I452AMLSZVTA4RBY · pith_short_8: I452AMLS
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/I452AMLSZVTA4RBYR335OPRJRE \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 473ba03172cd660e44388ef7d73e29892ff73fd0832dd623106b5271a5755b36
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "ad633a244a79a763073ea1df51bc22044cae72f9b828905346cab0a22cbf0868",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2023-10-23T17:55:31Z",
    "title_canon_sha256": "021e9c596b91e444484c14a187a7d3dc84869cdd4b13f68ee4532b1dffe16db7"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2310.15154",
    "kind": "arxiv",
    "version": 1
  }
}
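The canonical record above can also be hashed offline, without calling the API. The sketch below repeats the serialization settings from the shell one-liner (`sort_keys=True`, compact separators, `ensure_ascii=False`, UTF-8 encoding) and compares the digest with the canonical hash listed on this page. The function name `canonical_sha256` is an assumption for illustration, and whether the digests match depends on the record having been copied verbatim.

```python
import hashlib
import json


def canonical_sha256(record):
    """SHA-256 over the compact, key-sorted JSON form used by the shell one-liner."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record as printed on this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "ad633a244a79a763073ea1df51bc22044cae72f9b828905346cab0a22cbf0868",
    "cross_cats_sorted": ["cs.AI", "cs.CL"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2023-10-23T17:55:31Z",
    "title_canon_sha256": "021e9c596b91e444484c14a187a7d3dc84869cdd4b13f68ee4532b1dffe16db7"
  },
  "schema_version": "1.0",
  "source": {"id": "2310.15154", "kind": "arxiv", "version": 1}
}
"""
EXPECTED = "473ba03172cd660e44388ef7d73e29892ff73fd0832dd623106b5271a5755b36"

record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted and whitespace is fixed before hashing, the digest does not depend on how the JSON was indented or on the key order in the file it was loaded from.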