pith. machine review for the scientific record.

arxiv: 2604.19052 · v1 · submitted 2026-04-21 · 💻 cs.CL

Recognition: unknown

Cell-Based Representation of Relational Binding in Language Models

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 02:16 UTC · model grok-4.3

classification 💻 cs.CL
keywords relational binding · cell-based representation · large language models · activation patching · entity tracking · discourse understanding · linear subspace · partial least squares

The pith

Large language models bind entities to relations by retrieving attributes from cells in a low-dimensional activation subspace.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how large language models keep track of which attributes belong to which entities and relations across sentences in a discourse. It proposes that models store this information in a cell-based binding representation consisting of a linear subspace where each cell is assigned to a specific entity-relation index pair. The authors locate this subspace by decoding the indices from activations at attribute tokens using partial least squares regression, observe a consistent grid geometry, and demonstrate cross-context transfer through translation vectors. Activation patching experiments then show that altering the subspace changes relational predictions and that disrupting it harms performance. If this account holds, it supplies a mechanistic explanation for how models achieve relational coherence without separate memory components.
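
A minimal sketch of that decoding step, for concreteness. This is not the authors' code: the array names, sizes, and random placeholder data are assumptions standing in for real attribute-token activations and their index annotations.

    # Sketch of index decoding with Partial Least Squares (hypothetical data).
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_tokens, d_model = 240, 4096                 # hypothetical sizes
    acts = rng.normal(size=(n_tokens, d_model))   # stand-in for layer activations
    ei = rng.integers(1, 4, size=n_tokens)        # annotated entity index per token
    ri = rng.integers(1, 5, size=n_tokens)        # annotated relation index per token
    Y = np.stack([ei, ri], axis=1).astype(float)

    X_tr, X_te, Y_tr, Y_te = train_test_split(acts, Y, random_state=0)
    pls = PLSRegression(n_components=2)           # component count is a free parameter
    pls.fit(X_tr, Y_tr)
    print("held-out R^2:", pls.score(X_te, Y_te))

On real activations the paper reports high fitness; on the random placeholder above the score sits near zero, which is effectively the random-label control its figures include. Projecting activations onto the top PLS components (pls.transform) is what yields the grid-like visualizations.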

Core claim

LLMs encode discourse-level relational binding via a Cell-based Binding Representation (CBR): a low-dimensional linear subspace in which each cell corresponds to an entity-relation index pair, and bound attributes are retrieved from the corresponding cell during inference. Using controlled multi-sentence data with entity and relation indices, the subspace is identified by decoding these indices from attribute-token activations with Partial Least Squares regression. The indices form a grid-like geometry in the projected space, and context-specific CBR representations are related by translation vectors in activation space. Activation patching shows that manipulating this subspace systematically changes relational predictions and that perturbing it disrupts performance, providing causal evidence that the models rely on the cell structure for binding.
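
To make the intervention half of the claim concrete, here is a hedged sketch of subspace-level activation patching. It assumes a Llama-style Hugging Face model, a learned basis W for the candidate subspace, and cached donor activations; the function name, layer path, and shapes are illustrative assumptions, not the paper's implementation.

    # Sketch: swap the CBR-subspace component of one token's residual stream
    # with the component from a donor run, leaving the rest untouched.
    import torch

    def patch_cbr_subspace(model, layer_idx, W, source_acts, token_pos):
        Q, _ = torch.linalg.qr(W)        # orthonormalize the (d_model x k) basis
        P = Q @ Q.T                      # projector onto the subspace

        def hook(module, args, output):
            h = output[0] if isinstance(output, tuple) else output
            donor = source_acts[token_pos]            # (d_model,) donor activation
            # keep the off-subspace part; splice in the donor's subspace part
            h[:, token_pos] = h[:, token_pos] - h[:, token_pos] @ P + donor @ P
            return output

        # layer path assumes a Llama-style HF model; adjust for other families
        return model.model.layers[layer_idx].register_forward_hook(hook)

If the core claim holds, swapping only this low-dimensional component should move the predicted attribute to the donor cell while leaving unrelated predictions intact.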

What carries the argument

Cell-based Binding Representation (CBR): a low-dimensional linear subspace of model activations divided into cells, each holding information for one entity-relation index pair from which attributes are retrieved.
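
Restated as a toy data structure, the claim is that binding behaves like a grid of cells keyed by index pairs. The snippet below is purely illustrative (the paper's cells are regions of a linear activation subspace, not a Python dict); the attributes are borrowed from the worked example in Figure 73.

    # Toy restatement of the CBR claim: a grid of cells keyed by (ei, ri).
    cells = {
        (1, 1): "Australia",   # table, manufactured-in (cf. Figure 73)
        (1, 2): "Italy",       # table, designed-in
        (2, 2): "France",      # brush, designed-in
    }

    def retrieve(ei: int, ri: int) -> str:
        # the retrieval the paper argues happens inside the subspace
        return cells[(ei, ri)]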

If this is right

  • Entity and relation indices remain linearly decodable from activations inside the identified subspace across domains and model families.
  • The subspace exhibits a stable grid-like geometry that supports consistent binding.
  • Representations of the same bindings shift between contexts by fixed translation vectors in activation space (see the sketch after this list).
  • Targeted perturbation of the subspace produces predictable changes in the model's relational outputs.
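
The third bullet is directly checkable with the objects from the decoding sketch above. A minimal version, assuming the paper's translation vector is well approximated by a difference of context means (an assumption on our part):

    # Sketch: score a PLS decoder trained on context c1 against context c2
    # after shifting c2 activations by the mean-difference translation vector.
    def cross_context_r2(pls_c1, acts_c1, acts_c2, Y_c2):
        delta = acts_c2.mean(axis=0) - acts_c1.mean(axis=0)   # translation vector
        return pls_c1.score(acts_c2 - delta, Y_c2)            # R^2 after translation

High R² here, paired with low R² when delta is omitted, would be the cross-context transfer pattern the bullet describes.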

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If the cell structure is general, models could improve handling of long discourses simply by allocating more dimensions to this subspace.
  • The translation vectors point to a possible route for transferring learned bindings to new contexts without retraining the entire model.
  • Interventions that edit or enlarge this subspace might be used to debug or strengthen relational reasoning in existing systems.

Load-bearing premise

That linear decodability of entity-relation indices from the subspace, combined with the effects of activation patching, shows the model actually uses this cell structure for binding in ordinary inference, rather than the subspace being a correlated side effect of the controlled experimental data.

What would settle it

The claim would be falsified if, on a new collection of natural, unscripted discourses, the entity-relation indices could not be decoded, or patching the subspace produced no change in relational predictions.

Figures

Figures reproduced from arXiv: 2604.19052 by Benjamin Heinzerling, Kentaro Inui, Qin Dai.

Figure 1
Overview of our Cell-based Binding Representation (CBR): (a) discourse annotated with entity and …
Figure 2
Top: PLS goodness-of-fit when predicting entity and relation indices ei, ri from Llama3-8B-Instruct attribute activations across three domains. For comparison, we also fit a Principal Component Analysis (PCA) regression and an Independent Component Analysis (ICA) regression, and include random-label controls. Bottom: Visualization of attribute activations projected onto the top two PLS components, showing a grid-like …
Figure 3
Layer-wise and component-wise analysis of …
Figure 4
Logit landscape of attribute predictions …
Figure 6
Activation patching via Relation-index (i.e., ri) …
Figure 7
Cosine similarity heatmaps of attribute representations …
Figure 8
Each cell shows the R² fitness score obtained from Llama3-8B-Instruct. The projection matrix learned from one context (column) is used to predict the index information of another context (row).
Figure 9
(a) Samples of the Ablated and Shuffled datasets from …
Figure 10
Decoding performance of ei, ri from activations of Llama3-8B-Instruct using linear subspace methods. Each subplot shows how accurately entity and relation indices can be predicted as the dimensionality of the projected subspace increases for a given discourse context. The Y-axis indicates the fitness score (R²), and the X-axis shows the number of components used for the PLS, ICA, and PCA projections.
Figure 11
Decoding performance of ei, ri from activations of Qwen3-8B using linear subspace methods. Qwen3-8B shows consistency with Llama3-8B-Instruct, indicating that the CBR subspace emerges across model families and may reflect a general property of activations in LLMs.
Figure 12
Visualization of the CBR subspace from Llama3-8B-Instruct. Each point represents the projected …
Figure 13
Visualization of the CBR subspace from Qwen3-8B.
Figure 14
Decoding layer-wise performance of ei, ri from activations of Llama3-8B-Instruct using PLS. Performance peaks in the middle layers, while both lower and higher layers show reduced fitness.
Figure 15
Decoding layer-wise performance of ei, ri from activations of Llama3-8B-Instruct using PCA regression.
Figure 16
Decoding layer-wise performance of ei, ri from activations of Qwen3-8B using PLS.
Figure 17
Decoding layer-wise performance of ei, ri from activations of Qwen3-8B using PCA regression.
Figure 18
Decoding performance of ei, ri from activations of Llama3-8B-Instruct under different monotonic index sequences. "original" denotes the original integer index.
Figure 19
Decoding performance of different monotonic …
Figure 20
Cosine similarity heatmaps of attribute representations projected into the CBR subspace (above) and a …
Figure 21
Cosine similarity heatmaps of attribute representations from Qwen3-8B on …
Figure 22
Cosine similarity heatmaps of attribute representations from Qwen3-8B on …
Figure 23
Cosine similarity heatmaps of attribute representations from Llama3-8B-Instruct on …
Figure 24
Cosine similarity heatmaps of attribute representations from Llama3-8B-Instruct on …
Figure 25
Each cell shows the R² fitness score obtained from Qwen3-8B. The projection matrix learned from one context (column) is used to predict the index information of another context (row). Higher values indicate better cross-context generality.
Figure 26
Performance comparison of the translation vector …
Figure 27
Performance comparison from Qwen3-8B.
Figure 28
Visualization of the CBR subspace before and after …
Figure 29
Visualization of the CBR subspace before and after …
Figure 30
Visualization of the CBR subspace before and after …
Figure 31
Visualization of the CBR subspace before and after …
Figure 32
Visualization of the CBR subspace before and after relation …
Figure 33
Visualization of the CBR subspace before and after relation …
Figure 34
Visualization of the CBR subspace before and after relation …
Figure 35
Visualization of the CBR subspace before and after relation …
Figure 36
Visualization of the CBR subspace before and after …
Figure 37
Visualization of the CBR subspace before and after …
Figure 38
Visualization of the CBR subspace before and after …
Figure 39
Visualization of the CBR subspace before and after …
Figure 40
Visualization of the CBR subspace before and after …
Figure 41
Visualization of the CBR subspace before and after relation …
Figure 42
Visualization of the CBR subspace before and after relation …
Figure 43
Visualization of the CBR subspace before and after relation …
Figure 44
Visualization of the CBR subspace before and after relation …
Figure 45
Visualization of the CBR subspace before and after relation …
Figure 46
Cross-permutation R² scores for index prediction on Llama3-8B-Instruct.
Figure 47
Cross-permutation R² scores for index prediction on Qwen3-8B. We evaluate whether projection matrices learned under permuted conditions generalize to the original setting. Specifically, we use CBR projection matrices learned from the ablated and shuffled datasets to …
Figure 48
Decoding performance of ei, ri from activations of Llama3-8B-Instruct on the # separation=1 setting.
Figure 49
Decoding performance of ei, ri from activations of Llama3-8B-Instruct on the # separation=2 setting.
Figure 50
Decoding performance of ei, ri from activations of Llama3-8B-Instruct on the # separation=3 setting.
Figure 51
Decoding performance of ei, ri from activations of Qwen3-8B on the # separation=1 setting.
Figure 52
Decoding performance of ei, ri from activations of Qwen3-8B on the # separation=2 setting.
Figure 53
Decoding performance of ei, ri from activations of Qwen3-8B on the # separation=3 setting.
Figure 54
Cross-permutation (separation) R² scores for index prediction on Llama3-8B-Instruct.
Figure 55
Cross-permutation (separation) R² scores for index prediction on Qwen3-8B.
Figure 56
Decoding performance of ei, ri from activations of Llama3-8B-Instruct on Ccity.
Figure 57
Decoding performance of ei, ri from activations of Llama3-8B-Instruct on Ccountry.
Figure 58
Decoding performance of ei, ri from activations of Qwen3-8B on Ccity.
Figure 59
Decoding performance of ei, ri from activations of Qwen3-8B on Ccountry.
Figure 60
Cross-pattern R² scores for index prediction from Llama3-8B-Instruct on Ccity.
Figure 61
Cross-pattern R² scores for index prediction from Llama3-8B-Instruct on Ccountry.
Figure 62
Cross-pattern R² scores for index prediction from Qwen3-8B on Ccity.
Figure 63
Cross-pattern R² scores for index prediction from Qwen3-8B on Ccountry.
Figure 64
Logit landscape of attribute predictions in the CBR subspace across contexts on Llama3-8B-Instruct.
Figure 65
Logit landscape of attribute predictions in the CBR subspace across contexts on Qwen3-8B.
Figure 66
ei (red arrow) and ri (black arrow) directions on Ccity.
Figure 67
Logit score curves on Ccity for Llama3-8B-Instruct, showing how the logit score varies along the learned entity-index (ei) and relation-index (ri) directions illustrated in Figures 66a and 66b.
Figure 68
Logit score curves on Ccity for Qwen3-8B. (a) Logit score curves along the ei direction (ri = 4). (b) Logit score curves along the ri direction (ei = 3). (c) ei (red arrow) and ri (black arrow) directions.
Figure 69
Logit score curves on Crelation for Qwen3-8B.
Figure 70
Effect of perturbing activations along the CBR subspace versus a random subspace on Qwen3-8B. …
Figure 71
Visualization of the CBR subspace under perturbations along CBR directions using …
Figure 72
Causal intervention on the CBR subspace reveals the CBR-subspace-based mechanism. Steering different …
Figure 73
Entity-index Steering. Context: "In a bustling market, the table [ei:1] stood out, manufactured [r:1] in Australia [ei:1 ri:1] and designed [r:2] in Italy [ei:1 ri:2]. It was proudly exported [r:3] to Germany [ei:1 ri:3], yet unfortunately banned [r:4] in Mexico [ei:1 ri:4]. Nearby, a colorful brush [ei:2] caught the eye, elegantly designed [r:2] in France [ei:2 ri:2], is crafted [r:1] in China [ei:2 …"
Figure 74
Last-token Steering.
Figure 75
Activation patching via Entity-index (i.e., ei) …
Figure 76
Activation patching on the activation of the last token …
Figure 77
Activation patching on the corresponding token in the query part …
Figure 78
Activation patching on the activation of the last token …
Figure 79
Activation patching via Relation-index (i.e., ri) …
Figure 80
Activation patching via Relation-index (i.e., ri) …
Figure 81
Visualization of the CBR subspace for Table Template Input on Llama3-8B-Instruct.
Figure 82
Visualization of the CBR subspace for Table Template Input on Qwen3-8B.
Figure 83
Visualization of the CBR subspace for Discourse Template Input on Llama3-8B-Instruct.
Figure 84
Visualization of the CBR subspace for Discourse Template Input on Qwen3-8B.
Figure 85
Activation patching on Table Template Input in the query part across five contexts on Llama3-8B-Instruct. (a) Relation-index (i.e., ri) steering on the attribute token.
Figure 86
Activation patching on Table Template Input in the query part across five contexts on Qwen3-8B. (a) Relation-index (i.e., ri) steering on the attribute token.
Figure 87
Activation patching on Discourse Template Input in the query part across five contexts on Llama3-8B-Instruct.
Figure 88
Activation patching on Table Template Input in the query part across five contexts on Qwen3-8B.
Figure 89
Relationship between contextual similarity and CBR decoding performance …
Figure 90
Visualization of the CBR-related heads on Llama3-8B-Instruct. Each cell shows the normalized logit …
Figure 91
Visualization of the CBR-related heads on Qwen3-8B.
Figure 92
CBR-related head knockout on Llama3-8B-Instruct, where heads are ablated in descending order of …
Figure 93
CBR-related head knockout on Qwen3-8B.
read the original abstract

Understanding a discourse requires tracking entities and the relations that hold between them. While Large Language Models (LLMs) perform well on relational reasoning, the mechanism by which they bind entities, relations, and attributes remains unclear. We study discourse-level relational binding and show that LLMs encode it via a Cell-based Binding Representation (CBR): a low-dimensional linear subspace in which each "cell" corresponds to an entity-relation index pair, and bound attributes are retrieved from the corresponding cell during inference. Using controlled multi-sentence data annotated with entity and relation indices, we identify the CBR subspace by decoding these indices from attribute-token activations with Partial Least Squares regression. Across domains and two model families, the indices are linearly decodable and form a grid-like geometry in the projected space. We further find that context-specific CBR representations are related by translation vectors in activation space, enabling cross-context transfer. Finally, activation patching shows that manipulating this subspace systematically changes relational predictions and that perturbing it disrupts performance, providing causal evidence that LLMs rely on CBR for relational binding.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that LLMs encode discourse-level relational binding via a Cell-based Binding Representation (CBR): a low-dimensional linear subspace in which each cell corresponds to an entity-relation index pair and bound attributes are retrieved from the corresponding cell. Using controlled multi-sentence discourses annotated with entity and relation indices, the authors identify the CBR subspace via Partial Least Squares regression on attribute-token activations. They report that the indices are linearly decodable and exhibit grid-like geometry in the projected space, that context-specific CBR representations are related by translation vectors, and that activation patching of the subspace systematically alters relational predictions, providing causal evidence that LLMs rely on CBR for relational binding. Results are shown across domains and two model families.

Significance. If the central claim holds, the work provides a concrete mechanistic hypothesis for relational binding in LLMs, moving beyond generic linear probes to a structured, cell-based representation that supports both decoding and causal intervention. Strengths include the use of controlled annotated data to isolate the mechanism, the geometric analysis revealing grid structure and translation vectors for cross-context transfer, and the activation patching experiments that supply causal evidence within the tested setting. These elements together offer converging empirical support that could inform interpretability methods and targeted improvements to discourse reasoning in language models.

major comments (2)
  1. [§5] §5 (activation patching): The patching interventions that zero or shift the identified subspace and alter relational outputs are performed exclusively inside the controlled, annotated data distribution used to locate the CBR via PLS. This leaves open whether the subspace functions as the binding mechanism on ordinary unannotated text whose entity-relation structure is implicit rather than explicitly constructed to match the probe labeling scheme.
  2. [§4] §4 (subspace identification and geometry): Linear decodability and grid geometry are demonstrated on data whose entity-relation indices were explicitly annotated and used as regression targets. Without controls that permute the index labels or evaluate the same subspace on natural discourses lacking such annotations, it remains possible that the observed structure is an artifact of the labeling procedure rather than the representation actually used for binding during inference.
minor comments (2)
  1. [Methods] The exact method for choosing subspace dimensionality (listed as a free parameter) and any accompanying ablation or cross-validation results should be reported in the methods section so readers can assess sensitivity; a sketch of such a sweep appears after this list.
  2. [Figure 2 / §4.1] Figure captions and the geometry analysis section would benefit from quantitative metrics (e.g., grid regularity scores or nearest-neighbor consistency) in addition to the visual projections.
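
For concreteness, the sensitivity analysis requested in minor comment 1 can be as simple as a cross-validated sweep over the component count, reusing the hypothetical acts and Y arrays from the decoding sketch above:

    # Cross-validated R^2 as a function of PLS component count (sketch only).
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_score

    for k in range(1, 11):
        r2 = cross_val_score(PLSRegression(n_components=k), acts, Y, cv=5)
        print(f"{k:2d} components: R^2 = {r2.mean():.3f} +/- {r2.std():.3f}")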

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help sharpen the scope of our claims about Cell-Based Binding Representations. We address each major point below, offering the strongest honest defense of the current evidence while noting where revisions can strengthen the manuscript.

read point-by-point responses
  1. Referee: [§5] §5 (activation patching): The patching interventions that zero or shift the identified subspace and alter relational outputs are performed exclusively inside the controlled, annotated data distribution used to locate the CBR via PLS. This leaves open whether the subspace functions as the binding mechanism on ordinary unannotated text whose entity-relation structure is implicit rather than explicitly constructed to match the probe labeling scheme.

    Authors: We agree that the activation patching experiments are performed exclusively on the controlled, annotated discourses. This design is required to enable precise targeting of specific entity-relation index pairs and to isolate the effect of the subspace from other factors. The systematic changes in relational predictions upon zeroing or shifting the subspace supply causal evidence that the identified CBR is used for binding within these discourses. Extending the same interventions to ordinary unannotated text would require new methods to locate the relevant cells without explicit labels and is left for future work. revision: no

  2. Referee: [§4] §4 (subspace identification and geometry): Linear decodability and grid geometry are demonstrated on data whose entity-relation indices were explicitly annotated and used as regression targets. Without controls that permute the index labels or evaluate the same subspace on natural discourses lacking such annotations, it remains possible that the observed structure is an artifact of the labeling procedure rather than the representation actually used for binding during inference.

    Authors: The annotations are used only as regression targets to identify the subspace; the grid geometry and translation vectors are emergent properties of the model's activations. Their consistency across domains and model families, together with the functional role shown by patching, makes a pure labeling artifact unlikely. To address the concern directly, we will add label-permutation controls in the revision to confirm that the observed structure and decodability do not arise from the specific annotation scheme. revision: partial
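
The promised control is cheap to run; here is a sketch under the same hypothetical acts and Y names used earlier, where a near-zero permuted score would support the authors' reading:

    # Label-permutation control: refit the decoder on shuffled (ei, ri) labels
    # and compare held-out fit against the true labels. Sketch only.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    Y_perm = rng.permutation(Y)                   # shuffle labels across tokens
    true_r2 = cross_val_score(PLSRegression(n_components=2), acts, Y, cv=5).mean()
    perm_r2 = cross_val_score(PLSRegression(n_components=2), acts, Y_perm, cv=5).mean()
    print(f"true labels R^2: {true_r2:.3f}   permuted labels R^2: {perm_r2:.3f}")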

Circularity Check

0 steps flagged

No significant circularity: empirical measurements and interventions are independent of fitted inputs

full rationale

The paper identifies the proposed CBR subspace via PLS regression on attribute-token activations from specially constructed annotated discourses, then reports linear decodability, grid geometry, translation vectors, and causal effects from activation patching. None of these steps reduces, via the paper's own equations or self-citations, to a fitted quantity renamed as a prediction; the central claim rests on measured decodability and intervention outcomes rather than on any self-definitional loop or load-bearing self-citation. The work is self-contained against external benchmarks and contains no instances of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 1 invented entity

The central claim rests on the interpretation that linear decodability plus patching effects reveal the model's actual binding mechanism, plus the assumption that the specially annotated multi-sentence data elicits the same representations used in natural discourse.

free parameters (1)
  • subspace dimensionality
    The low-dimensional projection used to define the CBR cells is selected via PLS; the exact number of components is a modeling choice that affects the grid geometry observed.
axioms (2)
  • domain assumption Internal activations of LLMs contain information about entity and relation indices that is linearly extractable
    Invoked when using PLS regression to identify the CBR subspace from attribute-token activations.
  • domain assumption The controlled multi-sentence texts with explicit entity-relation indices elicit the same binding mechanisms used in natural language
    Required to generalize the identified CBR from the experimental data to general discourse processing.
invented entities (1)
  • Cell-based Binding Representation (CBR) no independent evidence
    purpose: A low-dimensional linear subspace whose cells store and retrieve bound attributes for specific entity-relation pairs
    Introduced to explain the observed linear decodability, grid geometry, and patching effects; no independent falsifiable prediction (e.g., predicted activation pattern in a new model) is provided outside the current experiments.

pith-pipeline@v0.9.0 · 5479 in / 1632 out tokens · 54118 ms · 2026-05-10T02:16:49.476337+00:00 · methodology

discussion (0)

