pith. machine review for the scientific record.

arxiv: 2604.17698 · v2 · submitted 2026-04-20 · 💻 cs.LG · cs.CL · stat.ML

Recognition: unknown

The Geometric Canary: Predicting Steerability and Detecting Drift via Representational Stability

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 05:03 UTC · model grok-4.3

classification 💻 cs.LG · cs.CL · stat.ML
keywords geometric stability · Shesha · steerability · representational drift · pairwise distances · language models · embedding models · NLP tasks

The pith

Task-aligned geometric stability predicts language model steerability with high accuracy, while unsupervised stability detects drift more sensitively than prior methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that consistency in the pairwise distance structure of representations provides a shared geometric basis for forecasting whether a model will accept behavioral steering and for spotting when its internal structure begins to degrade. Supervised variants of the stability measure, aligned to the target task, achieve correlations of 0.89 to 0.97 with actual linear steerability across dozens of embedding models on multiple NLP tasks and explain variance beyond simple class separability. Unsupervised variants show almost no relation to steerability yet detect nearly twice the geometric change of CKA during alignment, issue earlier warnings in most cases, and produce far fewer false alarms than Procrustes. These two forms therefore function as complementary tools across the model deployment cycle.

Core claim

Supervised Shesha variants that quantify task-aligned geometric stability through pairwise distance consistency predict linear steerability with correlations of 0.89-0.97 across 35-69 models and three NLP tasks while capturing unique variance beyond class separability (partial correlations 0.62-0.76). Unsupervised stability fails for steering prediction (correlation near 0.10) but measures up to 5.23 times greater geometric change than CKA during post-training alignment, provides earlier warnings in 73 percent of models, and maintains a 6 times lower false-alarm rate than Procrustes.
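The CKA baseline invoked in this claim is a standard representational-similarity measure. As a reference point (assuming the common linear variant, since the material above does not say which CKA the paper uses), it can be computed as:

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between two representations of the same n inputs.

    X: (n_samples, d1), Y: (n_samples, d2). This is the standard
    linear formulation; the paper's exact CKA variant is not stated
    in the material above.
    """
    X = X - X.mean(axis=0)  # center each feature
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return float(hsic / (norm_x * norm_y))
```

Identical (or isotropically rescaled) representations score 1, so a drift score can be reported as 1 − CKA; how the paper normalizes its Shesha-to-CKA ratios is not given above.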

What carries the argument

Shesha, a family of metrics that quantify geometric stability as the consistency of a representation's pairwise distance structure, with supervised variants that incorporate task alignment and unsupervised variants that do not.
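Figure 1's description of the unsupervised variant (SheshaFS) — split the embedding dimensions into disjoint halves, build a representational dissimilarity matrix (RDM) from each, and rank-correlate the two — is concrete enough to sketch. Treat this as an illustration of that mechanism, not the paper's exact formula:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def split_half_stability(emb: np.ndarray, seed: int = 0) -> float:
    """Unsupervised split-half stability, following the SheshaFS
    mechanism described in Figure 1 (illustrative sketch only).

    emb: (n_samples, n_features) embedding matrix. Returns the
    Spearman rank correlation of the two half-space RDMs; high
    values mean pairwise distance structure is redundantly
    encoded across features.
    """
    rng = np.random.default_rng(seed)
    d = emb.shape[1]
    perm = rng.permutation(d)
    half_a, half_b = perm[: d // 2], perm[d // 2:]
    # One condensed pairwise-distance vector (RDM) per feature half.
    rdm_a = pdist(emb[:, half_a])
    rdm_b = pdist(emb[:, half_b])
    rho, _ = spearmanr(rdm_a, rdm_b)
    return float(rho)
```

No labels are required, which is exactly why this variant can run as a post-deployment monitor.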

If this is right

  • Pre-deployment screening can select steerable models using supervised stability scores without running full control experiments.
  • Post-deployment monitoring can flag internal degradation using unsupervised stability for earlier and cleaner alerts than existing similarity measures.
  • Task alignment is required for stability to forecast controllability but is unnecessary for drift detection.
  • The two stability forms together supply a unified geometric workflow spanning the entire LLM deployment lifecycle.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same distance-consistency approach could be tested on non-text modalities or on non-linear steering methods to check whether the dissociation persists.
  • Layer-wise or architecture-specific application of Shesha might localize where controllability and drift sensitivity originate within a model.
  • If distance consistency proves causal, it could guide training objectives that directly optimize for both steerability and long-term stability.

Load-bearing premise

The observed split between supervised and unsupervised performance arises because pairwise distance consistency is the load-bearing geometric property for both steerability and drift.

What would settle it

A collection of new embedding models or tasks in which supervised Shesha stability shows low or zero correlation with measured steerability, or in which unsupervised stability fails to outperform CKA on drift detection sensitivity or timing.

Figures

Figures reproduced from arXiv: 2604.17698 by Prashant C. Raju.

Figure 1. Geometric stability as a deployment diagnostic: mechanism and lifecycle. (a) Unsupervised Shesha (SheshaFS) splits embedding dimensions into disjoint halves, computes a representational dissimilarity matrix (RDM) from each half, and measures their rank correlation. High values indicate that pairwise distance structure is redundantly encoded across features. No labels are required. (b) Supervised Shesha (Sh…
Figure 2. Supervised geometric stability predicts linear steerability across all settings. (a–c) Scatter plots of supervised Shesha (computed on held-out Set A) versus steering effectiveness (max accuracy drop, evaluated on disjoint Set B) for each model, averaged across 15 random seeds. (a) Synthetic sentiment (69 models, ρ = 0.894, p < 10⁻²⁴). (b) SST-2 binary sentiment (35 models, ρ = 0.962, p < 10⁻²⁰). (c) MNLI …
Figure 3. Unsupervised Shesha detects drift earlier than CKA while avoiding Procrustes' false alarms. (a) Post-training geometric drift between 23 base/instruct model pairs spanning 11 families (0.14B–7B parameters), averaged across four prompt types. Shesha detects 1.96× greater drift than CKA on average, with family-specific ratios ranging from 1.1× (BLOOM) to 5.2× (Llama), indicating distributed geometric reorgan…
read the original abstract

Reliable deployment of language models requires two capabilities that appear distinct but share a common geometric foundation: predicting whether a model will accept targeted behavioral control, and detecting when its internal structure degrades. We show that geometric stability, the consistency of a representation's pairwise distance structure, addresses both. Supervised Shesha variants that measure task-aligned geometric stability predict linear steerability with near-perfect accuracy ($\rho = 0.89$-$0.97$) across 35-69 embedding models and three NLP tasks, capturing unique variance beyond class separability (partial $\rho = 0.62$-$0.76$). A critical dissociation emerges: unsupervised stability fails entirely for steering on real-world tasks ($\rho \approx 0.10$), revealing that task alignment is essential for controllability prediction. However, unsupervised stability excels at drift detection, measuring nearly $2\times$ greater geometric change than CKA during post-training alignment (up to $5.23\times$ in Llama) while providing earlier warning in 73\% of models and maintaining a $6\times$ lower false alarm rate than Procrustes. Together, supervised and unsupervised stability form complementary diagnostics for the LLM deployment lifecycle: one for pre-deployment controllability assessment, the other for post-deployment monitoring.
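"Linear steerability" is not operationalized in the material above — the referee report below flags exactly this gap. Purely as an illustration, one common linear intervention (a mean-difference steering vector fit on labeled examples, with effectiveness scored as accuracy drop on held-out data, the quantity on Figure 2's axis) looks like:

```python
import numpy as np

def steering_accuracy_drop(train_emb, train_y, test_emb, test_y,
                           alpha: float = 1.0) -> float:
    """Hedged sketch of one linear-steering operationalization;
    the paper's actual intervention is not specified above.

    Fit class centroids on labeled training embeddings, push
    held-out embeddings along the class-difference direction,
    and report the resulting accuracy drop under a
    nearest-centroid classifier.
    """
    mu0 = train_emb[train_y == 0].mean(axis=0)
    mu1 = train_emb[train_y == 1].mean(axis=0)
    direction = mu1 - mu0  # steering vector from labeled examples

    def centroid_acc(emb, y):
        d0 = np.linalg.norm(emb - mu0, axis=1)
        d1 = np.linalg.norm(emb - mu1, axis=1)
        return float(((d1 < d0).astype(int) == y).mean())

    # Steer every held-out example toward class 0 and re-evaluate.
    steered = test_emb - alpha * direction
    return centroid_acc(test_emb, test_y) - centroid_acc(steered, test_y)
```

A large drop means the intervention moved representations across the decision boundary, i.e., the model accepted linear behavioral control.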

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces geometric stability—defined as consistency in a representation's pairwise distance structure—as a foundation for two capabilities in language model deployment: predicting linear steerability (via task-aligned supervised Shesha variants) and detecting post-training representational drift (via unsupervised Shesha). It reports near-perfect correlations (ρ = 0.89-0.97) between supervised Shesha and linear steerability across 35-69 embedding models and three NLP tasks, with partial correlations (0.62-0.76) after controlling for class separability; unsupervised variants show near-zero correlation (ρ ≈ 0.10) with steerability but detect nearly 2× greater geometric change than CKA (up to 5.23× in Llama) with earlier warnings in 73% of models and 6× lower false alarm rate than Procrustes.

Significance. If the central claims hold after methodological clarification, the work offers a practical geometric diagnostic that unifies pre-deployment controllability assessment with post-deployment drift monitoring. The reported dissociation between supervised and unsupervised variants, plus the partial correlations beyond class separability, would strengthen the case that task alignment is essential for steerability prediction while unsupervised stability provides a sensitive, low-false-positive drift signal. Strengths include the multi-model, multi-task scope and direct comparisons to established baselines (CKA, Procrustes).

major comments (2)
  1. [Abstract] The headline dissociation (supervised Shesha ρ=0.89-0.97 vs. unsupervised ρ≈0.10 for linear steerability) is load-bearing for the claim that task alignment is essential. However, the manuscript supplies no methodological details on how linear steerability is operationalized (e.g., labeled intervention success, linear probe accuracy, or another proxy) or how it relates to the task labels used to construct supervised Shesha. If the steerability metric incorporates supervision or class structure from the same tasks, the superior performance of supervised variants risks being partly by construction rather than a discovery about geometry.
  2. [Abstract] The partial ρ=0.62-0.76 after controlling for class separability is presented as evidence that Shesha captures unique variance. Without the exact control procedure (e.g., which section or equation defines the partial correlation, what features are regressed out, and whether the steerability proxy itself is independent of the same supervision), it is unclear whether the control fully addresses the circularity risk raised by task-aligned construction of supervised Shesha.
minor comments (1)
  1. [Abstract] 'Shesha' is introduced without a one-sentence definition or reference to its computation; adding a brief parenthetical (e.g., 'Shesha, a measure of pairwise distance consistency') would improve immediate readability for readers unfamiliar with the term.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments on our manuscript. The concerns about clarifying the operationalization of linear steerability and the partial correlation procedure are important for ensuring the claims are robust. We address each point below and will make revisions to improve clarity in the abstract and methods.

read point-by-point responses
  1. Referee: [Abstract] The headline dissociation (supervised Shesha ρ=0.89-0.97 vs. unsupervised ρ≈0.10 for linear steerability) is load-bearing for the claim that task alignment is essential. However, the manuscript supplies no methodological details on how linear steerability is operationalized (e.g., labeled intervention success, linear probe accuracy, or another proxy) or how it relates to the task labels used to construct supervised Shesha. If the steerability metric incorporates supervision or class structure from the same tasks, the superior performance of supervised variants risks being partly by construction rather than a discovery about geometry.

    Authors: We appreciate this observation. Although the full manuscript details the operationalization in Section 3.2—where linear steerability is measured as the improvement in task performance after applying a linear transformation derived from a small number of labeled examples to the representations—the abstract indeed lacks this summary. Importantly, the steerability metric evaluates the effectiveness of the steering intervention on unseen data, whereas supervised Shesha computes geometric stability using pairwise distances within task-specific groups. The near-zero correlation for unsupervised Shesha demonstrates that the result is not tautological. We will revise the abstract to include a short definition of linear steerability. revision: yes

  2. Referee: [Abstract] The partial ρ=0.62-0.76 after controlling for class separability is presented as evidence that Shesha captures unique variance. Without the exact control procedure (e.g., which section or equation defines the partial correlation, what features are regressed out, and whether the steerability proxy itself is independent of the same supervision), it is unclear whether the control fully addresses the circularity risk raised by task-aligned construction of supervised Shesha.

    Authors: The partial correlation analysis is described in Section 4.3, where we use the formula for partial Pearson correlation to remove the effect of class separability (computed as the mean accuracy of a linear probe on the task embeddings) from the relationship between Shesha and steerability. The steerability proxy is the post-intervention accuracy, which is not directly the class separability. This shows Shesha captures additional geometric information relevant to steerability. We will add a reference to this procedure in the revised abstract to make it self-contained. revision: yes
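The control described in this response is a standard partial Pearson correlation: regress both variables on the covariate, then correlate the residuals. A minimal sketch (with `separability` standing in for the linear-probe accuracy the authors mention; the variable names are illustrative, not the paper's):

```python
import numpy as np

def partial_pearson(x: np.ndarray, y: np.ndarray,
                    separability: np.ndarray) -> float:
    """Partial Pearson correlation of x and y controlling for one
    covariate, via residual regression. Mirrors the Section 4.3
    procedure the rebuttal describes; offered as a sketch, since
    the exact equation is not reproduced above.
    """
    # Design matrix with intercept for the controlled covariate.
    design = np.column_stack([np.ones_like(separability), separability])

    def resid(v):
        beta, *_ = np.linalg.lstsq(design, v, rcond=None)
        return v - design @ beta

    rx, ry = resid(x), resid(y)
    return float(rx @ ry / (np.linalg.norm(rx) * np.linalg.norm(ry)))
```

A partial ρ of 0.62-0.76 would then mean Shesha still tracks steerability after the variance explainable by probe accuracy is removed from both sides.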

Circularity Check

0 steps flagged

No significant circularity detected; claims rest on independent empirical correlations

full rationale

The paper reports correlations between task-aligned geometric stability (supervised Shesha) and linear steerability across many models, plus a dissociation with unsupervised stability for drift detection. The abstract and claims contain no self-citations, no equations that reduce a derived quantity to its own fitted inputs by construction, and no uniqueness theorems imported from prior author work. Partial correlations controlling for class separability are presented as evidence of unique variance. Without explicit definitions or equations showing that the steerability proxy is computed from the identical task-aligned distances used in Shesha, the reported results do not reduce to tautology or self-definition. The derivation chain is therefore self-contained against the provided text.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The central claims rest on the assumption that pairwise distance structure is a sufficient summary of representational geometry for both controllability and drift, plus the empirical claim that task alignment is required for the former but not the latter. No free parameters are explicitly named in the abstract, but supervised variants almost certainly involve task-specific tuning.

free parameters (1)
  • task-alignment parameters in Shesha
    Supervised variants are defined to align with target tasks, implying at least one fitted or chosen parameter per task.
axioms (1)
  • domain assumption: Consistency of pairwise distances captures the relevant aspects of representational geometry for steerability and drift
    Invoked when defining geometric stability as the core quantity.
invented entities (1)
  • Shesha stability measure (no independent evidence)
    purpose: Task-aligned and unsupervised geometric stability diagnostics
    Newly introduced family of measures whose exact formulation is not given in the abstract.

pith-pipeline@v0.9.0 · 5526 in / 1545 out tokens · 82608 ms · 2026-05-10T05:03:31.940613+00:00 · methodology

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. From Syntax to Semantics: Geometric Stability as the Missing Axis of Perturbation Biology

    q-bio.QM · 2026-02 · unverdicted · novelty 6.0

    Geometric stability, defined as the directional coherence of cellular responses to perturbation, provides a framework for assessing whether resulting cellular states are stable beyond conventional metrics of intervent...

Reference graph

Works this paper leans on

187 extracted references · 67 canonical work pages · cited by 1 Pith paper
