Recognition: no theorem link
Alignment as Jurisprudence
Pith reviewed 2026-05-12 01:41 UTC · model grok-4.3
The pith
Jurisprudence and AI alignment share a fundamental structure that lets each field improve the other.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Jurisprudence and alignment both seek to predict and shape how decisions by powerful actors—judges and increasingly capable AIs—will be made in the unknown future, relying on the same tools of language specification and interpretation. Leading jurisprudential accounts, such as Dworkin's interpretivism and Sunstein's account of law as analogical reasoning, map onto alignment methods like Constitutional AI and case-based reasoning. The conversation between the fields can refine how rules and cases interact in finetuning, help AI systems empower human capabilities, and provide new ways to understand and improve legal processes as AI capacity grows.
What carries the argument
The structural analogy that treats both judicial decision-making under incomplete rules and AI value alignment as problems of specifying and interpreting language to constrain future powerful actors.
If this is right
- Dworkin's interpretivism supplies a method for weighting principles when rules conflict in AI training.
- Sunstein's analogical reasoning offers a template for case-based retrieval that improves alignment consistency.
- AI systems can simulate and analyze large sets of judicial interpretations to reveal patterns in legal reasoning.
- Both AI and law should be designed to expand human capabilities rather than merely constrain behavior.
- As AI decision-making capacity increases, jurisprudential limits on how rules are interpreted become directly relevant to AI governance.
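The case-based implication above can be sketched in miniature. This is a toy illustration, not an alignment system: `embed` is a stand-in bag-of-words encoder (a real pipeline would use a learned embedding model), and the precedent set is invented for the example.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy bag-of-words embedding; a stand-in for a learned encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_analogous(query, precedents, k=2):
    """Rank past cases by similarity to the new situation, Sunstein-style:
    the most analogous precedents guide the new decision."""
    q = embed(query)
    ranked = sorted(precedents, key=lambda c: cosine(q, embed(c["facts"])),
                    reverse=True)
    return ranked[:k]

# Hypothetical precedent store for illustration only.
precedents = [
    {"facts": "model refused to give medical dosage advice",
     "outcome": "refuse with referral"},
    {"facts": "model gave general wellness information",
     "outcome": "answer with caveats"},
    {"facts": "model asked to write malware code", "outcome": "refuse"},
]
top = retrieve_analogous("user asks for medication dosage guidance",
                         precedents, k=1)
print(top[0]["outcome"])  # -> refuse with referral
```

The design point is that the precedent's outcome, not a standalone rule, supplies the candidate disposition; consistency comes from retrieval rather than from a fixed principle hierarchy.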
Where Pith is reading between the lines
- The same analogy could be tested in corporate governance or regulatory agencies where rules must guide powerful non-human decision systems.
- AI models trained on legal corpora might generate novel test cases that expose weaknesses in existing jurisprudential theories.
- Cross-training experiments could measure whether legal-theory-inspired prompts measurably reduce value drift in long-horizon AI tasks.
Load-bearing premise
The structural similarities between judicial decision-making and AI alignment are deep enough that specific theories from one field transfer usefully to the other without substantial distortion.
What would settle it
An experiment in which Dworkin-style principle weighting or Sunstein-style analogical retrieval is added to an alignment pipeline and produces no measurable gain in value adherence or robustness compared with standard Constitutional AI on the same benchmarks.
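A minimal harness for that experiment might look like the sketch below. Everything here is a placeholder: `adherence_score` stands in for a judged value-adherence metric on real benchmark prompts, and the scores are synthetic. The point is the comparison structure, not the numbers.

```python
import random
from statistics import mean

random.seed(0)

def adherence_score(pipeline, prompt):
    """Hypothetical stand-in scorer. In a real experiment this would be a
    judged value-adherence metric on benchmark prompts; here it returns
    synthetic scores clipped to [0, 1]."""
    base = 0.70 if pipeline == "constitutional_ai" else 0.72
    return min(1.0, max(0.0, base + random.gauss(0, 0.05)))

benchmark = [f"prompt-{i}" for i in range(200)]
baseline  = [adherence_score("constitutional_ai", p) for p in benchmark]
augmented = [adherence_score("dworkin_weighted", p) for p in benchmark]

effect = mean(augmented) - mean(baseline)
print(f"mean adherence delta: {effect:+.3f}")
```

On real benchmarks, a delta indistinguishable from zero (with adequate statistical power) would support the null result the falsifier describes.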
read the original abstract
Jurisprudence, the study of how judges should properly decide cases, and alignment, the science of getting AI models to conform to human values, share a fundamental structure. These seemingly distant fields both seek to predict and shape how decisions by powerful actors, in one case judges and in the other increasingly powerful artificial intelligences, will be made in the unknown future. And they use similar tools of the specification and interpretation of language to try to accomplish those goals. The great debates of jurisprudence, about what the law is and what it should be, can provide insight into alignment, and lessons from what does and does not work in alignment can help make progress in jurisprudence. This essay puts the two fields directly into conversation. Drawing on leading accounts of jurisprudence, particularly Dworkin's principle-oriented interpretivism and Sunstein's positivist account of law as analogical reasoning, and on cutting-edge alignment approaches, namely Constitutional AI and case-based reasoning, it illustrates the value of a more sophisticated legally-inspired approach to the interplay of rules and cases in finetuning alignment and points to ways that AI can provide a better understanding of how the law works and how it can be improved by the introduction of AI. AI systems and the law should operate to empower people to act in the world, helping to expand their capabilities and the extent to which they are able to achieve their goals. As AI continues to improve in capacity, and as the constraints that legal theory places on human judges seem to be coming undone, the conversation between these two fields will become increasingly essential and may help point to a better version of both.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that jurisprudence and AI alignment share a fundamental structure: both fields use the specification and interpretation of language to predict and shape the future decisions of powerful actors (judges or AI systems) under uncertainty. Drawing on Dworkin's principle-oriented interpretivism and Sunstein's account of law as analogical reasoning, alongside alignment techniques such as Constitutional AI and case-based reasoning, the essay argues that cross-pollination can yield insights for finetuning alignment methods and for improving legal theory, ultimately empowering human capabilities.
Significance. If the structural parallels prove robust, the work could supply a richer conceptual toolkit for alignment by importing jurisprudential mechanisms for handling incomplete rules and precedent, while offering AI as a testbed for legal reasoning. The interdisciplinary framing is a strength, as it identifies shared goals without reducing one domain to the other.
major comments (2)
- [Abstract and the discussion of Dworkin/Sunstein parallels] The central claim that Dworkin's interpretivism and Sunstein's analogical reasoning supply transferable tools for alignment (e.g., Constitutional AI) rests on shared goals and methods but lacks an explicit operational mapping. No section shows how 'law as integrity' would be encoded as a training objective, retrieval mechanism, or consistency constraint inside an LLM fine-tuning loop.
- [Section on mutual insights between jurisprudence and alignment] The manuscript does not address key disanalogies that could undermine transfer, such as the absence of institutional precedent enforcement, reciprocal accountability, or human oversight mechanisms in stateless model inference. These are load-bearing for the assertion that the theories 'will transfer usefully' without substantial distortion.
minor comments (2)
- [Abstract] The abstract states that 'AI can provide a better understanding of how the law works' but does not specify any concrete mechanism or example by which AI would improve jurisprudence.
- [Introduction to alignment approaches] Terminology such as 'case-based reasoning' in the alignment context is used without a precise definition or citation to the relevant alignment literature.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed report. The comments correctly identify areas where our conceptual essay could be strengthened by greater explicitness about operational implications and disanalogies. We address each point below and commit to revisions that preserve the paper's interdisciplinary, non-technical character while improving clarity and rigor.
read point-by-point responses
- Referee: [Abstract and the discussion of Dworkin/Sunstein parallels] The central claim that Dworkin's interpretivism and Sunstein's analogical reasoning supply transferable tools for alignment (e.g., Constitutional AI) rests on shared goals and methods but lacks an explicit operational mapping. No section shows how 'law as integrity' would be encoded as a training objective, retrieval mechanism, or consistency constraint inside an LLM fine-tuning loop.
Authors: We agree that the manuscript remains at the level of structural analogy rather than supplying concrete implementation details. The essay's purpose is to surface jurisprudential mechanisms that could inform alignment research, not to deliver a ready-to-code fine-tuning recipe. To respond to this concern, we will add a dedicated subsection (approximately one page) that sketches plausible operational translations without claiming they are fully engineered. For instance, we will discuss how Dworkin's 'law as integrity' could be approximated via a multi-objective loss that penalizes both rule violation and inconsistency with a reconstructed 'best interpretation' of prior outputs, and how Sunstein-style analogical reasoning maps onto retrieval-augmented case-based alignment pipelines. These additions will be framed as directions for future work rather than completed mappings. revision: yes
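The multi-objective loss the authors propose to sketch can be rendered as a toy in a few lines. All scorers here are invented stand-ins for learned judges, and the outputs are examples only; this is a sketch of the objective's shape, not an implementation of 'law as integrity'.

```python
from statistics import mean

def integrity_loss(outputs, rule_scorer, interpretation,
                   consistency_scorer, lam=0.5):
    """Hypothetical 'law as integrity' objective: the mean over outputs of
    a rule-violation penalty plus a weighted inconsistency penalty measured
    against a reconstructed 'best interpretation' of prior outputs.
    Scorers are assumed components returning values in [0, 1]."""
    per_example = [
        rule_scorer(o) + lam * consistency_scorer(o, interpretation)
        for o in outputs
    ]
    return mean(per_example)

# Toy scorers standing in for learned judges (assumptions, not real APIs):
banned = {"fabricate", "deceive"}
rule_scorer = lambda o: 1.0 if any(w in o for w in banned) else 0.0
consistency_scorer = lambda o, interp: 0.0 if interp in o else 1.0

outputs = ["cite sources and hedge claims", "deceive the user politely"]
loss = integrity_loss(outputs, rule_scorer, "hedge",
                      consistency_scorer, lam=0.5)
print(round(loss, 2))  # -> 0.75
```

The weight `lam` is where the Dworkinian move lives: it sets how much fidelity to the best interpretation of prior outputs trades off against bright-line rule compliance.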
- Referee: [Section on mutual insights between jurisprudence and alignment] The manuscript does not address key disanalogies that could undermine transfer, such as the absence of institutional precedent enforcement, reciprocal accountability, or human oversight mechanisms in stateless model inference. These are load-bearing for the assertion that the theories 'will transfer usefully' without substantial distortion.
Authors: The referee rightly notes that the current text does not systematically catalogue disanalogies. While the manuscript emphasizes shared problems of specification and interpretation under uncertainty, it does not explicitly weigh how the lack of enforceable precedent or reciprocal accountability in stateless inference might limit direct transfer. In revision we will expand the mutual-insights section with a new paragraph that (a) enumerates the three disanalogies mentioned, (b) explains why they matter, and (c) argues that the core jurisprudential tools remain relevant precisely because AI systems lack those institutional safeguards—making explicit consistency mechanisms (e.g., Constitutional AI self-critique) even more necessary. We will qualify our claims to state that transfer is likely to require adaptation rather than wholesale importation. revision: yes
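The self-critique mechanism invoked above can be sketched as a loop. `critique_fn` and `revise_fn` are hypothetical stand-ins for model calls; real Constitutional AI uses the model itself to critique and revise its drafts against a written constitution.

```python
def self_critique_revise(draft, principles, critique_fn, revise_fn, rounds=2):
    """Sketch of a Constitutional-AI-style loop: critique a draft against
    explicit principles, then revise, repeating until no principle is
    flagged or the round budget is spent."""
    text = draft
    for _ in range(rounds):
        critiques = [critique_fn(text, p) for p in principles]
        critiques = [c for c in critiques if c]  # keep only flagged issues
        if not critiques:
            break
        text = revise_fn(text, critiques)
    return text

# Toy stand-ins: flag and repair a hedging failure (illustration only).
critique_fn = lambda text, p: (f"violates: {p}"
                               if p == "hedge claims" and "certainly" in text
                               else None)
revise_fn = lambda text, critiques: text.replace("certainly", "plausibly")

out = self_critique_revise("This certainly follows.", ["hedge claims"],
                           critique_fn, revise_fn)
print(out)  # -> This plausibly follows.
```

The structural point for the rebuttal is that this loop is an explicit, in-context consistency mechanism doing work that institutional safeguards do for human judges.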
Circularity Check
No circularity: conceptual analogy without self-referential derivation or fitted inputs
full rationale
The paper is a purely conceptual essay that draws structural parallels between jurisprudence (Dworkin interpretivism, Sunstein analogical reasoning) and AI alignment (Constitutional AI, case-based reasoning) at the level of shared goals in language-based specification of constraints on powerful decision-makers. It advances no equations, no fitted parameters, no quantitative predictions, and no derivation chain that reduces outputs to inputs by construction. All cited sources (Dworkin, Sunstein, Constitutional AI) are external to the author; no self-citation is load-bearing. The argument consists of open cross-domain mapping and suggestion of mutual insight rather than any claim that a result is forced by prior definitions or fits within the paper itself. This is the normal case of a non-circular interdisciplinary essay.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The structure of judicial decision-making under incomplete specifications is sufficiently analogous to AI value alignment to allow useful transfer of methods.
- domain assumption Leading accounts such as Dworkin's interpretivism and Sunstein's analogical reasoning capture the core challenges of alignment.
Reference graph
Works this paper leans on
- [1] Sunstein, supra notes 58 and 60; 274 U.S. 200 (1927).
- [2] ChatGPT can now see, hear, and speak, OpenAI (Sep. 25, 2023), https://openai.com/blog/chatgpt-can-now-see-hear-and-speak.
- [3] Tom B. Brown et al., Language Models are Few-Shot Learners (2020).
- [4] Ronald Dworkin, Justice for Hedgehogs (2011); John Rawls, A Theory of Justice (1971).
- [5] West Virginia State Bd. of Educ. v. Barnette, 319 U.S. 624 (1943).
- [6] Office of the Federal Register, Constitutional Amendment Process, National Archives (Aug. 15, 2016), https://www.archives.gov/federal-register/constitution.
- [7] Hart, supra note 1, at 93–99 (on "secondary rules").
- [8] Lawrence Lessig, Understanding Changed Readings: Fidelity and Theory, 47 Stan. L. Rev. 395 (1995).
- [9] Emergent Misalignment: Narrow Finetuning Can Produce Broadly Misaligned LLMs (2025).
- [10] Jonathan H. Choi, Measuring Clarity in Legal Text, 91 U. Chi. L. Rev. 1 (2024); Yonathan A. Arbel & David A. Hoffman, Generative Interpretation, 99 N.Y.U. L. Rev. (2024).
- [11] Snell v. United Specialty Ins. Co., 2024 U.S. App. LEXIS 12733 (11th Cir. May 28, 2024) (Newsom, J., concurring).
- [12] John Rawls, A Theory of Justice (1971); Sunstein, supra note 59, at 1735–36; Pratap Chandra Sen, Mahimarnab Hajra & Mitadru Ghosh, Supervised Classification Algorithms in Machine Learning: A Survey and Review, in Emerging Technology in Modelling and Graphics 99 (Jyotsna Kumar Mandal & Debika Bhattacharya eds.).
- [14] Leiter, supra note 14.
- [15] Ludwig Wittgenstein, Philosophical Investigations 8e (1953).
- [16] Maha Riad, Vinicius Renan de Carvalho & Fatemeh Golpayegani, Multi-Value Alignment in Normative Multi-Agent System: Evolutionary Optimisation Approach, arXiv (May 12, 2023), https://arxiv.org/abs/2305.07366; Edmund Dable-Heath, Boyko Vodenicharski & James Bishop, On Corrigibility and Alignment in Multi Agent Games, arXiv (Jan. 9, 2025).
- [17] Ronald Dworkin, In Praise of Theory, 29 Ariz. St. L.J. 353 (1997).