Aristotelian Virtue Profiling of LLMs through Ethical Dilemmas

Ioannis Tzachristas; John Pavlopoulos

arXiv: 2606.28683 · v1 · pith:UGBNHQ2Enew · submitted 2026-06-27 · 💻 cs.AI · cs.CY · stat.AP · stat.ME

Pith reviewed 2026-06-30 10:16 UTC · model grok-4.3

keywords: VirtueMap · Aristotelian virtues · LLM ethics · ethical dilemmas · virtue profiling · Borda alignment · AI evaluation

The pith

VirtueMap profiles LLMs on Aristotelian virtues by scoring their rankings of dilemma responses against human-validated orderings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces VirtueMap to capture how LLMs navigate ethical tradeoffs by expressing different priorities among Practical Wisdom, Justice, Truthfulness, Courage, and Temperance. For each of seven dilemmas, it asks for full rankings of five responses rather than a single answer, then scores those rankings with normalized Borda alignment against reference orderings that were retained only after 95 percent confirmation from over 100 human respondents. When applied to nine LLM families in repeated runs, the method produces virtue profiles with 90.3 percent mean rank consistency. The largest differences between families appear on Courage, Temperance, and Justice. The work also releases a browser-based site for computing and comparing such profiles locally.

Core claim

VirtueMap is a framework that asks both humans and LLMs to rank all five responses to each of seven ethical dilemmas according to five Aristotelian virtues. Reference orderings are first proposed and then kept as ground truth only when at least 95 percent of more than 100 human respondents confirm them. Rankings are scored via normalized Borda alignment to yield profiles over Practical Wisdom, Justice, Truthfulness, Courage, and Temperance. Application to nine LLM families in repeated-run evaluation shows 90.3 percent mean rank consistency, with the largest differences on Courage, Temperance, and Justice.

What carries the argument

VirtueMap, a ranking framework that scores LLM responses against 95-percent-human-confirmed Aristotelian virtue orderings using normalized Borda alignment.
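
A minimal sketch of how one ranking might be scored against a reference ordering. It assumes Borda weights of 5 (first place) down to 1 (last place) and a raw alignment equal to the sum of the products of reference weight and respondent weight per response. That definition is an assumption, chosen because its extremes for five responses are 55 and 35, the constants in the normalization quoted in the Figure 1 excerpt. The function names are hypothetical and not taken from the paper.

```python
from typing import Sequence


def borda_weights(ranking: Sequence[str]) -> dict[str, int]:
    """Map each response label to a Borda weight: first place gets n, last gets 1."""
    n = len(ranking)
    return {label: n - pos for pos, label in enumerate(ranking)}


def raw_alignment(reference: Sequence[str], ranking: Sequence[str]) -> int:
    """Assumed raw alignment A: sum of reference weight times respondent weight."""
    if sorted(reference) != sorted(ranking):
        raise ValueError("ranking must be a permutation of the reference labels")
    ref_w = borda_weights(reference)
    resp_w = borda_weights(ranking)
    return sum(ref_w[label] * resp_w[label] for label in reference)


def normalized_score(reference: Sequence[str], ranking: Sequence[str]) -> float:
    """Rescale A to 0-100 using its extremes (35 and 55 when n = 5)."""
    n = len(reference)
    a_max = sum(k * k for k in range(1, n + 1))            # identical ranking
    a_min = sum(k * (n + 1 - k) for k in range(1, n + 1))  # fully reversed ranking
    return 100.0 * (raw_alignment(reference, ranking) - a_min) / (a_max - a_min)


reference = ["A", "B", "C", "D", "E"]
print(normalized_score(reference, ["B", "A", "C", "D", "E"]))  # top two swapped
```

Under these assumptions an identical ranking scores 100 and a fully reversed one scores 0. A per-virtue profile value could then be an average of such scores over the seven dilemmas, though the page does not say how the paper aggregates them.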

If this is right

LLM families produce stable virtue profiles across repeated evaluations.
The greatest variation across families occurs in Courage, Temperance, and Justice.
Profiles can be compared directly between models and with human respondents using the released interactive website.
The method allows measurement of relative virtue priorities without requiring a single correct answer.
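
The page does not define how "mean rank consistency" is computed. The sketch below shows one plausible reading, labeled as an assumption: for each response, find the position it most often occupies across repeated runs, then report the share of all placements that agree with that modal position. The function name and the metric definition are illustrative, not the paper's.

```python
from collections import Counter
from typing import Sequence


def rank_consistency(runs: Sequence[Sequence[str]]) -> float:
    """Percent of (run, response) placements that match the response's modal position.

    Assumed definition for illustration; the paper's own metric may differ.
    """
    if not runs:
        raise ValueError("need at least one run")
    labels = sorted(runs[0])
    agreeing = 0
    for label in labels:
        # Position of this response in every repeated run.
        positions = [list(run).index(label) for run in runs]
        # Count of runs that place it at its most frequent position.
        agreeing += Counter(positions).most_common(1)[0][1]
    return 100.0 * agreeing / (len(runs) * len(labels))


runs = [
    ["A", "B", "C", "D", "E"],
    ["A", "B", "C", "D", "E"],
    ["B", "A", "C", "D", "E"],  # one run swaps the top two responses
]
print(round(rank_consistency(runs), 1))
```

Averaging this value over dilemmas and models would give a single headline figure comparable in form to the reported 90.3 percent, under the stated assumption.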

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The observed consistency may reflect shared patterns in the training data of different model families.
The framework could track how additional training or alignment steps alter a model's virtue profile over successive versions.
Applying the same dilemmas to human respondents at scale would allow direct statistical comparison between human and LLM distributions on each virtue.

Load-bearing premise

The initially proposed orderings of the five responses for each dilemma and virtue accurately reflect Aristotelian virtue expression once confirmed by at least 95 percent of human respondents.

What would settle it

A new survey of over 100 respondents per ordering that finds agreement below 95 percent on any reference ranking, or repeated LLM evaluations that yield mean rank consistency substantially below 90.3 percent.
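
The retention rule behind this test can be written as a small check. This is a sketch under stated assumptions: "more than 100" is read as at least 101 respondents, "at least 95 percent" as an inclusive threshold, and the function and parameter names are hypothetical.

```python
def retain_ordering(confirmed: int, total: int,
                    threshold_pct: int = 95, min_total: int = 101) -> bool:
    """Return True if a proposed ordering would be kept as a reference ordering.

    confirmed: respondents who confirmed the proposed ordering
    total:     respondents who evaluated it
    """
    if not 0 <= confirmed <= total:
        raise ValueError("confirmed must be between 0 and total")
    if total < min_total:
        return False  # fewer respondents than the stated minimum
    # Integer arithmetic avoids floating-point edge cases at exactly 95%.
    return confirmed * 100 >= threshold_pct * total


print(retain_ordering(114, 120))  # exactly 95% of 120 respondents
print(retain_ordering(113, 120))  # just under 95%
```

A replication survey that makes this check fail for any retained ordering would be the first kind of counter-evidence described above.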

Figures

Figures reproduced from arXiv: 2606.28683 by Ioannis Tzachristas, John Pavlopoulos.

**Figure 1.** VirtueMap workflow. Dilemmas and response rankings are grounded through common-sense validation of virtue-expression orderings, then scored using normalized Borda alignment. Repeated LLM runs estimate model profiles and stability. Scores are computed on 0–100; pentagon plots use a 40–100 visual zoom only for readability. … reversal. The normalized virtue percentage is $S(d, v) = 100 \cdot \frac{A(d, v) - 35}{55 - 35}$. (…

**Figure 2.** Pentagon profiles for the nine measured LLM families. Scores are computed on 0–100 and …
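
The constants 35 and 55 in the Figure 1 excerpt can be recovered under one assumption that the page does not state explicitly: that the raw alignment $A(d, v)$ is the sum, over the five responses, of the product of each response's Borda weight (5 for first place down to 1 for last) in the reference ordering and in the respondent's ranking. The symbols $w^{\mathrm{ref}}_i$ and $w^{\mathrm{resp}}_i$ are illustrative notation, not the paper's.

```latex
\begin{align*}
A(d, v) &= \sum_{i=1}^{5} w^{\mathrm{ref}}_i \, w^{\mathrm{resp}}_i,
  \qquad w^{\mathrm{ref}}_i,\ w^{\mathrm{resp}}_i \in \{1, 2, 3, 4, 5\}, \\
A_{\max} &= 5^2 + 4^2 + 3^2 + 2^2 + 1^2 = 55
  && \text{(ranking identical to the reference)}, \\
A_{\min} &= 5 \cdot 1 + 4 \cdot 2 + 3 \cdot 3 + 2 \cdot 4 + 1 \cdot 5 = 35
  && \text{(ranking fully reversed)}, \\
S(d, v) &= 100 \cdot \frac{A(d, v) - 35}{55 - 35} \in [0, 100].
\end{align*}
```

On this reading a full reversal scores 0 and a perfect match scores 100, which would agree with the 0–100 scale quoted in the figure captions.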

Original abstract

Large Language Models (LLMs) often face ethical tradeoffs in which several responses may be defensible but express different priorities, such as fairness, honesty, courage, or restraint. We introduce VirtueMap, a framework for describing these patterns through an Aristotelian virtue-ethics lens. Instead of asking for a single correct answer, VirtueMap asks humans or LLMs to rank all five responses to each of seven general, non-lethal, non-political, and non-religious ethical dilemmas. To define the reference orderings used for scoring, we first proposed, for each dilemma and virtue, an ordering of the five responses from most to least expressive of that virtue. We then collected more than 100 respondent evaluations per ordering and retained it as operational ground truth only when at least 95% confirmed it. Rankings are scored against these retained orderings using normalized Borda alignment, yielding profiles over Practical Wisdom, Justice, Truthfulness, Courage, and Temperance. We apply VirtueMap to nine LLM families in a repeated-run evaluation and find high mean rank consistency (90.3%), with the largest differences appearing on Courage, Temperance, and Justice. We also release an interactive website that computes profiles locally in the browser and compares respondents with measured LLM profiles.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

VirtueMap gives a workable ranking method for LLM ethical profiles with solid consistency numbers, but the link to actual Aristotelian virtues is mostly author proposal plus crowd agreement rather than textual grounding.

The paper's main contribution is VirtueMap: a framework that has LLMs and humans rank five responses to each of seven everyday ethical dilemmas, then scores the rankings against reference orderings using normalized Borda counts. They validate the references by proposing an ordering per dilemma and virtue, collecting over 100 human judgments, and keeping only those with 95% agreement. Applied to nine LLM families, they report 90.3% mean rank consistency across repeated runs, with bigger spreads on Courage, Temperance, and Justice. They also ship a browser-based tool for local comparisons.

This is genuinely new as an empirical profiling technique. The repeated-run design and the decision to use full rankings instead of single answers are practical improvements over typical multiple-choice ethics tests. The 95% retention rule and the release of the site are concrete steps that make the method usable by others.

The soft spot is the Aristotelian part. The orderings begin with the authors' proposals; the human check only confirms that most respondents match those proposals. There is no derivation from the Nicomachean Ethics, no expert philosopher review, and no check that the five-response sets actually capture the relevant means and vices. If the proposals embed modern or author-specific intuitions, the resulting profiles measure alignment with those intuitions, not Aristotelian virtue. The dilemmas are deliberately non-lethal and non-political, which keeps the experiment clean but also limits how far the profiles can speak to real deployment questions.

This is worth sending to referees. The method is clear enough to evaluate, the consistency result is falsifiable, and the limitation on the virtue grounding is fixable with additional validation steps. Readers working on alignment benchmarks or ethics evaluation tooling will get the most out of it.

Referee Report

2 major / 2 minor

Summary. The paper introduces VirtueMap, a framework that profiles LLMs on five Aristotelian virtues (Practical Wisdom, Justice, Truthfulness, Courage, Temperance) by requiring models to rank all five responses to each of seven ethical dilemmas. Reference orderings are created by author proposal and retained only if confirmed by ≥95% of >100 human respondents; LLM outputs are scored via normalized Borda alignment. Applied to nine LLM families in repeated runs, the work reports 90.3% mean rank consistency, largest differences on Courage/Temperance/Justice, and releases a browser-based interactive tool for local profile computation.

Significance. If the retained orderings validly operationalize the virtues, the method supplies a reproducible, multi-response ranking approach to ethical evaluation that avoids single-answer assumptions and yields stable profiles across model families. The repeated-run design, Borda scoring, and open interactive website are concrete strengths that support reproducibility. The reported consistency statistic would be a useful benchmark for alignment work if the Aristotelian grounding holds.

major comments (2)

[Abstract / reference-ordering section] Abstract and the section describing reference-ordering construction: the claim that retained orderings constitute 'operational ground truth' for Aristotelian virtues rests on ≥95% human confirmation of author-proposed rankings. This procedure measures inter-rater agreement with the proposals, not correctness against the Nicomachean Ethics (e.g., courage as the mean between fear and confidence, or justice as giving each their due). No derivation from primary texts, no expert-philosopher validation, and no check that the five-response sets instantiate the relevant means/vices is described. Because the downstream Borda profiles and 90.3% consistency claim are scored against these orderings, the issue is load-bearing for the central 'Aristotelian virtue profiling' assertion.
[Dilemma and virtue mapping section] The seven dilemmas are presented as instantiating the five virtues, yet the manuscript provides no explicit mapping showing how each dilemma's response set realizes the Aristotelian mean/vice structure for the targeted virtue. Without this, it is unclear whether differences in LLM rankings reflect virtue expression or merely surface-level response preferences.

minor comments (2)

[Abstract] The abstract states 'more than 100 respondent evaluations per ordering' but does not report exact N, demographic details, or inter-rater reliability beyond the 95% threshold; adding these would improve transparency.
[Scoring section] Notation for normalized Borda alignment should be defined explicitly (e.g., formula and range) rather than left implicit.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the opportunity to respond to the referee's report. We address each major comment below and propose revisions where appropriate to strengthen the manuscript's clarity regarding its methodological foundations.

Referee: [Abstract / reference-ordering section] the claim that retained orderings constitute 'operational ground truth' for Aristotelian virtues rests on ≥95% human confirmation of author-proposed rankings. This procedure measures inter-rater agreement with the proposals, not correctness against the Nicomachean Ethics. No derivation from primary texts, no expert-philosopher validation, and no check that the five-response sets instantiate the relevant means/vices is described. The issue is load-bearing for the central 'Aristotelian virtue profiling' assertion.

Authors: We agree that the reference orderings originate from author proposals validated by high human agreement rather than derivation from the Nicomachean Ethics or expert-philosopher review. The phrase 'operational ground truth' was meant to denote a reproducible empirical benchmark but can overstate the philosophical grounding. In revision we will replace it with 'human-validated reference orderings' and add a limitations subsection explicitly noting the absence of direct textual derivation or expert validation. This preserves the empirical contribution while accurately representing the method's scope.
Revision: yes

Referee: [Dilemma and virtue mapping section] The seven dilemmas are presented as instantiating the five virtues, yet the manuscript provides no explicit mapping showing how each dilemma's response set realizes the Aristotelian mean/vice structure for the targeted virtue. Without this, it is unclear whether differences in LLM rankings reflect virtue expression or merely surface-level response preferences.

Authors: We acknowledge that the manuscript does not supply explicit per-dilemma mappings linking response sets to the Aristotelian mean/vice structure. The dilemmas were constructed to target the virtues via author interpretation, but this structure was not documented. We will add an appendix providing these mappings for all seven dilemmas, with brief justifications for how each response set spans the relevant mean. This addition will clarify the intended connection to virtue expression.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity; external human validation anchors the reference orderings

The paper defines reference orderings by first proposing them and then retaining only those confirmed by ≥95% of >100 independent human respondents per ordering. LLM rankings are then scored against these externally validated orderings via normalized Borda alignment. No step reduces the reported LLM virtue profiles or consistency statistics (e.g., 90.3% mean rank consistency) to quantities fitted from the LLM data itself, nor does any load-bearing premise rest on self-citation chains or self-definitional loops. The human confirmation step functions as an independent benchmark rather than an internal fit.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on one domain assumption about virtue orderings and one ad hoc threshold; no free parameters or invented entities are introduced.

axioms (2)

domain assumption: The authors' initial proposed orderings of responses correctly capture the degree to which each response expresses a given Aristotelian virtue.
Invoked when defining reference orderings before human validation (abstract).
ad hoc to paper: A 95% confirmation rate by respondents is sufficient to establish operational ground truth for the orderings.
Chosen threshold for retaining orderings (abstract).

pith-pipeline@v0.9.1-grok · 5756 in / 1512 out tokens · 47090 ms · 2026-06-30T10:16:01.530653+00:00 · methodology

Reference graph

Works this paper leans on

12 extracted references · 2 canonical work pages

[1] Nicomachean Ethics. 1999.
[2] On Virtue Ethics. 1999.
[3] Intelligent Virtue. 2011.
[4] Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Jerry Li, Dawn Song, and Jacob Steinhardt. 2021.
[5] Scruples: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes. 2021.
[6] Nino Scherrer, Claudia Shi, Amir Feder, and David M. Blei. Proceedings of the 37th International Conference on Neural Information Processing Systems. 2023.
[7] Marwa Abdulhai, Gregory Serapio-Garcia, Clément Crepy, Daria Valter, John Canny, and Natasha Jaques. arXiv:2310.15337.
[8] Modeling Annotator Disagreement with Demographic-Aware Experts and Synthetic Perspectives. 2025.
[9] John Pavlopoulos and Aristidis Likas. Polarized Opinion Detection Improves the Detection of Toxic Language. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. doi:10.18653/v1/2024.eacl-long.117
[10] Mihaela Constantinescu and Roger Crisp. Can Robotic … 2022.
[11] Aristotle's Ethics. The Stanford Encyclopedia of Philosophy. 2021.
[12] OpenRouter API Reference: Chat Completion. 2026.
