Counting Worlds Branching Time Semantics for post-hoc Bias Mitigation in generative AI
Pith reviewed 2026-05-10 01:38 UTC · model grok-4.3
The pith
A new logic counts possible AI outputs as worlds to verify and fix bias in generations
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CTLF adopts a counting worlds semantics where each world represents a possible output at a given step in the generation process and introduces modal operators that allow us to verify whether the current output series respects an intended probability distribution over a protected attribute, to predict the likelihood of remaining within acceptable bounds as new outputs are generated, and to determine how many outputs are needed to remove in order to restore fairness. The framework is illustrated on a toy example of biased image generation.
What carries the argument
Counting worlds semantics for branching-time logic, using modal operators to verify probability distributions, predict bound adherence, and compute removals for fairness restoration.
Load-bearing premise
The actual generative AI process can be modeled faithfully as a branching-time structure with countable worlds at each step and that the modal operators accurately capture and enforce the intended fairness properties outside of toy examples.
What would settle it
An empirical test where CTLF is applied to outputs from a real generative model, and either the fairness verification fails to match observed data or the recommended removals do not achieve the target distribution.
Figures
read the original abstract
Generative AI systems are known to amplify biases present in their training data. While several inference-time mitigation strategies have been proposed, they remain largely empirical and lack formal guarantees. In this paper we introduce CTLF, a branching-time logic designed to reason about bias in series of generative AI outputs. CTLF adopts a counting worlds semantics where each world represents a possible output at a given step in the generation process and introduces modal operators that allow us to verify whether the current output series respects an intended probability distribution over a protected attribute, to predict the likelihood of remaining within acceptable bounds as new outputs are generated, and to determine how many outputs are needed to remove in order to restore fairness. We illustrate the framework on a toy example of biased image generation, showing how CTLF formulas can express concrete fairness properties at different points in the output series.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces CTLF, a branching-time logic employing counting-worlds semantics in which each world represents a possible generative output at a given step. It defines modal operators intended to verify whether an output series respects a target probability distribution over a protected attribute, to predict the likelihood of remaining within fairness bounds as generation continues, and to compute the number of outputs that must be removed to restore compliance. These capabilities are illustrated via a toy example of biased image generation.
Significance. If the semantics and operators can be shown to correctly encode the intended fairness properties with respect to the actual (non-uniform) generative distribution, the framework would supply the first formal, post-hoc verification tool for bias mitigation in sequential generative outputs, replacing purely empirical strategies with checkable guarantees.
major comments (2)
- [Abstract / Semantics] Abstract and semantics section: the counting-worlds semantics is described as using cardinality ratios to evaluate probability operators, yet generative models induce non-uniform measures (softmax, diffusion schedules, etc.). No reduction to the model's actual probability measure is indicated, so the operators cannot correctly verify or predict respect for an intended distribution; this is load-bearing for all three claimed capabilities.
- [Abstract] Abstract: the manuscript states that CTLF supplies modal operators with the listed verification, prediction, and removal properties but provides neither the syntax, the satisfaction relation, nor any proof that the operators achieve these properties. Without these definitions the central claims cannot be checked.
minor comments (1)
- [Toy example] The toy example would be clearer if it included explicit CTLF formulas together with the step-by-step evaluation under the counting-worlds semantics.
Simulated Author's Rebuttal
We thank the referee for the thorough review and for identifying key areas where the formal foundations of CTLF require clarification and strengthening. We address each major comment below and outline the revisions we will undertake.
read point-by-point responses
-
Referee: [Abstract / Semantics] Abstract and semantics section: the counting-worlds semantics is described as using cardinality ratios to evaluate probability operators, yet generative models induce non-uniform measures (softmax, diffusion schedules, etc.). No reduction to the model's actual probability measure is indicated, so the operators cannot correctly verify or predict respect for an intended distribution; this is load-bearing for all three claimed capabilities.
Authors: We agree that this distinction is essential. The submitted manuscript defines the counting-worlds semantics via unweighted cardinality ratios, which corresponds to a uniform measure. To support the non-uniform distributions produced by generative models, we will revise the semantics section to introduce a weighted counting-worlds model. Each world will be equipped with a probability weight taken directly from the generative process (e.g., token probabilities or diffusion-step likelihoods). The modal operators will then be re-defined with respect to this weighted measure, and we will supply a formal reduction together with a proof that the verification, prediction, and removal operators correctly encode the intended fairness properties under the actual distribution. The toy example will be updated to use non-uniform probabilities. These changes will be incorporated in the revised manuscript. revision: yes
-
Referee: [Abstract] Abstract: the manuscript states that CTLF supplies modal operators with the listed verification, prediction, and removal properties but provides neither the syntax, the satisfaction relation, nor any proof that the operators achieve these properties. Without these definitions the central claims cannot be checked.
Authors: We acknowledge that the submitted version does not present the syntax, satisfaction relation, or proofs in sufficient detail for the claims to be independently verified. In the revision we will add a dedicated subsection (or expand Section 3) that gives the full syntax of CTLF, defines the satisfaction relation for each modal operator, and includes complete proofs that the operators realize the stated verification, prediction, and removal capabilities under the (revised, weighted) semantics. A brief summary of the syntax and satisfaction clauses will also be added to the abstract to guide readers to the formal material. revision: yes
Circularity Check
No circularity: CTLF is a definitional semantics for fairness properties
full rationale
The paper introduces CTLF as a new branching-time logic whose counting-worlds semantics and modal operators are explicitly defined to capture verification of probability distributions over protected attributes, prediction of future compliance, and computation of output removals needed for fairness. This is a standard definitional construction of a formal system rather than any derivation that reduces to its own fitted inputs or self-referential equations. No load-bearing self-citations, no parameters fitted to data then renamed as predictions, and no uniqueness theorems imported from prior author work appear in the provided abstract or description. The toy example simply illustrates the defined operators. The framework is self-contained against external benchmarks as a proposed logic; any mismatch with non-uniform generative probabilities is an assumption or modeling question, not a circular reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Generative AI output series can be modeled as branching time structures with countable worlds at each step.
invented entities (1)
-
CTLF logic with counting worlds semantics and modal operators
no independent evidence
Reference graph
Works this paper leans on
-
[1]
M. Antonelli, L. Ceragioli, A. Buda, and G. Primiero. A linear temporal logic of frequencies on series of events, 2026
work page 2026
-
[2]
M. Antonelli, U. Dal Lago, and P. Pistone. On counting propositional logic and Wag- ner’s hierarchy.Theoretical Computer Science, 966-967, 2023
work page 2023
-
[3]
M. Antonelli, U. Dal Lago, and P. Pistone. Towards logical foundations for probabilistic computation.Annals of Pure and Applied Logic, 175(9), 2024
work page 2024
-
[4]
S. G. Ayyamperumal and L. Ge. Current state of LLM risks and AI guardrails. 2024
work page 2024
-
[5]
F. Bacchus.Representing and Reasoning with Probabilistic Knowledge A Logical Ap- proach to Probabilities. The MIT Press, 1990
work page 1990
-
[6]
S. L. Blodgett et al. Bias and fairness in large language models: A survey.Computa- tional Linguistics, 2024
work page 2024
-
[7]
A. G. Buda, G. Coraglia, F. A. Genco, C. Manganini, and G. Primiero. Bias amplifi- cation chains in ml-based systems with an application to credit scoring, Jan 2023
work page 2023
-
[8]
A. G. Buda, G. Coraglia, F. A. Genco, C. Manganini, and G. Primiero. Bias amplifica- tion chains in ml-based systems with an application to credit scoring. In G. Coraglia, F. A. D’Asaro, A. Dyoub, F. A. Lisi, and G. Primiero, editors,Proceedings of the 3rd Workshop on Bias, Ethical AI, Explainability and the role of Logic and Logic Program- ming co-locate...
work page 2024
-
[9]
A. G. Buda, G. Coraglia, F. A. Genco, C. Manganini, and G. Primiero. Assessing the risk of discrimination with BRIO: A use case from the financial sector. In D. Pedreschi, M. Milano, I. Tiddi, S. Russell, C. Boldrini, L. Pappalardo, A. Passerini, and S. Wang, editors,Proceedings of the 4th International Conference on Hybrid Human-Artificial Intelligence, ...
work page 2025
-
[10]
A. G. Buda and G. Primiero. A logic for using information.Logique Et Analyse, 265(n/a):59–103, 2025
work page 2025
-
[11]
L. Ceragioli and G. Primiero. Trustworthiness preservation by copies of machine learn- ing systems.Int. J. Approx. Reason., 192:109638, 2026
work page 2026
-
[12]
T. Chen, G. Primiero, F. Raimondi, and N. Rungta. A computationally grounded, weighted doxastic logic.Stud Logica, 104(4):679–703, 2016. 10
work page 2016
- [13]
-
[14]
G. Coraglia, F. A. D’Asaro, F. A. Genco, D. Giannuzzi, D. Posillipo, G. Primiero, and C. Quaggio. Brioxalkemy: a bias detecting tool. In G. Boella, F. A. D’Asaro, A. Dyoub, L. Gorrieri, F. A. Lisi, C. Manganini, and G. Primiero, editors,Proceedings of the 2nd Workshop on Bias, Ethical AI, Explainability and the role of Logic and Logic Programming co-locat...
work page 2023
-
[15]
F. A. D’Asaro, F. A. Genco, and G. Primiero. Checking trustworthiness of probabilistic computations in a typed natural deduction system.J. Log. Comput., 35(6), 2025
work page 2025
- [16]
-
[17]
Y. Guo, M. Guo, J. Su, Z. Yang, and M. Zhu. Bias in large language models: Origin, evaluation, and mitigation.arXiv preprint arXiv:2411.10915, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[18]
S. Han, S. Avestimehr, and C. He. Bridging the safety gap: A guardrail pipeline for trustworthy llm inferences. 2025
work page 2025
- [19]
- [20]
-
[21]
L. Hutschenreiter, C. Baier, and J. Klein. Parametric markov chains: Pctl complexity and fraction-free gaussian elimination.Electronic Proceedings in Theoretical Computer Science, 256:16–30, Sept. 2017
work page 2017
- [22]
-
[23]
K. Kiashemshaki, M. J. Torkamani, N. Mahmoudi, and M. S. Bilehsavar. Simulating a bias mitigation scenario in large language models. 2025
work page 2025
- [24]
-
[25]
E. Kubyshkina and G. Primiero. A possible worlds semantics for trustworthy non- deterministic computations.Int. J. Approx. Reason., 172:109212, 2024
work page 2024
-
[26]
B. Legastelois, M.-J. Lesot, and A. R. d’Allonnes. Typology of axioms for a weighted modal logic.International Journal of Approximate Reasoning, 90:341–358, 2017
work page 2017
-
[27]
L. Lin, L. Wang, J. Guo, and K.-F. Wong. Investigating bias in llm-based bias detection: Disparities between llms and human perception. InProceedings of COLING, 2025. 11
work page 2025
-
[28]
M. Maaz and T. C. Y. Chan. Formal verification of markov processes with learned parameters, 2025
work page 2025
-
[29]
L. Monteiro Paes et al. Direct steering optimization for bias mitigation.arXiv preprint arXiv:2512.15926, 2025
-
[30]
A. Moskowakis. On a generalization of quantifiers.Fundamenta Mathematicae, 1957
work page 1957
-
[31]
N. J. Nilsson. Probabilistic logic.Artificial Intelligence, 28(1):71–87, 1986
work page 1986
-
[32]
P. Seshadri, S. Singh, and Y. Elazar. The bias amplification paradox in text-to-image generation, 2023
work page 2023
-
[33]
Z. Siddique, I. Khalid, et al. Shifting perspectives: Steering vector ensembles for robust bias mitigation in llms.arXiv preprint arXiv:2503.05371, 2025
-
[34]
A. A. Syed. Guardrails for large language models: A review of techniques and challenges. URF Journals, 2025
work page 2025
-
[35]
K. W. Wagner. The complexity of combinatorial problems with succinct input repre- sentation.Acta informatica, 23(3):325–356, 1986
work page 1986
-
[36]
H. Zhou, Z. Feng, Z. Zhu, J. Qian, and K. Mao. Unibias: unveiling and mitigating llm bias through internal attention and ffn manipulation. 2024. 12
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.