Explaining Neural Networks in Preference Learning: a Post-hoc Inductive Logic Programming Approach
Pith reviewed 2026-05-10 17:50 UTC · model grok-4.3
The pith
ILASP approximates neural networks for user preferences using weak constraints and PCA to yield transparent explanations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose using Learning from Answer Sets to approximate black-box models such as neural networks in the specific case of learning user preferences. We specifically explore the use of ILASP to approximate preference learning systems through weak constraints on a recipe preference dataset. Experiments investigate ILASP both as a global and a local approximator, with a PCA preprocessing step to reduce dimensionality while keeping explanations transparent.
What carries the argument
ILASP learns weak constraints that approximate the target neural network, with PCA dimensionality reduction as a preprocessing step to keep learning tractable and the resulting explanations transparent. A minimal sketch of such a weak-constraint program follows.
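To make the mechanism concrete, here is a minimal sketch of the kind of weak-constraint preference program ILASP could output, evaluated with clingo's Python API. The predicates (recipe/1, high_calorie/1, vegetarian/1), the penalty weights, and the choice rule are illustrative assumptions, not the paper's learned program.

```python
# A minimal sketch of an ASP preference program with weak constraints,
# evaluated with the clingo Python package. Predicates and weights are
# hypothetical stand-ins for what ILASP might learn.
import clingo

PROGRAM = """
recipe(r1). recipe(r2).
high_calorie(r1). vegetarian(r2).

% Learned preference knowledge as weak constraints: penalise high-calorie
% recipes more heavily than non-vegetarian ones.
:~ chosen(R), high_calorie(R). [2@1, R]
:~ chosen(R), not vegetarian(R). [1@1, R]

% Choose exactly one recipe; the optimal answer set is the preferred one.
1 { chosen(R) : recipe(R) } 1.
#show chosen/1.
"""

ctl = clingo.Control(["--opt-mode=optN"])  # enumerate optimal models
ctl.add("base", [], PROGRAM)
ctl.ground([("base", [])])
with ctl.solve(yield_=True) as handle:
    for model in handle:
        if model.optimality_proven:
            print(f"preferred: {model.symbols(shown=True)}, cost={model.cost}")
```

Run with the clingo package installed, this prints the optimal choice, chosen(r2) at cost [0]: the vegetarian, non-high-calorie recipe incurs no penalty, so the weak constraints directly explain the ranking.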
If this is right
- Black-box neural preference models can be replaced or augmented with interpretable symbolic programs without major loss of accuracy.
- The same approximation technique works for both global model understanding and local prediction explanations.
- High-dimensional preference data becomes manageable for symbolic approximation through PCA without sacrificing explanatory clarity.
- Computational time remains controlled even as feature spaces grow, due to the preprocessing step.
- Neural and logic-based methods can be combined for preference learning tasks that require both accuracy and transparency.
Where Pith is reading between the lines
- The PCA-ILASP pipeline could be tested on preference datasets from domains other than recipes to check generality.
- Hybrid systems might use the learned logic programs to audit or correct neural predictions in deployment.
- The method hints at using answer set programming more broadly as a post-hoc tool for explaining neural models in ranking or recommendation tasks.
Load-bearing premise
The assumption that ILASP approximations via weak constraints can achieve appropriate fidelity to the neural network while remaining computationally tractable and transparent after PCA reduction on high-dimensional preference data.
What would settle it
An experiment in which the ILASP approximation shows substantially lower fidelity to the neural network predictions on the recipe preference dataset or produces logic programs that are no longer transparently interpretable by humans.
Original abstract
In this paper, we propose using Learning from Answer Sets to approximate black-box models, such as Neural Networks (NN), in the specific case of learning user preferences. We specifically explore the use of ILASP (Inductive Learning of Answer Set Programs) to approximate preference learning systems through weak constraints. We have created a dataset on user preferences over a set of recipes, which is used to train the NNs that we aim to approximate with ILASP. Our experiments investigate ILASP both as a global and a local approximator of the NNs. These experiments address the challenge of approximating NNs working on increasingly high-dimensional feature spaces while achieving appropriate fidelity on the target model and limiting the increase in computational time. To handle this challenge, we propose a preprocessing step that exploits Principal Component Analysis to reduce the dataset's dimensionality while keeping our explanations transparent. Under consideration for publication in Theory and Practice of Logic Programming (TPLP).
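As a reading aid, the following hedged sketch mirrors the pipeline the abstract outlines: PCA-reduce the recipe features, query the trained NN (a stand-in scorer here) on a pair of recipes, and emit ILASP-style examples whose ordering reflects the NN's preference. The predicate names, the high/low discretization, the scorer, and the exact ILASP example syntax are all illustrative assumptions, not the paper's code.

```python
# Hedged pipeline sketch: PCA reduction -> stand-in NN scores -> ILASP-style
# ordering examples. Real inputs would be the recipe dataset and trained NN.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
recipes = rng.random((100, 20))          # stand-in recipe feature matrix
Z = PCA(n_components=3).fit_transform(recipes)

def nn_score(z):
    """Placeholder for the trained neural network's preference score."""
    return float(z @ np.array([0.7, -0.2, 0.1]))

def to_facts(z):
    """Discretize each component into high/low facts an ASP program can use."""
    return " ".join(
        f"{'high' if v > 0 else 'low'}_pc{i + 1}(r)." for i, v in enumerate(z)
    )

a, b = Z[0], Z[1]
print(f"#pos(e1, {{}}, {{}}, {{ {to_facts(a)} }}).")
print(f"#pos(e2, {{}}, {{}}, {{ {to_facts(b)} }}).")
# The NN's ranking becomes an ordering example (approximate ILASP syntax;
# the exact dialect varies across ILASP versions):
first, second = ("e1", "e2") if nn_score(a) > nn_score(b) else ("e2", "e1")
print(f"#brave_ordering(o1, {first}, {second}).")
```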
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a post-hoc method to explain neural networks trained on user recipe preferences by approximating them with ILASP (Inductive Learning of Answer Set Programs) using weak constraints. It examines ILASP both as a global approximator and a local one, creates a preference dataset for training the target NNs, and introduces PCA-based dimensionality reduction to address high-dimensional feature spaces while claiming to preserve explanation transparency. Experiments aim to balance fidelity to the NN, computational tractability, and interpretability.
Significance. If the fidelity and transparency claims hold with supporting quantitative evidence, this would offer a concrete bridge between neural preference models and symbolic logic programs, advancing explainable AI in recommendation domains. The use of a real user-preference dataset and explicit handling of dimensionality challenges via PCA is a practical strength; successful global/local comparisons could inform hybrid neuro-symbolic systems.
major comments (2)
- [Preprocessing with PCA] The preprocessing step with PCA (described in the abstract and method): the assertion that explanations remain transparent after dimensionality reduction is load-bearing for the post-hoc motivation, yet PCA produces uncorrelated linear combinations of original features (e.g., recipe attributes) that lack direct human-interpretable semantics. No mapping from learned weak constraints back to original features is provided, nor is there an interpretability evaluation or user study comparing post-PCA programs to the original NN.
- [Experiments] Experiments section: the central claim that ILASP achieves 'appropriate fidelity' while remaining tractable requires concrete metrics (e.g., approximation accuracy or error rates relative to the NN, runtime figures, and baseline comparisons). The abstract provides none, and without these the evaluation of global vs. local approximation and the PCA benefit cannot be verified.
minor comments (1)
- [Abstract] Abstract: the phrase 'keeping our explanations transparent' is repeated without a precise definition of transparency in this context (e.g., syntactic simplicity of the ASP program or semantic alignment with original features).
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback, which helps clarify the strengths and areas for improvement in our post-hoc ILASP approximation approach for neural preference models. We address each major comment point by point below, with honest indications of where revisions will strengthen the manuscript without altering its core claims.
Point-by-point responses
- Referee: [Preprocessing with PCA] The preprocessing step with PCA (described in the abstract and method): the assertion that explanations remain transparent after dimensionality reduction is load-bearing for the post-hoc motivation, yet PCA produces uncorrelated linear combinations of original features (e.g., recipe attributes) that lack direct human-interpretable semantics. No mapping from learned weak constraints back to original features is provided, nor is there an interpretability evaluation or user study comparing post-PCA programs to the original NN.
Authors: We acknowledge that PCA produces linear combinations of the original recipe features, which can reduce direct semantic interpretability of the input space. Our transparency claim centers on the output of ILASP: the learned weak constraints remain symbolic, human-readable logic programs, providing a post-hoc explanation distinct from the NN's opaque parameters. To address the concern directly, we will revise the manuscript to (i) include the PCA loadings matrix with examples mapping principal components back to dominant original attributes (e.g., ingredient or nutritional features), (ii) show how a sample weak constraint on a PC can be re-expressed in terms of those attributes (a hedged sketch of such a mapping appears after these responses), and (iii) add a qualitative interpretability analysis comparing program complexity and readability before and after PCA. A full user study lies outside the current scope and resources, but the added mapping and analysis will allow readers to assess transparency. These changes will be incorporated in the revised version. revision: yes
- Referee: [Experiments] Experiments section: the central claim that ILASP achieves 'appropriate fidelity' while remaining tractable requires concrete metrics (e.g., approximation accuracy or error rates relative to the NN, runtime figures, and baseline comparisons). The abstract provides none, and without these the evaluation of global vs. local approximation and the PCA benefit cannot be verified.
Authors: The experiments section reports fidelity via agreement rates between ILASP predictions and the target NN on held-out preference data, plus runtime measurements across global/local modes and varying PCA dimensions. We agree these results benefit from more explicit presentation. In revision we will add a dedicated table with concrete values: approximation accuracy (percentage of matching predictions), error rates (e.g., disagreement percentage or MSE on preference scores), runtimes in seconds, and direct comparisons to baselines such as linear models and decision-tree approximators. We will also revise the abstract to summarize the key quantitative outcomes (e.g., fidelity levels retained with PCA). This will make the global vs. local and PCA-benefit claims verifiable. revision: yes
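A minimal sketch of the loadings-based mapping promised in point (i) above, assuming hypothetical feature names and random stand-in data: each principal component is summarized by the original attributes with the largest absolute loadings, so a weak constraint over a component can be read in terms of named recipe features.

```python
# Map principal components back to dominant original recipe attributes via the
# PCA loadings. Feature names and data are hypothetical stand-ins.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.random((200, 5))                  # stand-in recipe feature matrix
features = ["calories", "fat", "sugar", "prep_time", "spiciness"]
pca = PCA(n_components=2).fit(X)

# For each component, list the attributes with the largest absolute loadings,
# so a constraint like ":~ high_pc1(R). [1@1, R]" can be read in named terms.
for i, component in enumerate(pca.components_):
    top = np.argsort(np.abs(component))[::-1][:2]
    terms = [f"{component[j]:+.2f}*{features[j]}" for j in top]
    print(f"pc{i + 1} ~ {' '.join(terms)}")
```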
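And a sketch of the fidelity metric the second response describes: the agreement rate between the ILASP approximation and the target NN on held-out pairwise choices. The prediction vectors are stand-ins; real values would come from the two models.

```python
# Fidelity as agreement rate on held-out pairs; 1 means "prefers the first
# recipe of the pair". Both vectors are illustrative stand-ins.
import numpy as np

nn_preds = np.array([1, 0, 1, 1, 0, 1])      # target NN's pairwise choices
ilasp_preds = np.array([1, 0, 0, 1, 0, 1])   # ILASP's choices on same pairs

fidelity = (nn_preds == ilasp_preds).mean()  # fraction of matching predictions
print(f"fidelity (agreement rate): {fidelity:.1%}")  # here: 83.3%
```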
Circularity Check
No circularity: empirical approximation method with no self-referential derivations
Full rationale
The paper proposes an empirical post-hoc method: train neural networks on a custom recipe preference dataset, then apply ILASP (via weak constraints) as global or local approximators, with optional PCA preprocessing for high-dimensional inputs. No equations, uniqueness theorems, or derivations are presented that reduce to fitted parameters or prior results by construction. Claims of fidelity and transparency rest on experimental evaluation rather than self-definition or load-bearing self-citations. The approach is self-contained as a methodological proposal tested on held-out data, with no patterns matching self-definitional, fitted-prediction, or ansatz-smuggling circularity.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Neural networks trained on preference data can be approximated by inductive logic programs learned via ILASP.
- domain assumption: Principal Component Analysis reduces dimensionality without destroying the preference structure needed for faithful approximation.
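One way the second assumption might be probed empirically, with random stand-in data: check how much feature variance the retained components explain. Explained variance is only a proxy for preserved preference structure; a sharper test would compare ILASP's fidelity to the NN before and after reduction.

```python
# Probe of the PCA assumption: variance retained by the kept components.
# Data and the component count are illustrative stand-ins.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.random((500, 20))                 # stand-in recipe feature matrix
pca = PCA(n_components=5).fit(X)
retained = pca.explained_variance_ratio_.sum()
print(f"variance retained by 5 components: {retained:.1%}")
```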