pith. machine review for the scientific record.

arxiv: 2604.05136 · v1 · submitted 2026-04-06 · 💻 cs.AI

Recognition: 2 theorem links


Non-monotonic causal discovery with Kolmogorov-Arnold Fuzzy Cognitive Maps

Jose L. Salmeron


Pith reviewed 2026-05-10 19:14 UTC · model grok-4.3

classification 💻 cs.AI
keywords fuzzy cognitive maps · kolmogorov-arnold representation · non-monotonic causality · b-spline functions · interpretable ai · causal modeling · neuro-symbolic methods

The pith

KA-FCMs place learnable B-spline functions on graph edges to model arbitrary non-monotonic causal relationships.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Kolmogorov-Arnold Fuzzy Cognitive Maps to overcome a core limit in standard FCMs. Conventional versions rely on scalar weights and monotonic node activations, so they cannot represent effects that rise then fall or oscillate. The new architecture draws on the Kolmogorov-Arnold theorem to move the non-linearity onto the edges themselves by fitting a univariate B-spline function to each causal link. This keeps the graph sparse and interpretable yet allows each connection to express any shape of influence. Experiments across non-monotonic inference, symbolic regression, and chaotic forecasting show the maps outperform regular FCMs and match multilayer perceptrons while still letting users read the learned causal functions directly from the edges.
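The mechanical difference can be made concrete with a toy sketch. Everything below is an illustrative assumption, not the paper's code: the two-node graph, the clip-to-[0, 1] normalization, and the spline shape are invented for the example. A standard FCM pushes a weighted sum through a monotonic squashing function; the KA-FCM-style step instead applies a learned univariate function to each edge before summing.

```python
import numpy as np
from scipy.interpolate import BSpline

# Standard FCM step: monotonic sigmoid squashing of a weighted sum.
def fcm_step(a, W):
    return 1.0 / (1.0 + np.exp(-(W.T @ a)))

# KA-FCM-style step (sketch): each edge (i, j) carries its own
# univariate function phi_ij; node j sums the transformed inputs.
# The clip to [0, 1] is an assumed normalization, not the paper's.
def ka_fcm_step(a, edge_fns):
    n = len(a)
    out = np.zeros(n)
    for j in range(n):
        out[j] = sum(float(edge_fns[i][j](a[i])) for i in range(n))
    return np.clip(out, 0.0, 1.0)

# A non-monotonic edge: a cubic B-spline shaped like an inverted U,
# something no scalar weight can express.
knots = np.array([0., 0., 0., 0., 0.5, 1., 1., 1., 1.])
coeffs = np.array([0.0, 0.3, 1.0, 0.3, 0.0])
bump = BSpline(knots, coeffs, k=3)

identity = lambda x: x
edge_fns = [[identity, bump],       # edges out of node 0
            [identity, identity]]   # edges out of node 1
print(ka_fcm_step(np.array([0.9, 0.2]), edge_fns))
```

Note how the graph topology is identical in both steps; only what travels along an edge changes, which is the sense in which the paper keeps the map sparse while gaining expressivity.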

Core claim

By replacing static scalar weights with learnable univariate B-spline functions on the edges, the KA-FCM redefines causal transmission so that non-linearity resides in the influence phase rather than the aggregation phase. This change permits arbitrary non-monotonic dependencies without added nodes or denser graphs. The claim is backed empirically by higher accuracy than particle-swarm-trained standard FCMs and competitive results with MLPs on the Yerkes-Dodson law, symbolic regression, and chaotic time-series tasks, while the graph structure remains intact for interpretability and explicit law extraction.

What carries the argument

Learnable univariate B-spline functions placed on the edges of the fuzzy cognitive map, which transmit causal influences according to the Kolmogorov-Arnold representation theorem.

If this is right

  • KA-FCMs can represent saturation effects and periodic dynamics directly on the causal links.
  • Graph-based interpretability is preserved, enabling extraction of explicit mathematical laws from the learned edge functions.
  • Recurrent inference remains possible while accuracy exceeds that of particle-swarm-optimized standard FCMs.
  • Performance reaches levels comparable to multilayer perceptrons without sacrificing neuro-symbolic structure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same edge-function replacement could be tested in other causal-graph architectures to increase their expressive power without adding hidden layers.
  • Domains that contain known non-linear feedback loops, such as certain biological or economic processes, become candidates for direct function readout rather than black-box fitting.
  • Scaling experiments on larger graphs would reveal whether the spline parameterization remains stable and interpretable at higher edge counts.

Load-bearing premise

That univariate B-spline functions fitted to individual edges can faithfully approximate the true non-monotonic causal dependencies present in the data and that the training procedure will reliably recover stable, useful functions.

What would settle it

A controlled test on data generated from a known non-monotonic relation, such as a quadratic or sinusoidal function. If the learned edge splines produce visibly wrong shapes, or fail to outperform a standard monotonic FCM on held-out predictions, the core claim falls.
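That test is cheap to sketch. The few lines below are a toy stand-in for the protocol, not the paper's experiment; the knot placement, the sinusoidal ground truth, and the single-scalar-weight baseline are all assumptions made for illustration.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

rng = np.random.default_rng(0)

# Ground truth: a known non-monotonic causal relation, y = sin(pi x).
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(np.pi * x) + 0.01 * rng.normal(size=x.size)

x_tr, y_tr = x[::2], y[::2]    # even indices: training
x_te, y_te = x[1::2], y[1::2]  # odd indices: held out

# "Edge spline": least-squares cubic B-spline with three interior knots.
t = np.r_[[0.0] * 4, [0.25, 0.5, 0.75], [1.0] * 4]
spline = make_lsq_spline(x_tr, y_tr, t, k=3)

# "Scalar edge weight" baseline: best single weight, y ~ w * x.
w = float(x_tr @ y_tr) / float(x_tr @ x_tr)

mse_spline = float(np.mean((spline(x_te) - y_te) ** 2))
mse_scalar = float(np.mean((w * x_te - y_te) ** 2))
print(f"spline edge MSE: {mse_spline:.5f}  scalar edge MSE: {mse_scalar:.5f}")
```

On this synthetic relation the spline edge should win by a wide margin; the settling question is whether the same recovery holds on the paper's real tasks.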

Figures

Figures reproduced from arXiv: 2604.05136 by Jose L. Salmeron.

Figure 1. Causal relationship comparison.
Figure 2. Kolmogorov-Arnold Fuzzy Cognitive Map architecture.
Figure 3. Comparative modeling of the non-monotonic Yerkes-Dodson law.
Figure 5. One-step-ahead forecasting performance on the chaotic Mackey-Glass.
Original abstract

Fuzzy Cognitive Maps constitute a neuro-symbolic paradigm for modeling complex dynamic systems, widely adopted for their inherent interpretability and recurrent inference capabilities. However, the standard FCM formulation, characterized by scalar synaptic weights and monotonic activation functions, is fundamentally constrained in modeling non-monotonic causal dependencies, thereby limiting its efficacy in systems governed by saturation effects or periodic dynamics. To overcome this topological restriction, this research proposes the Kolmogorov-Arnold Fuzzy Cognitive Map (KA-FCM), a novel architecture that redefines the causal transmission mechanism. Drawing upon the Kolmogorov-Arnold representation theorem, static scalar weights are replaced with learnable, univariate B-spline functions located on the model edges. This fundamental modification shifts the non-linearity from the nodes' aggregation phase directly to the causal influence phase. This modification allows for the modeling of arbitrary, non-monotonic causal relationships without increasing the graph density or introducing hidden layers. The proposed architecture is validated against both baselines (standard FCM trained with Particle Swarm Optimization) and universal black-box approximators (Multi-Layer Perceptron) across three distinct domains: non-monotonic inference (Yerkes-Dodson law), symbolic regression, and chaotic time-series forecasting. Experimental results demonstrate that KA-FCMs significantly outperform conventional architectures and achieve competitive accuracy relative to MLPs, while preserving graph-based interpretability and enabling the explicit extraction of mathematical laws from the learned edges.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes Kolmogorov-Arnold Fuzzy Cognitive Maps (KA-FCMs) as an extension of standard FCMs, replacing scalar synaptic weights with learnable univariate B-spline functions on the edges. This shift, motivated by the Kolmogorov-Arnold representation theorem, moves non-linearity to the causal influence phase to enable modeling of arbitrary non-monotonic relationships (e.g., saturation or periodic effects) without added graph density or hidden layers. The architecture is evaluated on three tasks—non-monotonic inference via the Yerkes-Dodson law, symbolic regression, and chaotic time-series forecasting—claiming significant outperformance over PSO-trained FCM baselines and competitive accuracy with MLPs while preserving graph interpretability for explicit mathematical law extraction from edges.

Significance. If the empirical claims hold under rigorous validation, the work would offer a meaningful advance in neuro-symbolic modeling by combining the interpretability and recurrent dynamics of FCMs with flexible non-monotonic function approximation. This could benefit causal discovery and system modeling in domains with non-monotonic dependencies, providing a graph-structured alternative to black-box models like MLPs while enabling direct extraction of functional forms. The approach credits Kolmogorov-Arnold Networks as the inspiration for the edge-based univariate functions.

major comments (3)
  1. [Abstract] Abstract and experimental validation: The abstract asserts that KA-FCMs 'significantly outperform conventional architectures and achieve competitive accuracy relative to MLPs' across three domains, yet supplies no quantitative metrics, error bars, statistical tests, data splits, training details, or baseline comparisons. This absence makes the central performance claim impossible to evaluate and is load-bearing for the paper's contribution.
  2. [Method / Recurrent Inference] The stability of recurrent inference under non-monotonic B-spline activations is not analyzed. Because the model applies the learned univariate functions repeatedly in the FCM dynamics (rather than monotonic node activations), local approximation errors from finite-knot B-splines could amplify into divergence or chaotic behavior, particularly in the chaotic time-series task; no convergence guarantees, sensitivity analysis, or ablation on knot degree/placement is provided.
  3. [Experiments / Interpretability] Interpretability claim: The paper states that mathematical laws can be 'explicitly extracted' from the learned edges, but provides no concrete examples, extraction procedure, or quantitative measure of how faithfully the extracted B-spline expressions recover ground-truth non-monotonic relations in the symbolic regression or Yerkes-Dodson experiments.
minor comments (2)
  1. [Method] Notation for the B-spline functions (knot vectors, degree, coefficients) should be defined more explicitly with an equation in the method section to avoid ambiguity when comparing to standard FCM weight matrices.
  2. [Abstract / Introduction] The abstract and introduction would benefit from a brief statement of the optimization procedure (PSO vs. gradient-based) used for the B-spline coefficients, as this directly affects reproducibility.
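The referee's stability worry in major comment 2 is easy to probe empirically, if not to settle. The sketch below iterates a single invented non-monotonic spline edge as a one-node recurrence; the spline coefficients are a made-up stand-in for a learned edge function, and because the range is clipped, boundedness is trivial, so the probe only reveals whether iterates settle to a fixed point or keep oscillating.

```python
import numpy as np
from scipy.interpolate import BSpline

# Invented stand-in for a learned non-monotonic edge function on [0, 1].
knots = np.array([0., 0., 0., 0., 0.5, 1., 1., 1., 1.])
phi = BSpline(knots, np.array([0.1, 0.9, 0.2, 0.8, 0.1]), k=3)

# One-node recurrence a(t+1) = clip(phi(a(t))). Clipping enforces
# boundedness by construction; what remains observable is whether the
# trajectory converges, cycles, or wanders.
def trajectory(a0, steps=50):
    traj = [float(a0)]
    for _ in range(steps):
        traj.append(float(np.clip(phi(traj[-1]), 0.0, 1.0)))
    return traj

for a0 in (0.05, 0.5, 0.95):
    tail = trajectory(a0)[-4:]
    print(a0, [f"{v:.3f}" for v in tail])
```

A proper answer to the referee would run this style of check across knot counts and degrees, which is what the rebuttal's promised ablation amounts to.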

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating planned revisions where the manuscript can be strengthened without misrepresenting our contributions.

Point-by-point responses
  1. Referee: [Abstract] Abstract and experimental validation: The abstract asserts that KA-FCMs 'significantly outperform conventional architectures and achieve competitive accuracy relative to MLPs' across three domains, yet supplies no quantitative metrics, error bars, statistical tests, data splits, training details, or baseline comparisons. This absence makes the central performance claim impossible to evaluate and is load-bearing for the paper's contribution.

    Authors: We agree that including key quantitative indicators in the abstract would improve immediate evaluability. The full manuscript already contains the requested details (metrics, error bars, data splits, training procedures, and baseline comparisons) in the Experiments section. We will revise the abstract to incorporate representative quantitative results, such as mean errors with standard deviations and brief mention of statistical comparisons, while keeping the abstract concise. revision: yes

  2. Referee: [Method / Recurrent Inference] The stability of recurrent inference under non-monotonic B-spline activations is not analyzed. Because the model applies the learned univariate functions repeatedly in the FCM dynamics (rather than monotonic node activations), local approximation errors from finite-knot B-splines could amplify into divergence or chaotic behavior, particularly in the chaotic time-series task; no convergence guarantees, sensitivity analysis, or ablation on knot degree/placement is provided.

    Authors: This is a substantive point on the recurrent dynamics. Our empirical results across the chaotic forecasting experiments exhibited stable convergence without observed divergence, but we did not include formal analysis or ablations. In revision we will add sensitivity analysis on knot count and degree, plus ablation studies on these hyperparameters for the time-series task. Theoretical convergence guarantees for arbitrary learned non-monotonic functions in recurrent settings lie outside the paper's empirical scope and will be noted as a limitation. revision: partial

  3. Referee: [Experiments / Interpretability] Interpretability claim: The paper states that mathematical laws can be 'explicitly extracted' from the learned edges, but provides no concrete examples, extraction procedure, or quantitative measure of how faithfully the extracted B-spline expressions recover ground-truth non-monotonic relations in the symbolic regression or Yerkes-Dodson experiments.

    Authors: We concur that explicit demonstrations would better support the interpretability claims. The manuscript motivates extraction via the univariate B-spline representation but lacks worked examples and fidelity metrics. We will add concrete extraction examples from the symbolic regression and Yerkes-Dodson experiments, describe the procedure (reading spline coefficients and reconstructing the univariate function), and report quantitative measures such as mean squared error between extracted functions and ground-truth relations. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in claimed derivation

Full rationale

The paper defines KA-FCM by replacing scalar weights with edge-wise univariate B-spline functions, invoking the Kolmogorov-Arnold theorem to justify the shift of nonlinearity to the causal phase. This is a direct architectural modification whose claimed ability to represent non-monotonic relations follows from the univariate approximators and the recurrent FCM update rule; it does not reduce any result to a fitted quantity or prior self-citation by construction. Validation consists of independent comparisons to PSO-trained FCM and MLP baselines on external tasks (Yerkes-Dodson, symbolic regression, time-series), with no load-bearing step that equates a prediction to its own inputs or renames a known pattern. The derivation chain remains self-contained against the stated assumptions.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the Kolmogorov-Arnold theorem justifying univariate edge functions and on the empirical claim that B-splines plus optimization suffice for the reported tasks; no new physical entities are introduced.

free parameters (1)
  • B-spline coefficients and knot placements
    These parameters of the univariate functions on each edge are fitted during learning and directly determine the shape of each causal relationship.
axioms (1)
  • standard math Kolmogorov-Arnold representation theorem permits representation of multivariate functions via sums of univariate functions
    Invoked to motivate moving non-linearity from node activations to edge functions.
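For reference, the axiom's content is the classical superposition formula: every continuous multivariate function on the unit cube decomposes into sums and compositions of continuous univariate functions.

```latex
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
\qquad f \in C\big([0,1]^n\big),\ \ \Phi_q,\ \phi_{q,p} \in C(\mathbb{R}).
```

On the review's reading, the KA-FCM borrows the theorem architecturally, placing all non-linearity in univariate edge functions, rather than implementing this exact two-layer decomposition.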

pith-pipeline@v0.9.0 · 5539 in / 1328 out tokens · 74415 ms · 2026-05-10T19:14:14.111897+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

34 extracted references · 1 canonical work page · 1 internal anchor

  1. [1] B. Kosko, “Fuzzy cognitive maps,” International Journal of Man-Machine Studies, vol. 24, no. 1, pp. 65–75, 1986.
  2. [2] J. L. Salmeron, S. Rahimi, A. Navalie, and A. Sadeghpour, “Medical diagnosis of rheumatoid arthritis using data driven pso-fcm,” Neurocomputing, vol. 232, pp. 104–112, 2017.
  3. [3] J. L. Salmeron, A. Ruiz-Celma, and A. Mena, “Learning fcms with multi-local and balanced memetic algorithms for forecasting drying processes,” Neurocomputing, vol. 232, pp. 52–57, 2017.
  4. [4] J. Wang, Z. Peng, X. Wang, and J. Wu, “Deep fuzzy cognitive maps for interpretable multivariate time series prediction,” IEEE Transactions on Fuzzy Systems, vol. 28, no. 7, pp. 1–1, 2020.
  5. [5] Z. Liu, Y. Wang, S. Vaidya, F. Ruehle, J. Halverson, M. Soljačić, T. Y. Hou, and M. Tegmark, “KAN: Kolmogorov-Arnold networks,” arXiv preprint arXiv:2404.19756, 2024.
  6. [6] A. N. Kolmogorov, “On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition,” Doklady Akademii Nauk SSSR, vol. 114, no. 5, pp. 953–956, 1957.
  7. [7] J. L. Salmeron and I. Arevalo, “Blind federated learning without initial model,” Journal of Big Data, vol. 11, no. 56, pp. 1–31, 2024.
  8. [8] J. L. Salmeron and I. Arevalo, “Concurrent vertical and horizontal federated learning with fuzzy cognitive maps,” Future Generation Computer Systems, vol. 162, p. 107482, 2025.
  9. [9] G. Nápoles, J. L. Salmeron, and K. Vanhoof, “Construction and supervised learning of long-term grey cognitive networks,” IEEE Transactions on Cybernetics, vol. 51, no. 2, pp. 686–695, 2021.
  10. [10] F. Vanhoenshoven, G. Nápoles, W. Froelich, J. L. Salmeron, and K. Vanhoof, “Pseudoinverse learning of fuzzy cognitive maps for multivariate time series forecasting,” Applied Soft Computing, vol. 95, p. 106461, 2020.
  11. [11] J. L. Salmeron, T. Mansouri, M. Moghadam, and A. Mardani, “Learning fuzzy cognitive maps with modified asexual reproduction optimization algorithm,” Knowledge-Based Systems, vol. 163, pp. 723–735, 2019.
  12. [12] J. L. Salmeron and P. Palos, “Uncertainty propagation in fuzzy grey cognitive maps with hebbian-like learning algorithms,” IEEE Transactions on Cybernetics, vol. 49, no. 1, pp. 211–220, 2019.
  13. [13] G. Nápoles, J. L. Salmeron, and Y. Salgueiro, “Inverse simulation learning of quasi-nonlinear fuzzy cognitive maps,” Neurocomputing, vol. 650, p. 130864, 2025.
  14. [14] L. Wu, G. Feng, X. Liu, W. Pedrycz, L. Zhang, and J. Yang, “Fast and effective learning for fuzzy cognitive maps: A method based on solving constrained convex optimization problems,” IEEE Transactions on Fuzzy Systems, vol. 29, no. 11, pp. 2958–2971, 2020.
  15. [15] E. I. Papageorgiou, C. D. Stylios, and P. P. Groumpos, “Learning algorithms for fuzzy cognitive maps–a review study,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 34, no. 1, pp. 86–97, 2004.
  16. [16] J. L. Salmeron and W. Froelich, “Dynamic optimization of fuzzy cognitive maps for time series forecasting,” Knowledge-Based Systems, vol. 105, pp. 29–37, 2016.
  17. [17] A. Nikseresht, M. Zandieh, and M. Shokouhifar, “Time-series forecasting using improved empirical fourier decomposition and high-order intuitionistic fcm: Applications in smart manufacturing systems,” IEEE Transactions on Fuzzy Systems, vol. 33, no. 12, pp. 4201–4213, 2025.
  18. [18] Y. Wang and W. Pedrycz, “Equipping high-order fuzzy cognitive map with interpretable weights for multivariate time series forecasting,” IEEE Transactions on Fuzzy Systems, 2023.
  19. [19] J. L. Salmeron, “Modelling grey uncertainty with fuzzy grey cognitive maps,” Expert Systems with Applications, vol. 37, no. 12, pp. 7581–7588, 2010.
  20. [20] J. L. Salmeron, “A fuzzy grey cognitive maps-based intelligent security system,” in 2015 IEEE International Conference on Grey Systems and Intelligent Services (GSIS), Leicester, UK, Aug 2015, pp. 151–156.
  21. [21] W. Froelich and J. L. Salmeron, “Evolutionary learning of fuzzy grey cognitive maps for the forecasting of multivariate, interval-valued time series,” International Journal of Approximate Reasoning, vol. 55, no. 6, pp. 1319–1335, 2014.
  22. [22] J. L. Salmeron and E. I. Papageorgiou, “Fuzzy grey cognitive maps and nonlinear hebbian learning in process control,” Applied Intelligence, vol. 41, no. 1, pp. 223–234, 2014.
  23. [23] J. L. Salmeron, “An autonomous fgcm-based system for surveillance assets coordination,” The Journal of Grey Systems, vol. 28, no. 1, pp. 27–35, 2016.
  24. [24] W. Pedrycz and W. Homenda, “A comprehensive framework for designing and learning fuzzy cognitive maps at the granular level,” IEEE Transactions on Fuzzy Systems, 2022.
  25. [25] P. Vazan et al., “Extension to a fuzzy cognitive maps-based approach for modelling granular time series for forecasting tasks,” Knowledge-Based Systems, 2023.
  26. [26] S.-Y. Cho, S. Lee, and H.-G. Kim, “Forecasting vix using interpretable kolmogorov-arnold networks,” Expert Systems with Applications, vol. 294, p. 128781, 2025. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0957417425023991
  27. [27] J. Aguilar et al., “Explainability analysis: An in-depth comparison between fuzzy cognitive maps and lamda,” Applied Soft Computing, 2023.
  28. [28] R. Guerrero-Gomez-Olmedo, J. L. Salmeron, and C. Kuchkovsky, “Lrp-based path relevances for global explanation of deep architectures,” Neurocomputing, vol. 381, pp. 252–260, 2020.
  29. [29] Z. Li, X. Liu, Y. Zhang, J. Qin, W. X. Zheng, and J. Wang, “Learning high-order fuzzy cognitive maps via multimodal artificial bee colony algorithm and nearest-better clustering: Applications on multivariate time series prediction,” Knowledge-Based Systems, vol. 295, p. 111771, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/...
  30. [30] L. Concepción, G. Nápoles, A. Jastrzębska, I. Grau, and Y. Salgueiro, “Estimating the limit state space of quasi-nonlinear fuzzy cognitive maps,” Applied Soft Computing, vol. 69, p. 112604, 2025.
  31. [31] S. Bueno and J. L. Salmeron, “Benchmarking main activation functions in fuzzy cognitive maps,” Expert Systems with Applications, vol. 36, no. 3, part 1, pp. 5221–5229, 2009.
  32. [32] P. Tomiło, J. Laskowski, and A. Laskowska, “Artificial neural network model based on kolmogorov-arnold representation theorem and retention mechanism for real-time aircraft flight phases classification,” Engineering Applications of Artificial Intelligence, vol. 160, p. 112004, 2025. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S...
  33. [33] R. L. Burden, J. D. Faires, and A. M. Burden, Numerical Analysis, 10th ed. Boston, MA: Cengage Learning, 2015.
  34. [34] C. de Boor, A Practical Guide to Splines. New York: Springer-Verlag, 2001.

Biography. Jose L. Salmeron is a Professor in Artificial Intelligence with CUNEF University. He has almost 30 years of experience in technology and research, including academic positions at several universities, consulting in the IT industry, and a wide range of collaborations wit...