arxiv: 2604.08432 · v2 · pith:LO6524MPnew · submitted 2026-04-09 · ⚛️ physics.optics · cs.AI

Small-scale photonic Kolmogorov-Arnold networks using standard telecom nonlinear modules

Pith reviewed 2026-05-21 09:24 UTC · model grok-4.3

classification ⚛️ physics.optics cs.AI
keywords photonic neural networks · Kolmogorov-Arnold networks · semiconductor optical amplifiers · Mach-Zehnder interferometers · optical computing · nonlinear optics · machine learning hardware · telecom components

The pith

Small photonic networks using standard telecom modules achieve high accuracy on nonlinear classification and regression tasks with far fewer parameters than software KANs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces small-scale photonic Kolmogorov-Arnold networks built entirely from commodity telecommunications hardware. Each network connection uses a module containing a Mach-Zehnder interferometer, semiconductor optical amplifier, and variable attenuators to create a trainable four-parameter nonlinear response based on gain saturation and interference. Despite this constrained form, networks with only four modules reach 94.3 percent accuracy on classification benchmarks while seven-module versions attain an R-squared value of 0.986 on regression problems. The work uses a fully differentiable physics model to optimize the optical parameters end-to-end and shows that performance holds under realistic impairments such as limited bit resolution and added noise. This establishes a direct route from simulation to hardware realization of photonic KANs without electronic nonlinear stages.

Core claim

Replacing linear optical meshes with a small number of trainable nonlinear modules—each consisting of a Mach-Zehnder interferometer, semiconductor optical amplifier, and variable optical attenuators—allows fully optical KANs to deliver strong inference performance on classification, regression, and image tasks while using significantly fewer parameters than equivalent software networks. A four-module implementation reaches 94.3 percent accuracy (IQR 90.3–97.4 percent across ten seeds) on nonlinear classification; a seven-module network reaches R² = 0.986 ± 0.015 on six-input regression. The approach remains effective down to 6-bit input resolution and 14 dB signal-to-noise ratio.
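The summary statistics quoted above (a median with an interquartile range across seeds, and an R² score) can be computed as in the following minimal sketch. The helper names `median_iqr` and `r_squared` are hypothetical, the quartile convention (`method="inclusive"`) is an assumption because the paper's convention is not stated here, and the input values are dummy numbers, not results from the paper.

```python
import statistics

def median_iqr(values):
    """Return (median, first quartile, third quartile) of per-seed scores.

    Assumption: inclusive quartiles; the paper may use another convention.
    """
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return med, q1, q3

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Dummy per-seed scores (ten seeds), purely illustrative.
dummy_scores = list(range(1, 11))
median, q1, q3 = median_iqr(dummy_scores)
print(f"median={median}, IQR=[{q1}, {q3}]")

# Dummy regression targets and predictions, purely illustrative.
print(r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

Reporting the median with an IQR rather than a mean with a standard deviation is less sensitive to a single poorly converged seed, which matters when only ten runs are available.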

What carries the argument

The four-parameter optical transfer function obtained from gain saturation in the semiconductor optical amplifier combined with interferometric mixing in the Mach-Zehnder interferometer, which supplies the nonlinear activation for each edge in the KAN graph.
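To make the shape of such a response concrete, here is a minimal sketch of one possible four-parameter module model. It is an illustration under stated assumptions, not the paper's model: the function names, the choice of the four trainable parameters (`att_in`, `g0`, `phi`, `att_ref`), the fixed saturation power `p_sat`, the ideal 50/50 couplers, and the simplified saturation law G = g0 / (1 + P / P_sat) are all hypothetical, and any gain-induced phase shift in the SOA is ignored.

```python
import math

def soa_gain(p_in, g0, p_sat):
    """Simplified saturable gain (assumed form): small-signal gain g0
    compressed as the input power approaches p_sat."""
    return g0 / (1.0 + p_in / p_sat)

def module_response(p_in, att_in, g0, phi, att_ref, p_sat=1.0):
    """Output power of a toy MZI with an SOA in one arm.

    att_in  : linear transmission of the attenuator before the SOA (0..1)
    g0      : small-signal SOA power gain (linear)
    phi     : relative phase between the two arms (radians)
    att_ref : linear transmission of the reference-arm attenuator (0..1)
    """
    p_arm = 0.5 * p_in                        # ideal 50/50 input split
    p_soa_in = att_in * p_arm
    p_soa_out = soa_gain(p_soa_in, g0, p_sat) * p_soa_in
    p_ref = att_ref * p_arm
    # Ideal 50/50 recombination: interference of the two arm fields.
    cross = 2.0 * math.sqrt(p_soa_out) * math.sqrt(p_ref) * math.cos(phi)
    return 0.5 * (p_soa_out + p_ref + cross)

# The response is nonlinear: doubling the input does not double the output.
r1 = module_response(1.0, 1.0, 4.0, 0.0, 1.0)
r2 = module_response(2.0, 1.0, 4.0, 0.0, 1.0)
print(r1, r2, 2.0 * r1)
```

In this toy model the saturating arm and the linear reference arm grow at different rates with input power, so the interference term changes with signal level. That is the mechanism by which the phase and attenuator settings reshape the curve.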

If this is right

  • Networks of only four optical modules reach 94.3 percent median accuracy on nonlinear classification benchmarks.
  • Seven-module networks attain R² = 0.986 on six-input regression tasks.
  • Performance stays high under 6-bit input resolution and 14 dB signal-to-noise ratio.
  • End-to-end optimization via a differentiable physics model removes the need for separate electronic nonlinear stages.
  • The architecture uses substantially fewer parameters than software KAN baselines while remaining fully optical.
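The two impairments named in the list above, limited input resolution and finite signal-to-noise ratio, could be emulated in a simulation roughly as follows. This is a sketch under assumptions: uniform quantisation over a normalised range and additive white Gaussian noise referenced to the mean signal power are choices made here for illustration, and the helper names `quantize` and `add_noise` are hypothetical. The paper's own impairment model is not described in this summary.

```python
import math
import random

def quantize(x, bits, lo=0.0, hi=1.0):
    """Clip x to [lo, hi] and round it onto a uniform grid of 2**bits levels."""
    steps = 2 ** bits - 1
    x = min(max(x, lo), hi)
    return lo + round((x - lo) / (hi - lo) * steps) / steps * (hi - lo)

def add_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise so that the ratio of mean signal power
    to noise variance equals snr_db (assumed definition of SNR)."""
    p_signal = sum(s * s for s in signal) / len(signal)
    sigma = math.sqrt(p_signal / 10 ** (snr_db / 10))
    return [s + rng.gauss(0.0, sigma) for s in signal]

rng = random.Random(0)
clean = [0.5 + 0.5 * math.sin(0.01 * i) for i in range(20000)]
coarse = [quantize(s, bits=6) for s in clean]
noisy = add_noise(coarse, snr_db=14.0, rng=rng)

# Empirical SNR of the noisy signal relative to the quantised one.
p_sig = sum(s * s for s in coarse) / len(coarse)
p_noise = sum((n - s) ** 2 for n, s in zip(noisy, coarse)) / len(coarse)
measured_snr_db = 10 * math.log10(p_sig / p_noise)
print(measured_snr_db)
```

Sweeping `bits` and `snr_db` while re-evaluating a trained network is one way to produce the kind of robustness curves the paper reports.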

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • All-optical inference becomes feasible without repeated optical-to-electrical conversions for small KAN models.
  • The same module design could be tested on other photonic computing primitives beyond KANs.
  • Scaling beyond seven modules would require checking cumulative effects of noise and parameter drift in longer optical paths.
  • Experimental realization on a single photonic integrated circuit would directly test the simulation-to-hardware pathway.

Load-bearing premise

The specific nonlinear response produced by gain saturation and interferometric mixing in each Mach-Zehnder-plus-SOA-plus-attenuator module is expressive enough to support effective KAN-style learning despite having only four adjustable parameters.

What would settle it

A physical four-module network optimized in simulation fails to exceed 80 percent accuracy on the same nonlinear classification task when implemented with real hardware under measured noise levels and component tolerances.

Original abstract

Photonic neural networks promise ultrafast inference, yet most architectures rely on linear optical meshes with electronic nonlinearities, reintroducing optical-electrical-optical bottlenecks. Here we introduce small-scale photonic Kolmogorov-Arnold networks (SSP-KANs) implemented entirely with standard telecommunications components. Each network edge employs a trainable nonlinear module composed of a Mach-Zehnder interferometer, semiconductor optical amplifier, and variable optical attenuators, providing a four-parameter transfer function derived from gain saturation and interferometric mixing. Despite the constrained functional form of these optical nonlinearities, SSP-KANs comprising only a few optical modules achieve strong nonlinear inference performance across classification, regression, and image recognition tasks, approaching software baselines with significantly fewer parameters. A four-module network achieves 94.3% (IQR: 90.3–97.4%, 10 seeds) accuracy on nonlinear classification benchmarks; a seven-module network attains R² = 0.986 ± 0.015 on six-input regression. Performance remains robust under realistic hardware impairments, maintaining high accuracy down to 6-bit input resolution and 14 dB signal-to-noise ratio. By using a fully differentiable physics model for end-to-end optimisation of optical parameters, this work establishes a practical pathway from simulation to experimental demonstration of photonic KANs using commodity telecom hardware.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces small-scale photonic Kolmogorov-Arnold networks (SSP-KANs) implemented with standard telecom components. Each network edge uses a trainable nonlinear module consisting of a Mach-Zehnder interferometer, semiconductor optical amplifier, and variable optical attenuators, yielding a four-parameter transfer function derived from gain saturation and interferometric mixing. Using end-to-end differentiable physics-based optimization, the authors report that networks with only a few modules achieve 94.3% accuracy (IQR 90.3-97.4%, 10 seeds) on nonlinear classification, R² = 0.986 ± 0.015 on six-input regression, and competitive results on image recognition tasks, while maintaining performance under realistic impairments such as 6-bit resolution and 14 dB SNR.

Significance. If the results hold under experimental validation, this work offers a practical route to ultrafast photonic KANs using commodity hardware, avoiding optical-electrical-optical bottlenecks and employing far fewer parameters than conventional photonic neural networks. The fully differentiable physics model for end-to-end optimization and the inclusion of robustness tests with multiple seeds and IQR reporting are clear strengths that enhance reproducibility and experimental feasibility.

major comments (2)
  1. §3 (Nonlinear Module Description): The central claim that the constrained four-parameter optical transfer function supports effective KAN-style learning rests on its functional expressivity. The manuscript does not provide an analysis or visualization of the function family realizable by the MZI+SOA+attenuator module (e.g., range of monotonicity, number of inflection points, or ability to approximate non-monotonic shapes), which is load-bearing for generalizing the 94.3% accuracy and R²=0.986 results beyond the tested benchmarks.
  2. Results section and performance tables: Concrete metrics are reported with IQR and seed counts, yet the text provides no details on simulation validation against physical hardware measurements, error propagation analysis, or explicit baseline comparisons to software KANs or other photonic architectures. This omission undermines verification of the claim that performance approaches software baselines with significantly fewer parameters.
minor comments (2)
  1. Abstract: The phrase 'six-input regression' lacks a brief description of the underlying task or dataset, which would improve clarity for readers.
  2. Figure captions (throughout): Adding explicit labels for the network topologies (e.g., number of modules and connectivity) used in each experiment would aid reproducibility.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive and detailed review. We address each major comment below and indicate the revisions made to the manuscript.

Point-by-point responses
  1. Referee: §3 (Nonlinear Module Description): The central claim that the constrained four-parameter optical transfer function supports effective KAN-style learning rests on its functional expressivity. The manuscript does not provide an analysis or visualization of the function family realizable by the MZI+SOA+attenuator module (e.g., range of monotonicity, number of inflection points, or ability to approximate non-monotonic shapes), which is load-bearing for generalizing the 94.3% accuracy and R²=0.986 results beyond the tested benchmarks.

    Authors: We agree that an explicit analysis of the realizable function family would strengthen the central claim. In the revised manuscript we have added a new subsection to §3 together with a supplementary figure that plots the four-parameter transfer function over the full parameter range. The plots show that the module produces strictly monotonic responses for certain parameter regimes and non-monotonic responses with a single inflection point when the MZI phase shift interacts with SOA saturation; the family remains continuous and differentiable, consistent with the end-to-end optimisation used in the work. These additions directly support the reported benchmark performance. revision: yes

  2. Referee: Results section and performance tables: Concrete metrics are reported with IQR and seed counts, yet the text provides no details on simulation validation against physical hardware measurements, error propagation analysis, or explicit baseline comparisons to software KANs or other photonic architectures. This omission undermines verification of the claim that performance approaches software baselines with significantly fewer parameters.

    Authors: The study is entirely simulation-based using a differentiable physics model of the telecom components. We have now inserted an explicit comparison table in the Results section that reports accuracy and parameter count against both software KAN implementations and representative photonic neural-network architectures from the literature, confirming that the photonic KANs reach within a few percent of software baselines while using orders-of-magnitude fewer trainable parameters. The existing robustness sweeps under 6-bit quantisation and 14 dB SNR already constitute a model-level error-propagation study; we have clarified this point in the text. Direct experimental measurements on fabricated hardware are outside the scope of the present simulation-focused manuscript. revision: partial

standing simulated objections not resolved
  • Direct experimental validation against physical hardware measurements

Circularity Check

0 steps flagged

No significant circularity; performance arises from independent optimization under physics-derived constraints

full rationale

The paper models the nonlinear module via explicit physical equations for gain saturation in the SOA and phase shifts in the MZI, yielding a four-parameter transfer function that is then optimized end-to-end in a differentiable simulator. Reported accuracies (94.3 %) and R² values (0.986) are outputs of this numerical training process on benchmark tasks, not quantities that reduce by construction to the input parameters or to any self-citation. No load-bearing uniqueness theorem, ansatz smuggling, or renaming of known results is present; the functional form is fixed by telecom-component physics rather than fitted post-hoc to the target metrics. The chain therefore remains self-contained and externally falsifiable against software KAN baselines.
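The end-to-end optimisation described above can be illustrated with a toy parameter fit. This is only a sketch under assumptions: the two-parameter module model (attenuators fixed at unity, saturation law G = g0 / (1 + P / P_sat)) is hypothetical, the targets are synthetic values generated from the same toy model, and central finite differences stand in for the automatic differentiation a differentiable simulator would provide.

```python
import math

def module_response(p_in, g0, phi, p_sat=1.0):
    """Toy MZI+SOA edge with both attenuators fixed at unity (hypothetical)."""
    p_arm = 0.5 * p_in
    p_soa = g0 / (1.0 + p_arm / p_sat) * p_arm
    return 0.5 * (p_soa + p_arm + 2.0 * math.sqrt(p_soa * p_arm) * math.cos(phi))

def mse(params, inputs, targets):
    """Mean squared error between the module output and the targets."""
    g0, phi = params
    return sum((module_response(p, g0, phi) - t) ** 2
               for p, t in zip(inputs, targets)) / len(inputs)

def numeric_grad(params, inputs, targets, eps=1e-6):
    """Central finite-difference gradient (stand-in for autodiff)."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        dn = list(params)
        up[i] += eps
        dn[i] -= eps
        grad.append((mse(up, inputs, targets) - mse(dn, inputs, targets)) / (2 * eps))
    return grad

def fit(params, inputs, targets, lr=0.05, steps=400):
    """Plain gradient descent on the module parameters."""
    params = list(params)
    for _ in range(steps):
        grad = numeric_grad(params, inputs, targets)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params

# Synthetic targets from the toy model itself with g0 = 4.0, phi = 1.0.
inputs = [0.1 * k for k in range(21)]
targets = [module_response(p, 4.0, 1.0) for p in inputs]

start = [2.0, 0.5]
fitted = fit(start, inputs, targets)
initial_loss = mse(start, inputs, targets)
final_loss = mse(fitted, inputs, targets)
print(initial_loss, final_loss, fitted)
```

Because the loss is driven only by the mismatch between simulated outputs and task targets, the fitted parameters are a result of the optimisation rather than an input to it, which is the point the circularity check makes.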

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Central claim rests on the domain assumption that the constrained optical transfer function is expressive enough for KAN learning and that a differentiable physics model accurately captures hardware behavior for training.

free parameters (1)
  • four-parameter transfer function coefficients
    Trainable parameters of the optical nonlinearity derived from gain saturation and interferometric mixing; values are optimized during end-to-end training.
axioms (1)
  • domain assumption: The combined Mach-Zehnder, SOA, and attenuator module produces a differentiable nonlinear response suitable for gradient-based optimization.
    Invoked to enable the fully differentiable physics model for training optical parameters.

pith-pipeline@v0.9.0 · 5781 in / 1278 out tokens · 57089 ms · 2026-05-21T09:24:44.486371+00:00 · methodology

