arxiv: 2604.26377 · v1 · submitted 2026-04-29 · 💻 cs.CE · cs.ET

Accelerating Sparse Linear Solvers with an Optical Laser Processing Unit

Dan Gluck , Yotam Mimran , Andrey Karenskih , Talya Vaknin , Omri Wolf , Ruti Ben-Shlomi , Johannes Gebert

Pith reviewed 2026-05-07 12:35 UTC · model grok-4.3

keywords: sparse linear systems · optical analog computing · laser processing unit · Krylov subspace methods · time-to-solution · SuiteSparse matrices · hybrid computing

The pith

An optical laser processing unit can deliver lower time-to-solution than GPU-based methods for certain sparse linear systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper maps general linear systems onto the dynamics of an optical Laser Processing Unit, where the steady-state phases of coupled lasers solve Ax = b. It benchmarks an emulator of this unit against standard Krylov subspace solvers running on GPUs using matrices from the SuiteSparse collection. For selected multi-banded sparse problems, the optical approach shows faster convergence to the solution. A reader would care because linear system solves dominate runtime in many engineering simulations, and analog optical hardware could provide speed and energy gains if the mapping holds. This points toward hybrid computing where optical units handle the core iteration.

Core claim

The LPU encodes the linear system into coupled laser dynamics within an optical cavity, with the solution given by the steady-state phases of the optical fields. Using an emulator, benchmarks on sparse matrices demonstrate significantly lower time-to-solution compared to CG, GMRES, BiCGSTAB and other methods on a modern GPU, for selected problem classes. This establishes the potential of optical analog computing to accelerate iterative linear solvers, particularly for structured or repeatedly solved systems, while noting limitations in scaling and precision.
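For orientation, the kind of digital baseline named above can be sketched in a few lines. The following is a minimal, unpreconditioned conjugate gradient (CG) loop in NumPy, run on a small multi-banded symmetric positive definite matrix. It is only an illustration of the method class: the test matrix, its size, and the tolerance are assumptions made here, and this is not the GPU implementation or the SuiteSparse data used in the paper.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Textbook CG for a symmetric positive definite A.

    Returns the approximate solution and the number of iterations used.
    """
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x                  # initial residual
    p = r.copy()                   # initial search direction
    rs_old = r @ r
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.sqrt(rs_old) <= tol * b_norm:
            return x, k
        Ap = A @ p
        alpha = rs_old / (p @ Ap)  # step length along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x, max_iter

# Illustrative multi-banded SPD matrix (assumed here, not taken from SuiteSparse):
# diagonal 5, with -1 on the bands at offsets +/-1 and +/-5.
n = 40
A = 5.0 * np.eye(n)
for k in (1, 5):
    A -= np.eye(n, k=k) + np.eye(n, k=-k)
b = np.ones(n)

x, iters = conjugate_gradient(A, b)
print("iterations:", iters, "relative residual:", np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```

Each CG iteration costs one matrix-vector product, so iteration count times per-iteration cost is a rough proxy for the digital time-to-solution that the emulator results are compared against.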

What carries the argument

The Laser Processing Unit (LPU), which maps the matrix A and vector b of the linear system Ax=b onto the coupling and dynamics of lasers in an optical cavity so that the phases at steady state yield the solution vector.
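The idea that a physical system's steady state can encode the solution of Ax = b can be illustrated with a much simpler stand-in. The sketch below integrates the relaxation dynamics dx/dt = b - Ax with forward Euler until the residual vanishes. This is an assumed toy model chosen for illustration: it is not the coupled-laser phase dynamics of the LPU, it requires a symmetric positive definite A, and the function name and parameters are hypothetical.

```python
import numpy as np

def relax_to_steady_state(A, b, dt=None, tol=1e-10, max_steps=100_000):
    """Integrate dx/dt = b - A x with forward Euler until the residual is small.

    Assumes A is symmetric positive definite, so the flow has a unique stable
    fixed point x* with A x* = b. This is a toy analogue, not the LPU model.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    if dt is None:
        # Forward Euler is stable for dt < 2 / lambda_max; use half that bound.
        dt = 1.0 / np.linalg.eigvalsh(A)[-1]
    x = np.zeros_like(b)
    for step in range(max_steps):
        residual = b - A @ x
        if np.linalg.norm(residual) <= tol * np.linalg.norm(b):
            return x, step         # number of Euler updates performed
        x = x + dt * residual
    return x, max_steps

# Small tridiagonal (banded) test system, chosen only for illustration.
n = 8
A = 2.0 * np.eye(n) - 0.5 * np.eye(n, k=1) - 0.5 * np.eye(n, k=-1)
b = np.ones(n)

x, steps = relax_to_steady_state(A, b)
print("Euler steps to steady state:", steps)
```

In this toy model the settling time is governed by the smallest eigenvalue of A, which hints at why conditioning and settling-time analysis matter for any steady-state solver.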

If this is right

Hybrid optical-digital systems could reduce latency for linear solvers in HPC applications.
Structured sparse matrices benefit most from the inherent parallelism of the optical cavity.
Repeatedly solved systems would gain from the analog computation without digital iteration overhead.
Energy efficiency improves due to the low-power nature of optical processing for these workloads.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar mappings might apply to other physical systems where optimization corresponds to steady-state equilibria.
Error handling for optical precision limits could involve post-processing with digital refinement steps.
Scaling the LPU to larger systems may require new cavity designs or modular optical arrays.
Adoption in practice would depend on integrating the LPU with existing software frameworks for sparse solvers.
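The digital refinement idea in the list above can be sketched with classical iterative refinement: a low-precision solver supplies corrections, and a full-precision residual drives them. In the sketch below the low-precision step is an exact solve whose output is quantized to a fixed number of levels. That quantization is an assumed stand-in for limited analog precision, not a model of the LPU, and both function names are hypothetical.

```python
import numpy as np

def coarse_solve(A, r, levels=256):
    """Stand-in for a limited-precision solver: exact solve, then quantize.

    The quantization to `levels` steps relative to the largest entry is an
    assumption for illustration, not a model of optical hardware.
    """
    y = np.linalg.solve(A, r)
    scale = np.max(np.abs(y))
    if scale == 0.0:
        return y
    return np.round(y / scale * levels) / levels * scale

def refine(A, b, coarse, iters=10):
    """Iterative refinement: correct x using full-precision residuals."""
    x = np.zeros_like(b, dtype=float)
    history = []
    for _ in range(iters):
        r = b - A @ x                                   # residual in float64
        history.append(np.linalg.norm(r) / np.linalg.norm(b))
        x = x + coarse(A, r)                            # low-precision correction
    return x, history

n = 8
A = 2.0 * np.eye(n) - 0.5 * np.eye(n, k=1) - 0.5 * np.eye(n, k=-1)
b = np.ones(n)

x, history = refine(A, b, coarse_solve)
print("relative residual before each correction:", history[:4])
```

Because each correction is computed relative to the current residual, the error shrinks by roughly the quantization ratio per pass, so a handful of digital passes can recover double-precision accuracy from a coarse solver in this idealized setting.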

Load-bearing premise

The software emulator of the LPU must accurately reproduce the physical dynamics, noise, and precision limits of the actual optical hardware for the performance claims to hold.

What would settle it

Direct execution of the same sparse matrices on the physical Laser Processing Unit hardware, comparing measured time-to-solution and solution accuracy against GPU baselines.

Original abstract

Solving large, sparse linear systems is a fundamental workload in scientific computing and engineering simulations, often dominating runtime and energy consumption in high-performance computing (HPC) applications. In this work, we explore an alternative computing paradigm based on analog optical processing, implemented through the Laser Processing Unit (LPU). The LPU encodes linear systems into the dynamics of coupled lasers within an optical cavity, where the steady-state phases of the optical fields correspond to the solution of $Ax=b$. We present a mapping of general linear systems, both dense and sparse, onto the LPU architecture and evaluate its performance using representative matrices from the SuiteSparse collection. Using an LPU emulator, we benchmark convergence behavior and time-to-solution for sparse, multi-banded matrices against established Krylov subspace methods (CG, GMRES, BiCGSTAB, and others) executed on a modern GPU platform. Our results demonstrate that the LPU will achieve significantly lower time-to-solution for selected problem classes, highlighting the potential of optical analog computing for accelerating iterative linear solvers. These findings suggest that optical processors such as the LPU will be able to serve as accelerators for linear systems, in particular structured and/or repeatedly solved, offering advantages in latency, parallelism, and energy efficiency. We discuss current limitations, including scaling constraints and precision considerations, and outline directions toward hybrid optical-digital computing systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper maps sparse linear systems to LPU laser phases and shows emulator speedups over GPU Krylov methods on some SuiteSparse cases, but the gains rest entirely on unvalidated simulation.

This paper maps general linear systems, including sparse ones, onto the phase dynamics of coupled lasers in an LPU and benchmarks an emulator against GPU Krylov solvers like CG and GMRES on SuiteSparse matrices. The key new piece is the encoding that turns the steady-state optical fields into the solution vector, with reported gains in time-to-solution for selected multi-banded problems. They handle both dense and sparse cases and discuss how the optical parallelism could cut latency and energy in HPC workloads that solve the same systems repeatedly. The emulator results show convergence curves and timing comparisons, which is a step beyond pure theory.

The main weakness is that all the speed claims come from the software emulator without any cross-check on actual hardware. We don't see measurements of phase noise, finite precision effects, or cavity settling times, so it's unclear how well the sim captures real behavior. There's also no detailed derivation of the mapping or error bounds, and the choice of matrices isn't justified beyond being representative. They note scaling and precision limits but don't quantify them much.

This work is aimed at people exploring analog optical computing for scientific simulations. A reader interested in new hardware paradigms for linear algebra would get value from the concrete mapping and benchmarks, even if they need to take the performance numbers with a grain of salt until hardware data arrives.

I would send it out for peer review. The idea is solid enough to warrant referee input on the mapping and what validation is needed next.

Referee Report

3 major / 2 minor

Summary. The paper proposes a mapping of general (dense and sparse) linear systems Ax=b onto the steady-state phase dynamics of coupled lasers in an optical Laser Processing Unit (LPU). It evaluates the approach via a software emulator on selected SuiteSparse matrices, benchmarking time-to-solution against GPU implementations of Krylov methods (CG, GMRES, BiCGSTAB and others) and claims significantly lower time-to-solution for certain structured sparse problem classes, while discussing limitations and hybrid optical-digital directions.

Significance. If the emulator faithfully captures physical effects and the mapping preserves accuracy, the work could demonstrate a viable path for analog optical accelerators in scientific computing, offering potential gains in latency, parallelism, and energy efficiency for repeatedly solved or structured systems. It contributes concrete benchmarks and a concrete encoding strategy to the emerging area of optical linear algebra.

major comments (3)

[§4] §4 (Emulator benchmarks): All reported convergence curves and time-to-solution comparisons are generated exclusively by a software emulator of the LPU cavity dynamics. No hardware measurements, cross-validation against physical LPU runs, or quantitative error analysis (phase noise, finite precision, settling time) are provided to establish that the emulator reproduces the relevant physical limits for the chosen sparse mappings.
[§3] Mapping description (abstract and §3): The paper states that general sparse systems are encoded into laser couplings whose steady-state phases solve Ax=b, but supplies neither an explicit derivation of the encoding, nor convergence guarantees, nor analysis of how sparsity is preserved or how off-diagonal fill-in is avoided. This leaves the central claim that the LPU achieves lower time-to-solution for “selected problem classes” without a demonstrated mechanism that works for arbitrary sparse structure.
[Results] Table/figure results (e.g., convergence plots): The reported speedups are shown only for a curated subset of SuiteSparse matrices; no systematic study of matrix properties (condition number, bandwidth, eigenvalue distribution) or ablation of the mapping on non-selected matrices is given, making it impossible to assess whether the performance advantage generalizes or reflects problem selection.
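The matrix properties named in the third major comment are cheap to tabulate for small cases. The sketch below computes size, non-zero count, density, bandwidth, symmetry, and the 2-norm condition number for a dense NumPy array. The function name and the example matrix are assumptions for illustration; for SuiteSparse-scale matrices a dense condition number is impractical and a sparse estimator would be needed instead.

```python
import numpy as np

def matrix_profile(A):
    """Summarize structural and spectral properties of a small dense matrix."""
    A = np.asarray(A, dtype=float)
    rows, cols = np.nonzero(A)
    # Bandwidth: largest distance of any non-zero entry from the main diagonal.
    bandwidth = int(np.max(np.abs(rows - cols))) if rows.size else 0
    return {
        "n": A.shape[0],
        "nnz": int(rows.size),
        "density": rows.size / A.size,
        "bandwidth": bandwidth,
        "symmetric": bool(np.allclose(A, A.T)),
        "cond_2": float(np.linalg.cond(A)),   # ratio of extreme singular values
    }

# Illustrative multi-banded matrix (assumed here): bands at offsets 0, +/-1, +/-4.
n = 12
A = 5.0 * np.eye(n)
for k in (1, 4):
    A -= np.eye(n, k=k) + np.eye(n, k=-k)

profile = matrix_profile(A)
for key, value in profile.items():
    print(f"{key}: {value}")
```

Reporting such a profile next to each speedup would let a reader see whether the advantage tracks bandwidth, conditioning, or neither.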

minor comments (2)

[Abstract] Abstract: the phrasing “the LPU will achieve” should be qualified to reflect that all data come from an emulator rather than physical hardware.
[§3] Notation: the mapping from matrix entries to laser coupling strengths and the definition of the steady-state phase vector could be stated more explicitly with an additional equation or pseudocode block.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the thorough review and insightful comments on our manuscript. We address each of the major comments in detail below and outline the revisions we plan to make.

Referee: [§4] §4 (Emulator benchmarks): All reported convergence curves and time-to-solution comparisons are generated exclusively by a software emulator of the LPU cavity dynamics. No hardware measurements, cross-validation against physical LPU runs, or quantitative error analysis (phase noise, finite precision, settling time) are provided to establish that the emulator reproduces the relevant physical limits for the chosen sparse mappings.

Authors: We agree that all benchmarks are performed using a software emulator of the LPU. The emulator is based on numerical integration of the coupled laser equations to simulate the phase dynamics. In the revision, we will add a dedicated subsection in §4 providing quantitative error analysis, including bounds on phase noise impact, finite precision effects from the optical components, and estimates of settling time derived from physical parameters of laser cavities. We will also explicitly state that direct hardware validation is beyond the scope of the current work and represents a key direction for future research. This will better establish the emulator's relevance to physical limits.

revision: partial

Referee: [§3] Mapping description (abstract and §3): The paper states that general sparse systems are encoded into laser couplings whose steady-state phases solve Ax=b, but supplies neither an explicit derivation of the encoding, nor convergence guarantees, nor analysis of how sparsity is preserved or how off-diagonal fill-in is avoided. This leaves the central claim that the LPU achieves lower time-to-solution for “selected problem classes” without a demonstrated mechanism that works for arbitrary sparse structure.

Authors: In the manuscript, §3 outlines the mapping by associating the off-diagonal elements of A with the coupling strengths between lasers and the right-hand side b with the injection phases or amplitudes. The steady-state phases are shown to satisfy Ax = b through the equilibrium condition of the phase-locked laser array. We will provide a more detailed derivation, starting from the complex amplitude equations for the lasers and arriving at the linear system, in an expanded §3 or a new appendix. Regarding sparsity, the mapping only activates couplings corresponding to non-zero A_ij, which inherently avoids fill-in. We will include this analysis. Although no formal convergence proof is given, the empirical results support the approach for the targeted problems, and we will add a discussion on the absence of general guarantees.

revision: yes

Referee: [Results] Table/figure results (e.g., convergence plots): The reported speedups are shown only for a curated subset of SuiteSparse matrices; no systematic study of matrix properties (condition number, bandwidth, eigenvalue distribution) or ablation of the mapping on non-selected matrices is given, making it impossible to assess whether the performance advantage generalizes or reflects problem selection.

Authors: The selected matrices from SuiteSparse were chosen to illustrate the potential for structured sparse problems, such as those with banded or multi-banded patterns that align well with the LPU's parallel coupling architecture. To address the concern, the revised manuscript will incorporate results for additional matrices spanning a range of properties, including varying condition numbers, bandwidths, and eigenvalue spectra. We will add a table or figure analyzing how these properties correlate with the observed speedups and include cases where the mapping does not yield advantages, providing an ablation-like study. This will clarify the scope of the performance claims.

revision: yes
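The sparsity point in the second response (one coupling per non-zero A_ij, hence no fill-in) can be made concrete with a small count. The sketch below builds a hypothetical coupling list from the non-zero off-diagonal entries of an arrow-shaped matrix and compares its size with the number of non-zeros produced by Gaussian elimination without pivoting. Both functions are assumptions for illustration and do not reproduce the paper's encoding. Note also that fill-in is a property of direct factorizations; the Krylov baselines in the paper work with matrix-vector products and do not create fill-in either.

```python
import numpy as np

def coupling_list(A):
    """Hypothetical encoding: one (i, j, weight) coupling per non-zero off-diagonal entry."""
    rows, cols = np.nonzero(A)
    return [(int(i), int(j), float(A[i, j])) for i, j in zip(rows, cols) if i != j]

def lu_nonzeros(A, eps=1e-12):
    """Count non-zeros in the combined L and U factors (no pivoting)."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    for k in range(n - 1):
        U[k + 1:, k] /= U[k, k]                                     # multipliers, stored in place
        U[k + 1:, k + 1:] -= np.outer(U[k + 1:, k], U[k, k + 1:])   # Schur complement update
    return int(np.sum(np.abs(U) > eps))

# Arrow matrix: dense first row and column, diagonal elsewhere.
n = 6
A = 4.0 * np.eye(n)
A[0, 1:] = 1.0
A[1:, 0] = 1.0
A[0, 0] = 10.0

couplings = coupling_list(A)
factor_nnz = lu_nonzeros(A)
print("couplings:", len(couplings), "non-zeros in LU factors:", factor_nnz)
```

For this ordering the factorization fills the whole matrix, while the coupling count stays equal to the number of off-diagonal non-zeros, which is the sense in which an entry-by-entry encoding preserves sparsity.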

standing simulated objections not resolved

Providing hardware measurements and cross-validation with physical LPU hardware, as the study is conducted entirely via software emulation.

Circularity Check

0 steps flagged

No circularity in derivation; performance claims rest on external emulator benchmarks

The paper defines a mapping from Ax=b to steady-state laser phases in the LPU and then measures time-to-solution via a software emulator run on SuiteSparse matrices, comparing directly to GPU Krylov solvers (CG, GMRES, etc.). No step reduces by construction to its own inputs: the mapping is an explicit encoding whose correctness is asserted from the physics model rather than tautologically assumed, the emulator outputs are not fitted parameters renamed as predictions, and no self-citation chain or uniqueness theorem is invoked to force the result. The derivation chain is therefore self-contained against the stated external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the LPU mapping and emulator are treated as given without stated assumptions or fitted constants.

pith-pipeline@v0.9.0 · 5565 in / 1176 out tokens · 45611 ms · 2026-05-07T12:35:58.412787+00:00 · methodology

Reference graph

Works this paper leans on

11 extracted references · 10 canonical work pages

[1]

Belostotski, Leonid, Asif Uddin, Arjuna Madanayake, and Soumyajit Mandal. 2025. A Survey of Analog Computing for Domain-Specific Accelerators. Electronics 14, no. 16: 3159. https://doi.org/10.3390/electronics14163159

work page doi:10.3390/electronics14163159 2025
[2]

Y. Huang, N. Guo, M. Seok, Y. Tsividis and S. Sethumadhavan, 2017. Analog Computing in a Modern Context: A Linear Algebra Accelerator Case Study. IEEE Micro, vol. 37, no. 03, pp. 30-38 (May.-Jun. 2017), doi: 10.1109/MM.2017.55

work page doi:10.1109/mm.2017.55 2017
[3]

Yogesh Dilip Save, H. Narayanan, and Sachin B. Patkar. 2011. Solution of Partial Differential Equations by electrical analogy. J Comput Sci 2, 1 (March 2011), 18–30. https://doi.org/10.1016/J.JOCS.2010.12.006

work page doi:10.1016/j.jocs.2010.12.006 2011
[4]

Chene Tradonsky, Omri Wolf, Talya Vaknin, Dan Gluck, and Dov Furman. 2025. Solving Partial Differential Equations on an Analog, Optical Platform. Invited Paper. In Proceedings of the 22nd ACM International Conference on Computing Frontiers: Workshops and Special Sessions (CF '25 Companion). Association for Computing Machinery, New York, NY, USA, 130–13...

work page doi:10.1145/3706594.3729406 2025
[5]

Chene Tradonsky, Igor Gershenzon, Vishwa Pal, Ronen Chriki, Asher A. Friesem, Oren Raz, and Nir Davidson. 2019. Rapid laser solver for the phase retrieval problem. Sci Adv 5, 10 (October 2019), eaax4530. https://doi.org/10.1126/sciadv.aax4530

work page doi:10.1126/sciadv.aax4530 2019
[6]

C. Tradonsky, R. Chriki, G. Barach, V. Pal, A.A. Friesem, and N. Davidson. 2017. Digital degenerate cavity laser. In Optics InfoBase Conference Papers, 2017. https://doi.org/10.1364/FIO.2017.FTu4C.4

work page doi:10.1364/fio.2017.ftu4c.4 2017
[7]

Yao Xiao, Zhitao Cheng, Shengping Liu, Yicheng Zhang, He Tang, Yong Tang. 2024. PhotoSolver: A bidirectional photonic solver for systems of linear equations. Optics and Lasers in Engineering, Volume 183, 2024, 108524, ISSN 0143-8166, https://doi.org/10.1016/j.optlaseng.2024.108524

work page doi:10.1016/j.optlaseng.2024.108524 2024
[8]

Timothy A. Davis and Yifan Hu. 2011. The University of Florida Sparse Matrix Collection. ACM Transactions on Mathematical Software 38, 1, Article 1 (December 2011), 25 pages. DOI: https://doi.org/10.1145/2049662.2049663

work page doi:10.1145/2049662.2049663 2011
[9]

Scott P. Kolodziej, Mohsen Aznaveh, Matthew Bullock, Jarrett David, Timothy A. Davis, Matthew Henderson, Yifan Hu, and Read Sandstrom. 2019. The SuiteSparse Matrix Collection Website Interface. Journal of Open Source Software 4, 35 (March 2019), 1244-1248. DOI: https://doi.org/10.21105/joss.01244

work page doi:10.21105/joss.01244 2019
[10]

Hartwig Anzt, Terry Cojean, Goran Flegar, Fritz Göbel, Thomas Grützmacher, Pratik Nayak, Tobias Ribizel, Yuhsiang Mike Tsai, and Enrique S. Quintana-Ortí. 2022. Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing. ACM Trans. Math. Softw. 48, 1, Article 2 (March 2022), 33 pages. https://doi.org/10.1145/3480935

work page doi:10.1145/3480935 2022