arxiv: 2605.13224 · v1 · submitted 2026-05-13 · ⚛️ physics.optics

On-chip 1 TOPS Hyperdimensional Photonic Tensor Core using a WDM Silicon Photonic Coherent Crossbar

S. Kovaios, I. Roumpos, A. Tsakyridis, M. Moralis-Pegios, D. Lazovsky, K. Vyrsokinos, N. Pleros

Pith reviewed 2026-05-14 18:52 UTC · model grok-4.3

classification ⚛️ physics.optics

keywords photonic tensor core · silicon photonics · wavelength division multiplexing · hyperdimensional computing · crossbar architecture · optical computing · tensor multiplication · AI accelerator

The pith

A silicon photonic crossbar achieves 0.96 TOPS for hyperdimensional tensor computations using time-space-wavelength multiplexing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates an experimental on-chip photonic tensor core that performs tensor-vector multiplications by serializing operations over time while distributing workload across spatial and wavelength channels in a silicon photonic crossbar. This TSWDM architecture enables compact integration with electroabsorption modulators and achieves 0.96 TOPS throughput with 3.9 percent average error in a 4x2x1 unit. When tested on the Iris dataset classification task it reaches 93.3 percent accuracy at data rates up to 30 GBd and remains functional at 60 GBd. The authors show that adding wavelength division multiplexing reduces required laser power and outline a path toward POPS-scale photonic accelerators.

Core claim

The authors demonstrate a 4-channel 2-input TSWDM Xbar incorporating 56 GHz EAMs and 4-channel multiplexing stages that functions as a 4x2x1 tensor-vector multiplication unit with 3.9 percent average error at 0.96 TOPS throughput; the same hardware delivers 93.3 percent accuracy on Iris classification at 4x10 to 4x30 GBd and 83.3 percent at 4x60 GBd, while WDM integration in the SDM architecture lowers operating laser power and supports scaling toward POPS-regime accelerators.
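The 0.96 TOPS figure is consistent with a simple operation count, sketched below. The sketch assumes the common convention of two operations (one multiply, one add) per multiply-accumulate, and that each node of the 4x2x1 crossbar completes one multiply-accumulate per symbol at 60 GBd. The source does not spell out this calculation, and the function and parameter names are illustrative.

```python
def throughput_tops(wavelengths, inputs, outputs, baud_rate_gbd, ops_per_mac=2):
    """Peak throughput in TOPS for an L x M x N crossbar.

    Assumed counting convention: one MAC per crossbar node per symbol,
    and ops_per_mac operations per MAC (2 = multiply + add).
    """
    macs_per_symbol = wavelengths * inputs * outputs
    ops_per_second = macs_per_symbol * ops_per_mac * baud_rate_gbd * 1e9
    return ops_per_second / 1e12


# 4 wavelength channels, 2 inputs, 1 output, 60 GBd per channel
print(throughput_tops(4, 2, 1, 60))
```

Under these assumptions the 4x60 GBd operating point gives 0.96 TOPS, and the same count at 4x30 GBd gives 0.48 TOPS.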

What carries the argument

The time-space-wavelength multiplexed (TSWDM) silicon photonic coherent crossbar, which unfolds multiply-accumulate operations over the time domain while distributing computation across spatial and wavelength channels.
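The unfolding idea can be illustrated numerically with a minimal sketch. It assumes each wavelength channel holds its own weight row and that all channels see the same input vector, fed a few elements per time slot; which operand each wavelength actually carries on the chip is not stated in this summary, so that mapping, and the names `tswdm_tvm` and `direct_tvm`, are hypothetical. The sketch only checks that accumulating partial products over time slots reproduces the full tensor-vector product.

```python
def tswdm_tvm(weights, x, inputs_per_step):
    """Time-unfolded tensor-vector product (illustrative, not the authors' signal chain).

    weights[l][n] is a length-K weight row for wavelength channel l and output n;
    x is the length-K input vector, fed inputs_per_step elements per time slot.
    """
    k = len(x)
    if k % inputs_per_step != 0:
        raise ValueError("input length must be a multiple of inputs_per_step")
    # One accumulator per (wavelength channel, output) pair.
    acc = [[0 for _ in channel] for channel in weights]
    for t in range(0, k, inputs_per_step):         # time slots (serial)
        x_slot = x[t:t + inputs_per_step]
        for l, channel in enumerate(weights):      # wavelength channels (parallel on chip)
            for n, row in enumerate(channel):      # spatial outputs (parallel on chip)
                w_slot = row[t:t + inputs_per_step]
                acc[l][n] += sum(w * v for w, v in zip(w_slot, x_slot))
    return acc


def direct_tvm(weights, x):
    """Reference result: full dot product for every channel and output."""
    return [[sum(w * v for w, v in zip(row, x)) for row in channel]
            for channel in weights]


# Toy 4x2x1 case: 4 channels, 2 inputs per time slot, 1 output, K = 6 inputs.
weights = [[[c + i for i in range(6)]] for c in range(4)]
x = [1, 0, 2, -1, 3, 1]
assert tswdm_tvm(weights, x, 2) == direct_tvm(weights, x)
```

Integer toy values are used so the comparison is exact; with analog hardware each time slot would add its own error to the accumulator.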

Load-bearing premise

That the 4x2x1 unit performance and error rates will hold when scaling to larger arrays and higher channel counts without significant additional noise, crosstalk, or power penalties.

What would settle it

Fabricate and measure a scaled prototype with at least 8 spatial channels and 4 wavelength channels while recording whether average multiplication error stays below 5 percent at the projected higher data rates.

Figures

Figures reproduced from arXiv: 2605.13224 by A. Tsakyridis, D. Lazovsky, I. Roumpos, K. Vyrsokinos, M. Moralis-Pegios, N. Pleros, S. Kovaios.

**Figure 2.** (a) The L×M×N TSWDM Xbar layout, supporting time, space and wavelength division multiplexing. (b) Operation of a single TSDM Xbar for arbitrary MVM computations. (c) Operation of the TSWDM Xbar as a TVM engine.

**Figure 3.** (a) The 2×1 TWDM Xbar. (b) Fabricated SiPho chi…

**Figure 5.** MVM operations under injection of four different channels, depicting an average error of 3.9%.

**Figure 7.** Bit resolution and single laser power for optimal performance.

Original abstract

We demonstrate an on-chip 0.96 TOPS hyperdimensional photonic tensor core by utilizing a time-space-wavelength multiplexed silicon photonic Crossbar (Xbar). The novel architecture relies on serializing the large matrix-vector or tensor-vector products by unfolding multiply and accumulation operations over time domain, while simultaneously distributing the computational workload over different spatial and wavelength channels. We experimentally demonstrate the operation of a 4-channel 2-input TSWDM Xbar that incorporates 56 GHz electroabsorption modulators (EAMs) and 4-channel integrated multiplexing stages. Its successful operation as a 4x2x1 tensor-vector multiplication unit demonstrated an average error of 3.9%. Its performance as a photonic AI accelerator was also evaluated in the classification task of the Iris dataset, presenting experimental accuracies of 93.3% at data rates between 4x10 and 4x30 GBd, reaching 83.3% when the data rate increases to 4x60 GBd. Finally, we discuss the TSWDM Xbar scalability potential, revealing that the inclusion of a WDM scheme in the SDM architecture reduces the operating laser power, feasibly boosting the potential of constructing photonic accelerators with computational throughput in the POPS regime.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper delivers measured results on a working 4x2x1 TSWDM photonic crossbar for hyperdimensional computing at 0.96 TOPS, but the path to POPS scaling rests on untested assumptions about noise and crosstalk.

The paper demonstrates a functional 4x2x1 silicon photonic crossbar using time-space-wavelength multiplexing for hyperdimensional tensor operations, hitting 0.96 TOPS with 3.9% average error and 93.3% accuracy on the Iris dataset at moderate data rates.

What stands out is the experimental work on actual fabricated hardware. They integrated 56 GHz EAMs and multiplexing stages, then tested the unit for matrix-vector multiplications and ran a classification task. The numbers are concrete and come from measurements, not simulations.

The architecture serializes large operations over time while using spatial and wavelength channels in parallel. This lets them handle higher effective dimensions without a huge physical array. The discussion on using WDM to cut laser power and reach POPS throughput is interesting on paper.

The weak part is the scaling argument. They show results for 4 channels and note accuracy falling to 83.3% at 60 GBd, but there's no data on crosstalk or noise buildup when adding more channels or inputs. The claim that this path leads to practical POPS accelerators rests on assumptions about how well the small unit extrapolates, and those aren't backed by additional tests.

This is useful for people working on photonic AI hardware who want to see a real prototype for hyperdimensional computing. It engages with the literature on silicon photonics and crossbars without obvious contradictions in the reported results. I'd bring it to a reading group for the experimental details and send it out for peer review. The measurements give it enough substance to warrant referee input on the scalability claims.

Referee Report

2 major / 2 minor

Summary. The manuscript experimentally demonstrates a 4-channel 2-input time-space-wavelength multiplexed (TSWDM) silicon photonic crossbar as a hyperdimensional tensor core, achieving 0.96 TOPS throughput via 4x60 GBd serialization, with 3.9% average error on tensor-vector multiplications and Iris classification accuracies of 93.3% (10-30 GBd) dropping to 83.3% at 60 GBd. It discusses scalability to POPS regimes by incorporating WDM to reduce laser power requirements.

Significance. The direct hardware measurements on a fabricated 4x2x1 unit provide concrete, reproducible performance metrics (error rate and dataset accuracy) that support the small-scale tensor core operation. This strengthens the case for photonic accelerators in AI tasks if the multiplexing approach can be extended without prohibitive noise penalties.

major comments (2)

[Scalability discussion] Scalability discussion (final section): The claim that WDM inclusion in the SDM architecture feasibly enables POPS-regime accelerators assumes crosstalk, phase noise, and power penalties do not accumulate prohibitively beyond the demonstrated 4 WDM channels and 2 inputs. No measurements or quantitative simulations of noise scaling for larger arrays are provided, despite the observed accuracy degradation at 60 GBd indicating rate-dependent effects that would likely compound in hyperdimensional configurations requiring higher effective dimensionality.
[Experimental results] Experimental results section: The hyperdimensional tensor core claim rests on achieving high effective dimensionality through time multiplexing plus spatial/WDM scaling, yet only the 4x2x1 unit is fabricated and tested. The manuscript does not report how the demonstrated serialization maintains tensor operation fidelity when unfolding larger matrix-vector products, which is load-bearing for extending the 0.96 TOPS result to true hyperdimensional operation.

minor comments (2)

[Abstract] Abstract: The title states '1 TOPS' while the text reports 0.96 TOPS; include a brief note on the exact calculation (e.g., from 4x60 GBd serialization) to avoid minor inconsistency.
[Figures] Figure clarity: Ensure all experimental traces (e.g., error vs. data rate) include error bars or multiple runs to quantify measurement repeatability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thorough review and valuable feedback on our manuscript. We have addressed each of the major comments in detail below and will incorporate revisions to strengthen the paper.

Referee: [Scalability discussion] Scalability discussion (final section): The claim that WDM inclusion in the SDM architecture feasibly enables POPS-regime accelerators assumes crosstalk, phase noise, and power penalties do not accumulate prohibitively beyond the demonstrated 4 WDM channels and 2 inputs. No measurements or quantitative simulations of noise scaling for larger arrays are provided, despite the observed accuracy degradation at 60 GBd indicating rate-dependent effects that would likely compound in hyperdimensional configurations requiring higher effective dimensionality.

Authors: We agree that a more detailed analysis of noise scaling is necessary to support the scalability claims. In the revised version, we will add quantitative simulations of crosstalk and phase noise accumulation for arrays with up to 16 WDM channels, based on the experimental parameters measured in our 4-channel device. This will include an assessment of how the rate-dependent effects observed at 60 GBd impact larger hyperdimensional computations. revision: yes
Referee: [Experimental results] Experimental results section: The hyperdimensional tensor core claim rests on achieving high effective dimensionality through time multiplexing plus spatial/WDM scaling, yet only the 4x2x1 unit is fabricated and tested. The manuscript does not report how the demonstrated serialization maintains tensor operation fidelity when unfolding larger matrix-vector products, which is load-bearing for extending the 0.96 TOPS result to true hyperdimensional operation.

Authors: The experimental demonstration focuses on the fundamental 4x2x1 tensor-vector multiplication unit, which validates the TSWDM approach. The serialization unfolds the larger operations over time, and the measured 3.9% average error confirms the fidelity of individual multiply-accumulate steps. For larger products, the overall error would accumulate based on the number of operations, but the per-step fidelity remains as demonstrated. We will revise the manuscript to include a detailed explanation of the unfolding process and an analysis of error propagation for larger hyperdimensional vectors. revision: partial
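How a per-step error might accumulate over a time-unfolded sum can be bounded with a small sketch. It treats the 3.9% figure as a per-step error magnitude in normalized units, which is an illustrative assumption since the exact error metric is not defined here. It brackets the result between two textbook cases: independent zero-mean errors, which add in quadrature, and fully correlated errors, which add linearly. The function name is hypothetical, and real devices may fall anywhere between (or outside) these idealized cases.

```python
import math


def accumulated_error(per_step_error, steps, correlated=False):
    """Absolute error of a sum of `steps` partial products.

    Independent zero-mean errors grow as sqrt(steps);
    fully correlated errors grow linearly with steps (worst case).
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    growth = steps if correlated else math.sqrt(steps)
    return per_step_error * growth


# 0.039 is used only as an illustrative per-step error magnitude.
for steps in (1, 16, 256):
    independent = accumulated_error(0.039, steps)
    worst_case = accumulated_error(0.039, steps, correlated=True)
    print(steps, independent, worst_case)
```

Because the accumulated signal also grows with the number of steps, the relative error can shrink in the independent case while staying flat in the correlated case, which is why measured noise statistics matter for the scaling argument.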

Circularity Check

0 steps flagged

No circularity: experimental metrics obtained from direct hardware measurements on fabricated 4x2x1 TSWDM crossbar

The manuscript reports measured throughput (0.96 TOPS), average error (3.9%), and classification accuracy (93.3% at 10-30 GBd, 83.3% at 60 GBd) on a physically realized 4-channel 2-input silicon photonic device incorporating 56 GHz EAMs and integrated WDM stages. These figures are obtained by direct experimental characterization rather than by any derivation, fitting procedure, or self-referential equation that reduces to its own inputs. The scalability discussion (WDM reducing laser power for POPS potential) is qualitative and does not invoke fitted parameters, self-citations, or uniqueness theorems that would force the reported results. No load-bearing ansatz, renaming of known results, or self-definitional steps appear in the presented chain.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The demonstration rests on standard silicon photonics integration assumptions and experimental calibration of modulator and multiplexer performance; no new entities or fitted constants are introduced beyond the measured hardware parameters.

free parameters (1)

data rate
Experimental operating point varied between 10 and 60 GBd to measure accuracy drop-off.

axioms (1)

domain assumption: Silicon photonic components can be monolithically integrated with electroabsorption modulators and wavelength multiplexers at the stated speeds.
Invoked to justify the on-chip 4-channel TSWDM Xbar fabrication and operation.
