arXiv: 2606.24933 · v1 · pith:IZREDP2Pnew · submitted 2026-06-22 · 🪐 quant-ph · cs.AI · cs.ET · cs.LG · cs.NE

Self-Modulating Quantum Fast-Weight Programmers for Efficient Adaptive Sequential Learning

Pith reviewed 2026-06-26 08:08 UTC · model grok-4.3

classification 🪐 quant-ph · cs.AI · cs.ET · cs.LG · cs.NE
keywords quantum machine learning · sequential learning · fast weight programmers · self-modulation · time series · quantum circuits · adaptive learning

The pith

Self-modulating quantum fast-weight programmers improve convergence stability and prediction accuracy on sequential tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends quantum fast weight programmers with a self-modulation layer that adaptively scales both incoming fast-weight updates and stored historical memory. This change is intended to maintain a workable balance between incorporating fresh sequence information and preserving earlier context during training. Experiments across different qubit counts and sequence lengths report steadier convergence and higher prediction scores than the unmodulated baseline. Theoretical reasoning links the modulation rule to improved propagation of temporal dependencies through the model. The overall result positions the approach as a compact quantum architecture for time-series processing.

Core claim

Self-Modulating Quantum Fast Weight Programmers introduce adaptive modulation that simultaneously controls the strength of new fast-weight updates and the decay rate of stored fast-weight memory. When this modulation is active, the model exhibits more stable training trajectories and stronger predictive performance on sequential data, with the improvement holding across variations in qubit number and input length. The authors supply theoretical arguments that the modulation rule achieves a controlled trade-off between new information injection and memory retention, thereby supporting longer-range temporal information flow.
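One way to make the trade-off concrete is to unroll a gated recursion by hand. The block below is a sketch under assumptions, not the paper's derivation: it supposes each circuit parameter follows a simple gated update, where the retention factor $a_t$, the injection factor $c_t$, and the raw update $d_t$ are illustrative names chosen here.

```latex
% Assumed per-parameter update (illustrative, not taken from the paper):
\theta_t = a_t\,\theta_{t-1} + c_t\,d_t

% Unrolling from an initial value \theta_0:
\theta_t = \Big(\prod_{s=1}^{t} a_s\Big)\,\theta_0
         + \sum_{k=1}^{t} \Big(\prod_{s=k+1}^{t} a_s\Big)\, c_k\, d_k

% Special case of constant gates a_s = a and c_k = c:
\theta_t = a^{t}\,\theta_0 + c \sum_{k=1}^{t} a^{\,t-k}\, d_k
```

Under this assumed form, an update made $t-k$ steps earlier is weighted by $a^{t-k}$, so a retention factor closer to 1 keeps older context for longer, while $c$ sets how strongly each new update enters. Setting $a = c = 1$ reduces the recursion to plain accumulation of updates.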

What carries the argument

The self-modulation mechanism, which applies learned or rule-based scaling factors to both newly generated fast-weight updates and the retained historical fast-weight matrix.
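The mechanism described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the element-wise fusion rule, the sigmoid gates, the linear controller maps `W_delta`, `W_old`, `W_new`, and every function name are hypothetical, and the variational quantum circuit that would consume the parameters is left out.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, used here to keep the assumed gates in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def modulated_update(theta_prev, delta, m_old, m_new):
    """Assumed fusion rule: scale stored memory and the new update separately."""
    return m_old * theta_prev + m_new * delta

def run_controller(xs, W_delta, W_old, W_new, n_params):
    """Roll a hypothetical linear controller over a sequence of inputs."""
    theta = np.zeros(n_params)
    history = []
    for x in xs:
        delta = W_delta @ x          # raw fast-weight update
        m_old = sigmoid(W_old @ x)   # retention gate on stored memory
        m_new = sigmoid(W_new @ x)   # injection gate on the new update
        theta = modulated_update(theta, delta, m_old, m_new)
        history.append(theta.copy())
    return np.stack(history)

rng = np.random.default_rng(0)
n_in, n_params, seq_len = 3, 4, 8
xs = rng.normal(size=(seq_len, n_in))
W_delta, W_old, W_new = (rng.normal(size=(n_params, n_in)) for _ in range(3))
thetas = run_controller(xs, W_delta, W_old, W_new, n_params)
print(thetas.shape)  # (8, 4): one parameter vector per time step
```

With both gates fixed to 1 the rule collapses to plain additive accumulation, which is one plausible reading of an unmodulated update. Fixing only `m_new` to 1 gives a variant that modulates the stored memory alone.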

If this is right

  • Training curves become more stable without increasing model size.
  • Prediction accuracy improves on time-series inputs of varying length.
  • The same architecture works across different qubit counts without retuning.
  • Temporal dependencies propagate farther through the quantum circuit.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar modulation logic could be tested in other quantum recurrent or memory-augmented circuits.
  • The approach may allow smaller qubit registers to achieve performance previously requiring more qubits.
  • Classical fast-weight models could adopt an analogous modulation step for comparison.

Load-bearing premise

The claimed balance between new information injection and memory retention produced by the modulation rule is what actually drives the observed numerical gains.

What would settle it

A controlled experiment on the same sequential tasks that applies the self-modulation rule but records no gain, or a loss, in convergence speed or final prediction accuracy relative to the unmodified quantum fast-weight programmer.
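A comparison of this kind could be tabulated as follows. The sketch assumes that relative improvement is defined as the baseline error minus the variant error, divided by the baseline error; the source only states that positive values mean improvement, so that formula, the function names, and the placeholder numbers are all illustrative and are not results from the paper.

```python
import numpy as np

def relative_improvement(mse_baseline, mse_variant):
    """Positive when the variant reaches lower error than the baseline."""
    return (mse_baseline - mse_variant) / mse_baseline

def compare_runs(baseline_mse, variant_mse):
    """Summarise paired runs (same seed and configuration per pair)."""
    baseline_mse = np.asarray(baseline_mse, dtype=float)
    variant_mse = np.asarray(variant_mse, dtype=float)
    gains = relative_improvement(baseline_mse, variant_mse)
    return {
        "mean_gain": float(gains.mean()),
        "std_gain": float(gains.std(ddof=1)),  # sample std across runs
        "wins": int((gains > 0).sum()),        # runs where the variant is better
        "runs": int(gains.size),
    }

# Placeholder final test MSEs for three paired runs (not from the paper).
summary = compare_runs([0.10, 0.20, 0.40], [0.05, 0.20, 0.50])
print(summary)
```

A mean gain near zero or below it, with few wins across seeds, would be the kind of outcome that counts against the modulation rule.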

Figures

Figures reproduced from arXiv: 2606.24933 by Chen-Yu Liu, Chun-Hua Lin, Hsin-Yi Lin, Huan-Hsin Tseng, Jiun-Cheng Jiang, Junghoon Justin Park, Kuan-Cheng Chen, Kuo-Chung Peng, Samuel Yen-Chi Chen, Shinjae Yoo, Yifeng Peng.

Figure 1. Architecture of the proposed Self-Modulating QFWP. A classical controller generates $\Delta_t$, $M_t^{\mathrm{old}}$, and $M_t^{\mathrm{new}}$, which are fused to produce the circuit parameters $\Theta_t$ for the variational quantum circuit $W(\Theta_t)$.
Figure 2. Prediction trajectories of Standard QFWP and full Self-Modulating QFWP on the bessel_j2 task at selected training epochs (seq_len=32). Blue solid lines denote model predictions, red dashed lines denote ground truth, and orange dashed lines indicate the train/test split. … fails to capture the oscillatory structure of the target sequence, whereas the Self-Modulating variant already aligns reasonably well with…
Figure 4. Final test MSE of Standard QFWP, Self-Modulating QFWP, and its ablation variants on the bessel_j2 task across hidden sizes and sequence lengths. Each cell reports the final test MSE for the corresponding configuration. … further consolidate the trends suggested by the epoch-wise prediction plots and representative convergence curves. For the bessel_j2 task, the standard QFWP maintains low error only in short…
Figure 5.
Figure 7. Representative test MSE convergence curves on the damped_shm task, comparing Standard QFWP, full Self-Modulating QFWP, and its ablation variants under selected hidden sizes and sequence lengths.
Figure 10. Prediction trajectories of Standard QFWP and Self-Modulating QFWP on the delayed_quantum_control task at selected training epochs (seq_len=64). Blue solid lines denote model predictions, red dashed lines denote ground truth, and orange dashed lines indicate the train/test split.
Figure 9. Relative improvement over Standard QFWP on the damped_shm task for Self-Modulating QFWP and its ablation variants across hidden sizes and sequence lengths. Positive values indicate improvement over the standard baseline, while negative values indicate degradation. The relative-improvement heatmaps in Figure 9 further show that both the Self-Modulating and Only-Old variants achieve strong positive gains over…
Figure 13. Relative improvement over Standard QFWP on the delayed_quantum_control task for Self-Modulating QFWP and its ablation variants across hidden sizes and sequence lengths. Positive values indicate improvement over the standard baseline, while negative values indicate degradation. … that the dominant gain in delayed_quantum_control comes from old-related modulation, while full self-modulation preserves this adv…
Figure 11. Representative test MSE convergence curves on the delayed_quantum_control task, comparing Standard QFWP, full Self-Modulating QFWP, and its ablation variants under selected hidden sizes and sequence lengths.
Figure 14. Prediction trajectories of Standard QFWP and Self-Modulating QFWP on the narma_5 task at selected training epochs (seq_len=32). Blue solid lines denote model predictions, red dashed lines denote ground truth, and orange dashed lines indicate the train/test split. We next examine the NARMA family, narma_5 (sequence length 32) and narma_10 (sequence length 64), as progressively harder autoregressive memory t…
Figure 15. Representative test MSE convergence curves on the narma_5 task, comparing Standard QFWP, full Self-Modulating QFWP, and its ablation variants under selected hidden sizes and sequence lengths.
Figure 16. Final test MSE of Standard QFWP, Self-Modulating QFWP, and its ablation variants on the narma_5 task across hidden sizes and sequence lengths. Each cell reports the final test MSE for the corresponding configuration.
Figure 19. Representative test MSE convergence curves on the narma_10 task, comparing Standard QFWP, full Self-Modulating QFWP, and its ablation variants under selected hidden sizes and sequence lengths. The convergence curves show that all variants can eventually reach relatively low test error, but the standard QFWP…
Figure 20. Final test MSE of Standard QFWP, Self-Modulating QFWP, and its ablation variants on the narma_10 task across hidden sizes and sequence lengths. Each cell reports the final test MSE for the corresponding configuration.
Figure 21. Relative improvement over Standard QFWP on the narma_10 task for Self-Modulating QFWP and its ablation variants across hidden sizes and sequence lengths. Positive values indicate improvement over the standard baseline, while negative values indicate degradation. … exhibits greater instability and occasional spikes in harder settings, whereas the Self-Modulating and Only-Old variants remain more stable overa…
Figure 22. Task-wise summary heatmaps of relative strength and synergy across hidden sizes and sequence lengths for the five benchmark tasks. Numeric annotations report the corresponding summary scores for each configuration.
Figure 23. Task-wise mean final test MSE of Standard QFWP, Self-Modulating QFWP, and its ablation variants, providing an aggregate comparison of overall predictive performance across configurations. Lower values indicate better performance. c) Difference Between Full and Only-Old: With $d_{t,j}$ and $a_{t,j}$ fixed, only $c_{t,j}$ differs; this is a structural, not parameter-wise, comparison. Define $e_{t,j} = \theta^{\mathrm{full}}_{t,j} - \theta^{\mathrm{old}}_{t,j}$ …
Original abstract

Recent advances in quantum machine learning have motivated efficient models for sequential data processing. In this paper, we propose Self-Modulating Quantum Fast Weight Programmers, or Self-Modulating QFWP, which extends Quantum Fast Weight Programmers by introducing adaptive modulation over both newly generated fast-weight updates and historical fast-weight memory. Numerical results show that the proposed mechanism improves convergence stability and prediction performance across varying model settings, including different numbers of qubits and input sequence lengths. We further provide theoretical arguments explaining how self-modulation balances new information injection with memory retention, thereby enhancing temporal information propagation. These results suggest that Self-Modulating QFWP is a compact and effective framework for quantum machine learning on time-series data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper proposes Self-Modulating Quantum Fast-Weight Programmers (Self-Modulating QFWP) as an extension of Quantum Fast Weight Programmers. The extension introduces adaptive modulation applied to both newly generated fast-weight updates and historical fast-weight memory. Numerical results are reported to show improved convergence stability and prediction performance across varying numbers of qubits and input sequence lengths. Theoretical arguments are provided to explain how the self-modulation balances new information injection with memory retention to enhance temporal information propagation.

Significance. If the numerical improvements and theoretical arguments hold under scrutiny, the work offers a compact framework for quantum machine learning on sequential/time-series data. The combination of claimed empirical gains with an explanatory mechanism for stability is a positive feature when the derivations and controls are fully specified.

minor comments (3)
  1. [Abstract] The claim of numerical improvements in stability and performance is stated without quantitative values, error bars, baseline comparisons, or model hyperparameters, which makes it difficult for readers to gauge the magnitude of the reported gains.
  2. The manuscript should include explicit definitions or pseudocode for the self-modulation operator and the fast-weight update rule so that the theoretical balancing argument can be directly verified against the implementation.
  3. Figure and table captions should explicitly state the number of independent runs, random seeds, and statistical tests used to support the stability and performance claims.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No specific major comments were listed in the report, so we have no points requiring point-by-point response or manuscript changes at this stage.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces Self-Modulating QFWP as an extension of prior QFWP models, supported by numerical experiments across qubit counts and sequence lengths plus separate theoretical arguments on information balance. No equations, derivations, or parameter-fitting steps are exhibited in the manuscript that reduce a claimed prediction or uniqueness result to a self-definition, fitted input, or self-citation chain by construction. The central performance claims rest on external empirical benchmarks rather than internal re-labeling of inputs, satisfying the self-contained criterion.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no free parameters, axioms, or invented entities can be extracted from the provided text.

pith-pipeline@v0.9.1-grok · 5707 in / 948 out tokens · 13571 ms · 2026-06-26T08:08:08.548476+00:00


Reference graph

Works this paper leans on

75 extracted references · 24 canonical work pages

  1. [1]

    M. A. Nielsen and I. L. Chuang,Quantum Computation and Quantum Information, 10th ed. Cambridge University Press, 2010. [Online]. Available: https://doi.org/10.1017/CBO9780511976667

  2. [2]

    Algorithms for quantum computation: discrete logarithms and factoring,

    P. W. Shor, “Algorithms for quantum computation: discrete logarithms and factoring,” inProceedings 35th annual symposium on foundations of computer science. Ieee, 1994, pp. 124–134

  3. [3]

    A fast quantum mechanical algorithm for database search,

    L. K. Grover, “A fast quantum mechanical algorithm for database search,” inProceedings of the twenty-eighth annual ACM symposium on Theory of computing, 1996, pp. 212–219

  4. [4]

    and Lloyd, S

    J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, “Quantum machine learning,”Nature, vol. 549, no. 7671, pp. 195–202, 2017. [Online]. Available: https://doi.org/10.1038/nature23474

  5. [5]

    Quantum artificial intelligence: From quantum neural networks to self-programming architectures [feature],

    S. Y .-C. Chen, “Quantum artificial intelligence: From quantum neural networks to self-programming architectures [feature],”IEEE Circuits and Systems Magazine, vol. 26, no. 1, pp. 41–66, 2026

  6. [6]

    Delgado and K

    A. Delgado and K. E. Hamilton,Quantum Machine Learning: Concepts and possibilities. IOP Publishing, 2025

  7. [7]

    Quantum machine learning in feature hilbert spaces,

    M. Schuld and N. Killoran, “Quantum machine learning in feature hilbert spaces,”Physical review letters, vol. 122, no. 4, p. 040504, 2019

  8. [8]

    Parameterized quantum circuits as machine learning models,

    M. Benedetti, E. Lloyd, S. Sack, and M. Fiorentini, “Parameterized quantum circuits as machine learning models,”Quantum science and technology, vol. 4, no. 4, p. 043001, 2019

  9. [9]

    The power of quantum neural networks,

    A. Abbas, D. Sutter, C. Zoufal, A. Lucchi, A. Figalli, and S. Woerner, “The power of quantum neural networks,”Nature computational science, vol. 1, no. 6, pp. 403–409, 2021

  10. [10]

    Expressibility and entan- gling capability of parameterized quantum circuits for hybrid quantum- classical algorithms,

    S. Sim, P. D. Johnson, and A. Aspuru-Guzik, “Expressibility and entan- gling capability of parameterized quantum circuits for hybrid quantum- classical algorithms,”Advanced Quantum Technologies, vol. 2, no. 12, p. 1900070, 2019

  11. [11]

    Noisy intermediate-scale quantum algorithms,

    K. Bharti, A. Cervera-Lierta, T. H. Kyaw, T. Haug, S. Alperin- Lea, A. Anand, M. Degroote, H. Heimonen, J. S. Kottmann, T. Menke, W.-K. Mok, S. Sim, L.-C. Kwek, and A. Aspuru-Guzik, “Noisy intermediate-scale quantum algorithms,”Reviews of Modern Physics, vol. 94, no. 1, p. 015004, 2022. [Online]. Available: https://doi.org/10.1103/RevModPhys.94.015004

  12. [12]

    Cerezo , author A

    M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio, and P. J. Coles, “Variational quantum algorithms,”Nature Reviews Physics, vol. 3, no. 9, pp. 625–644, 2021. [Online]. Available: https://doi.org/10.1038/s42254-021-00348-9

  13. [13]

    A variational eigenvalue solver on a photonic quantum processor,

    A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’brien, “A variational eigenvalue solver on a photonic quantum processor,”Nature communications, vol. 5, no. 1, p. 4213, 2014

  14. [14]

    Noise-induced barren plateaus in variational quantum algorithms,

    S. Wang, E. Fontana, M. Cerezo, K. Sharma, A. Sone, L. Cincio, and P. J. Coles, “Noise-induced barren plateaus in variational quantum algorithms,”Nature communications, vol. 12, no. 1, p. 6961, 2021

  15. [15]

    Mitarai, M

    K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, “Quantum circuit learning,”Physical Review A, vol. 98, no. 3, p. 032309, 2018. [Online]. Available: https://doi.org/10.1103/PhysRevA.98.032309

  16. [16]

    Franke, J

    M. Schuld, A. Bocharov, K. M. Svore, and N. Wiebe, “Circuit- centric quantum classifiers,”Physical Review A, vol. 101, no. 3, p. 032308, 2020. [Online]. Available: https://doi.org/10.1103/PhysRevA. 101.032308

  17. [17]

    Transformer-based multi-aspect multi-granularity non-native english speaker pronunciation assessment,

    S. Y .-C. Chen, S. Yoo, and Y .-L. L. Fang, “Quantum long short-term memory,” inICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022, pp. 8622–8626. [Online]. Available: https://doi.org/10.1109/ICASSP43922. 2022.9747369

  18. [18]

    Variational quantum circuits for deep reinforcement learning,

    S. Y .-C. Chen, C.-H. H. Yang, J. Qi, P.-Y . Chen, X. Ma, and H.-S. Goan, “Variational quantum circuits for deep reinforcement learning,” IEEE access, vol. 8, pp. 141 007–141 024, 2020. [Online]. Available: https://doi.org/10.1109/ACCESS.2020.3010470

  19. [19]

    Quantum convolutional neural network for classical data classification,

    T. Hur, L. Kim, and D. K. Park, “Quantum convolutional neural network for classical data classification,”Quantum Machine Intelligence, vol. 4, no. 1, p. 3, 2022

  20. [20]

    A variational algorithm for quantum single layer perceptron,

    A. Macaluso, F. Orazi, M. Klusch, S. Lodi, and C. Sartori, “A variational algorithm for quantum single layer perceptron,” inInternational Confer- ence on Machine Learning, Optimization, and Data Science. Springer, 2022, pp. 341–356

  21. [21]

    Cutting is all you need: Execution of large-scale quantum neural networks on limited- qubit devices,

    A. Marchisio, E. Sychiuco, M. Kashif, and M. Shafique, “Cutting is all you need: Execution of large-scale quantum neural networks on limited- qubit devices,” in2025 IEEE International Conference on Quantum Artificial Intelligence (QAI). IEEE, 2025, pp. 330–336

  22. [22]

    Time-series quantum reservoir computing with weak and pro- jective measurements,

    P. Mujal, R. Mart ´ınez-Pe˜na, G. L. Giorgi, M. C. Soriano, and R. Zam- brini, “Time-series quantum reservoir computing with weak and pro- jective measurements,”npj Quantum Information, vol. 9, no. 1, p. 16, 2023

  23. [23]

    Optimizing a quantum reservoir computer for time series prediction,

    A. Kutvonen, K. Fujii, and T. Sagawa, “Optimizing a quantum reservoir computer for time series prediction,”Scientific reports, vol. 10, no. 1, p. 14687, 2020

  24. [24]

    Validating large-scale quantum ma- chine learning: Efficient simulation of quantum support vector machines using tensor networks,

    K.-C. Chen, T.-Y . Li, Y .-Y . Wang, S. See, C.-C. Wang, R. Wille, N.-Y . Chen, A.-C. Yang, and C.-Y . Lin, “Validating large-scale quantum ma- chine learning: Efficient simulation of quantum support vector machines using tensor networks,”Machine Learning: Science and Technology, vol. 6, no. 1, p. 015047, 2025

  25. [25]

    Quantum-train- based distributed multi-agent reinforcement learning,

    K.-C. Chen, S. Y .-C. Chen, C.-Y . Liu, and K. K. Leung, “Quantum-train- based distributed multi-agent reinforcement learning,” in2025 IEEE Symposium for Multidisciplinary Computational Intelligence Incubators (MCII Companion). IEEE, 2025, pp. 1–5

  26. [26]

    Qtrl: Toward practical quantum reinforcement learning via quantum- train,

    C.-Y . Liu, C.-H. A. Lin, C.-H. H. Yang, K.-C. Chen, and M.-H. Hsieh, “Qtrl: Toward practical quantum reinforcement learning via quantum- train,” in2024 IEEE International Conference on Quantum Computing and Engineering (QCE), vol. 2. IEEE, 2024, pp. 317–322

  27. [27]

    Quantum-enhanced parameter-efficient learning for typhoon trajectory forecasting,

    C.-Y . Liu, K.-C. Chen, Y .-C. Chen, S. Y .-C. Chen, W.-H. Huang, W.-J. Huang, and Y .-J. Chang, “Quantum-enhanced parameter-efficient learning for typhoon trajectory forecasting,” in2025 IEEE International Conference on Quantum Computing and Engineering (QCE), vol. 1. IEEE, 2025, pp. 2046–2056

  28. [28]

    Quantum adaptive self-attention for quantum transformer models,

    C.-S. Chen and E.-J. Kuo, “Quantum adaptive self-attention for quantum transformer models,”arXiv preprint arXiv:2504.05336, 2025

  29. [29]

    Quantum convolutional neural networks,

    I. Cong, S. Choi, and M. D. Lukin, “Quantum convolutional neural networks,”Nature Physics, vol. 15, no. 12, pp. 1273–1278, 2019

  30. [30]

    Quantum agents in the gym: A variational quantum algorithm for deep Q-learning,

    A. Skolik, S. Jerbi, and V . Dunjko, “Quantum agents in the gym: A variational quantum algorithm for deep Q-learning,”Quantum, vol. 6, p. 720, 2022

  31. [31]

    Parametrized quantum policies for reinforcement learning,

    S. Jerbi, C. Gyurik, S. Marshall, H. J. Briegel, and V . Dunjko, “Parametrized quantum policies for reinforcement learning,” inAd- vances in Neural Information Processing Systems, vol. 34, 2021, pp. 28 362–28 375

  32. [32]

    Enhancing variational quantum state diagonaliza- tion using reinforcement learning techniques,

    A. Kundu, P. Bedełek, M. Ostaszewski, O. Danaci, Y . J. Patel, V . Dunjko, and J. A. Miszczak, “Enhancing variational quantum state diagonaliza- tion using reinforcement learning techniques,”New Journal of Physics, vol. 26, no. 1, p. 013034, 2024

  33. [33]

    Reinforcement learning-assisted quantum architecture search for variational quantum algorithms,

    A. Kundu, “Reinforcement learning-assisted quantum architecture search for variational quantum algorithms,”arXiv preprint arXiv:2402.13754, 2024

  34. [34]

    Policy gradients using variational quantum circuits,

    A. Sequeira, L. P. Santos, and L. S. Barbosa, “Policy gradients using variational quantum circuits,”Quantum Machine Intelligence, vol. 5, no. 1, p. 18, 2023

  35. [35]

    Vitjan Zavrtanik, Matej Kristan, and Danijel Skoˇcaj

    S. S. Li, X. Zhang, S. Zhou, H. Shu, R. Liang, H. Liu, and L. P. Garcia, “Pqlm-multilingual decentralized portable quantum language model,” in ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023, pp. 1–5. [Online]. Available: https://doi.org/10.1109/ICASSP49357.2023.10095215

  36. [36]

    The dawn of quantum natural language processing,

    R. Di Sipio, J.-H. Huang, S. Y .-C. Chen, S. Mangini, and M. Worring, “The dawn of quantum natural language processing,” inICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022, pp. 8612–8616. [Online]. Available: https://doi.org/10.1109/ICASSP43922.2022.9747675

  37. [37]

    QAOA with n·p≥ 200

    J. Stein, I. Christ, N. Kraus, M. B. Mansky, R. M ¨uller, and C. Linnhoff-Popien, “Applying qnlp to sentiment analysis in finance,” in2023 IEEE International Conference on Quantum Computing and Engineering (QCE), vol. 2. IEEE, 2023, pp. 20–25. [Online]. Available: https://doi.org/10.1109/QCE57702.2023.10178

  38. [38]

    Learning temporal data with a variational quantum recurrent neural network,

    Y . Takaki, K. Mitarai, M. Negoro, K. Fujii, and M. Kitagawa, “Learning temporal data with a variational quantum recurrent neural network,” Physical Review Research, vol. 3, no. 4, p. 043140, 2021

  39. [39]

    Recurrent quantum neural networks,

    J. Bausch, “Recurrent quantum neural networks,”Advances in neural information processing systems, vol. 33, pp. 1368–1379, 2020

  40. [40]

    Rapid training of quantum recurrent neural networks,

    M. Siemaszko, A. Buraczewski, B. Le Saux, and M. Stobi ´nska, “Rapid training of quantum recurrent neural networks,”Quantum Machine Intelligence, vol. 5, no. 2, p. 31, 2023

  41. [41]

    Quantum recurrent neural networks for sequential learning,

    Y . Li, Z. Wang, R. Han, S. Shi, J. Li, R. Shang, H. Zheng, G. Zhong, and Y . Gu, “Quantum recurrent neural networks for sequential learning,” Neural Networks, vol. 166, pp. 148–161, 2023

  42. [42]

    Density matrix emulation of quantum recurrent neural networks for multivariate time series prediction,

    J. D. Viqueira, D. Fa ´ılde, M. M. Juane, A. G ´omez, and D. Mera, “Density matrix emulation of quantum recurrent neural networks for multivariate time series prediction,”Machine Learning: Science and Technology, vol. 6, no. 1, p. 015023, 2025

  43. [43]

    Quantum kernel-based long short-term memory for climate time-series forecasting,

    Y .-C. Hsu, N.-Y . Chen, T.-Y . Li, P.-H. H. Lee, and K.-C. Chen, “Quantum kernel-based long short-term memory for climate time-series forecasting,” in2025 International Conference on Quantum Communica- tions, Networking, and Computing (QCNC). IEEE, 2025, pp. 421–426

  44. [44]

    Federated quantum-train long short-term memory for gravitational wave signal,

    C.-Y . Liu, S. Y .-C. Chen, K.-C. Chen, W.-J. Huang, and Y .-J. Chang, “Federated quantum-train long short-term memory for gravitational wave signal,” inIEEE INFOCOM 2025-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, 2025, pp. 1–6

  45. [45]

    Quantum-train long short- term memory: Application on flood prediction problem,

    C.-H. A. Lin, C.-Y . Liu, and K.-C. Chen, “Quantum-train long short- term memory: Application on flood prediction problem,” in2024 IEEE International Conference on Quantum Computing and Engineering (QCE), vol. 2. IEEE, 2024, pp. 268–273

  46. [46]

    Quantum-enhanced channel mixing in rwkv models for time series forecasting,

    C.-S. Chen and E.-J. Kuo, “Quantum-enhanced channel mixing in rwkv models for time series forecasting,”arXiv preprint arXiv:2505.13524, 2025

  47. [47]

    Learning to program variational quantum circuits with fast weights,

    S. Y .-C. Chen, “Learning to program variational quantum circuits with fast weights,” in2024 International Joint Conference on Neural Networks (IJCNN). IEEE, 2024, pp. 1–9

  48. [48]

    Long short-term memory,

    S. Hochreiter and J. Schmidhuber, “Long short-term memory,”Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997

  49. [49]

    Lstm-qgan: Scalable nisq generative adversarial network,

    C. Chu, A. Hastak, and F. Chen, “Lstm-qgan: Scalable nisq generative adversarial network,” inICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025, pp. 1–5. [Online]. Available: https://doi.org/10.1109/ ICASSP49660.2025.10888847

  50. [50]

    Vitjan Zavrtanik, Matej Kristan, and Danijel Skoˇcaj

    S. Y .-C. Chen, “Quantum deep recurrent reinforcement learning,” in ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023, pp. 1–5. [Online]. Available: https://doi.org/10.1109/ICASSP49357.2023.10096981

  51. [51]

    Efficient quantum recurrent reinforcement learning via quantum reservoir computing,

    ——, “Efficient quantum recurrent reinforcement learning via quantum reservoir computing,” inICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024, pp. 13 186–13 190. [Online]. Available: https://doi.org/10. 1109/ICASSP48485.2024.10446089

  52. [52]

    Linear- layer-enhanced quantum long short-term memory for carbon price forecasting,

    Y . Cao, X. Zhou, X. Fei, H. Zhao, W. Liu, and J. Zhao, “Linear- layer-enhanced quantum long short-term memory for carbon price forecasting,”Quantum Machine Intelligence, vol. 5, no. 2, p. 26, 2023. [Online]. Available: https://doi.org/10.1007/s42484-023-00115-2

  53. [53]

    Benchmarking quantum and classical sequential models for urban telecommunication forecasting,

    C.-S. Chen, S. Y .-C. Chen, and Y .-C. Tsai, “Benchmarking quantum and classical sequential models for urban telecommunication forecasting,” arXiv preprint arXiv:2508.04488, 2025

  54. [54]

    Quantum recurrent neural networks with encoder-decoder for time-dependent partial differential equations,

    Y . Chen, A. Khaliq, and K. M. Furati, “Quantum recurrent neural networks with encoder-decoder for time-dependent partial differential equations,”arXiv preprint arXiv:2502.13370, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2502.13370

  55. [55]

    Quantum long sort- term memory-based identification of distributed denial of service attacks,

    S. Tripathi, H. Upadhyay, and J. Soni, “Quantum long sort- term memory-based identification of distributed denial of service attacks,” in2025 IEEE 4th International Conference on AI in Cybersecurity (ICAIC). IEEE, 2025, pp. 1–8. [Online]. Available: https://doi.org/10.1109/ICAIC63015.2025.10849228

  56. [56]

    A hybrid quantum-classical machine learning approach for self-interference cancellation in full-duplex transceivers,

    M. Elsayed and O. A. Dobre, “A hybrid quantum-classical machine learning approach for self-interference cancellation in full-duplex transceivers,”IEEE Communications Letters, 2025. [Online]. Available: https://doi.org/10.1109/LCOMM.2025.3543318

  57. [57]

    Quantum lstm model for estimation of energy expenditure in human aging using wearable iot healthcare technology,

    B.-N. D. Tran, M. Fahim, B. D. McNiven, M. Guizani, H. Shin, and T. Q. Duong, “Quantum lstm model for estimation of energy expenditure in human aging using wearable iot healthcare technology,”IEEE Internet of Things Journal, 2025. [Online]. Available: https://doi.org/10.1109/JIOT.2025.3563832

  58. [58]

    The development of the variational quantum circuits architecture of the quantum long short-term memory model for thermal error compensation in the z-axis of machine tools,

    C. Chen, Y. Yang, and W. Jywe, “The development of the variational quantum circuits architecture of the quantum long short-term memory model for thermal error compensation in the z-axis of machine tools,” The International Journal of Advanced Manufacturing Technology, vol. 140, no. 1, pp. 577–593, 2025. [Online]. Available: https://doi.org/10.1007/s00170...

  59. [59]

    Wind turbine fault detection using quantum long-short term memory network,

    Z. Zhang and X. Ma, “Wind turbine fault detection using quantum long-short term memory network,” in 2025 30th International Conference on Automation and Computing (ICAC), 2025, pp. 1–6. [Online]. Available: https://doi.org/10.1109/ICAC65379.2025.11196477

  60. [60]

    Toward large-scale distributed quantum long short-term memory with modular quantum computers,

    K.-C. Chen, S. Y.-C. Chen, C.-Y. Liu, and K. K. Leung, “Toward large-scale distributed quantum long short-term memory with modular quantum computers,” in 2025 International Wireless Communications and Mobile Computing (IWCMC). IEEE, 2025, pp. 337–342. [Online]. Available: https://doi.org/10.1109/IWCMC65282.2025.11059527

  61. [61]

    QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold long short-term memory,

    Y.-C. Hsu, J.-C. Jiang, C.-H. Lin, K.-C. Peng, N.-Y. Chen, S. Y.-C. Chen, E.-J. Kuo, and H.-S. Goan, “QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold long short-term memory,” 2025. [Online]. Available: https://arxiv.org/abs/2512.05049

  62. [62]

    Federated quantum kernel-based long short-term memory for human activity recognition,

    Y.-C. Hsu, J.-C. Jiang, C.-H. Lin, W.-T. Chen, K.-C. Peng, P. Tiwari, S. Y.-C. Chen, and E.-J. Kuo, “Federated quantum kernel-based long short-term memory for human activity recognition,” in 2025 IEEE International Conference on Quantum Computing and Engineering (QCE), vol. 2. IEEE, 2025, pp. 54–58. [Online]. Available: https://doi.org/10.1109/QCE65121.2...

  63. [63]

    A study on quantum reservoir recurrent models for time-constrained volatile sequence forecasting,

    A. Rosato, A. Ceschini, F. Succetti, S. Y.-C. Chen, and M. Panella, “A study on quantum reservoir recurrent models for time-constrained volatile sequence forecasting,” in 2025 International Joint Conference on Neural Networks (IJCNN). IEEE, 2025, pp. 1–8. [Online]. Available: https://doi.org/10.1109/IJCNN64981.2025.11228258

  64. [64]

    Quantum variational activation functions empower Kolmogorov-Arnold networks,

    J.-C. Jiang, Y.-C. Huang, T. Chen, and H.-S. Goan, “Quantum variational activation functions empower Kolmogorov-Arnold networks,” arXiv preprint arXiv:2509.14026, 2025. [Online]. Available: https://arxiv.org/abs/2509.14026

  65. [65]

    Learning to control fast-weight memories: An alternative to dynamic recurrent networks,

    J. Schmidhuber, “Learning to control fast-weight memories: An alternative to dynamic recurrent networks,” Neural Computation, vol. 4, no. 1, pp. 131–139, 1992

  66. [66]

    Reducing the ratio between learning complexity and number of time varying variables in fully recurrent nets,

    J. Schmidhuber, “Reducing the ratio between learning complexity and number of time varying variables in fully recurrent nets,” in ICANN’93: Proceedings of the International Conference on Artificial Neural Networks Amsterdam, The Netherlands 13–16 September 1993 3. Springer, 1993, pp. 460–463

  67. [67]

    Linear transformers are secretly fast weight programmers,

    I. Schlag, K. Irie, and J. Schmidhuber, “Linear transformers are secretly fast weight programmers,” in International conference on machine learning. PMLR, 2021, pp. 9355–9366

  68. [68]

    Transformers are rnns: Fast autoregressive transformers with linear attention,

    A. Katharopoulos, A. Vyas, N. Pappas, and F. Fleuret, “Transformers are rnns: Fast autoregressive transformers with linear attention,” in International conference on machine learning. PMLR, 2020, pp. 5156–5165

  69. [69]

    Going beyond linear transformers with recurrent fast weight programmers,

    K. Irie, I. Schlag, R. Csordás, and J. Schmidhuber, “Going beyond linear transformers with recurrent fast weight programmers,” Advances in neural information processing systems, vol. 34, pp. 7703–7717, 2021

  70. [70]

    Practical computational power of linear transformers and their recurrent and self-referential extensions,

    K. Irie, R. Csordás, and J. Schmidhuber, “Practical computational power of linear transformers and their recurrent and self-referential extensions,” in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 9455–9465

  71. [71]

    Programming variational quantum circuits with quantum-train agent,

    C.-Y. Liu, S. Y.-C. Chen, K.-C. Chen, W.-J. Huang, and Y.-J. Chang, “Programming variational quantum circuits with quantum-train agent,” in 2025 International Conference on Quantum Communications, Networking, and Computing (QCNC). IEEE, 2025, pp. 544–548

  72. [72]

    Learning to program quantum measurements for machine learning,

    S. Y.-C. Chen, H.-H. Tseng, H.-Y. Lin, and S. Yoo, “Learning to program quantum measurements for machine learning,” in 2025 IEEE International Conference on Quantum Computing and Engineering (QCE), vol. 1. IEEE, 2025, pp. 1826–1836

  73. [73]

    Parallelizing linear recurrent neural nets over sequence length,

    E. Martin and C. Cundy, “Parallelizing linear recurrent neural nets over sequence length,” in International Conference on Learning Representations, 2018

  74. [74]

    Reservoir computing via quantum recurrent neural networks,

    S. Y.-C. Chen, D. Fry, A. Deshmukh, V. Rastunkov, and C. Stefanski, “Reservoir computing via quantum recurrent neural networks,” arXiv preprint arXiv:2211.02612, 2022. [Online]. Available: https://doi.org/10.48550/arXiv.2211.02612

  75. [75]

    Quantum long short-term memory with differentiable architecture search,

    S. Y.-C. Chen and P. Tiwari, “Quantum long short-term memory with differentiable architecture search,” in 2025 IEEE International Conference on Quantum Artificial Intelligence (QAI). IEEE, 2025, pp. 13–18. [Online]. Available: https://doi.org/10.1109/QAI63978.2025.00010