arxiv: 2603.06473 · v2 · submitted 2026-03-06 · 🪐 quant-ph

Recognition: 2 theorem links

· Lean Theorem

A Mixture-of-Experts Framework for Practical Hybrid-Quantum Models in Credit Card Fraud Detection

Rodrigo Chaves , Kunal Kumar , Bruno Chagas , Rory Linerud , Brannen Sorem , Javier Mancilla , Bryn Bell

Authors on Pith no claims yet

Pith reviewed 2026-05-15 14:46 UTC · model grok-4.3

classification 🪐 quant-ph

keywords hybrid quantum-classicalmixture of expertscredit card fraud detectionvariational quantum circuitimbalanced classificationautoencoderfinancial transactionsmachine learning

0 comments

The pith

A mixture-of-experts hybrid quantum model achieves higher average precision than XGBoost in credit card fraud detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether hybrid quantum-classical machine learning can deliver practical gains in detecting fraudulent card transactions. It embeds a Guided Quantum Compressor (autoencoder plus variational quantum circuit plus classical head) as one expert inside a mixture-of-experts system that also includes XGBoost. On a large European dataset with severe class imbalance, routing at a 0.6 threshold produces average precision of 0.793 versus 0.770 for pure XGBoost across repeated cross-validation runs. The lift occurs with only 7 to 21 minutes of added inference time and a modest shift toward fewer false positives. A sympathetic reader would care because the result shows a path for inserting quantum components into real-time financial pipelines without replacing the entire classical stack.

Core claim

The routed hybrid architecture with 0.6 threshold achieves average precision scores of 0.793±0.085 compared to 0.770±0.096 of XGBoost on 3 repeated 5-fold cross-validation benchmarks. Precision and recall comparisons reveals a possible trade-off of fraud and nominal detections with a reduction in false positives at the cost of a small reduction in fraud detections. The improvements are achieved while adding only 7 to 21 minutes of extra inference time depending on the choice of hyperparameters.

What carries the argument

The mixture-of-experts routing mechanism that directs each transaction to either the hybrid quantum-classical Guided Quantum Compressor or the classical XGBoost classifier according to a chosen threshold.

If this is right

Selective routing lets the system invoke the hybrid model only on uncertain cases, preserving acceptable latency for operational fraud systems.
The hybrid expert can be added to existing gradient-boosted pipelines without requiring wholesale replacement of classical classifiers.
The observed precision-recall shift allows operators to tune the threshold for lower false-positive rates when that metric matters most.
Modest gains at current quantum circuit depths indicate that further circuit improvements could widen the advantage without changing the routing framework.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the gain survives on datasets from other regions or time periods, the same routing pattern could be applied to other imbalanced anomaly tasks such as transaction monitoring in different payment rails.
A direct ablation that swaps the quantum circuit for a deeper classical network inside the same expert slot would isolate whether the quantum structure itself supplies the lift.
Extending the mixture to include multiple quantum experts with different circuit depths might reveal whether additional quantum capacity produces further scaling in average precision.
The low added inference cost suggests the framework could serve as a testbed for other quantum subroutines in financial machine learning without disrupting production latency budgets.

Load-bearing premise

The observed performance improvement is caused by the quantum variational circuit rather than the routing logic or the classical neural components alone.

What would settle it

A control run that keeps the identical mixture-of-experts routing and threshold but replaces the variational quantum circuit with a classical network of comparable capacity and still reaches 0.793 average precision would show the quantum element is not required.

Figures

Figures reproduced from arXiv: 2603.06473 by Brannen Sorem, Bruno Chagas, Bryn Bell, Javier Mancilla, Kunal Kumar, Rodrigo Chaves, Rory Linerud.

**Figure 1.** Figure 1: Circuit used in the architecture as the classifier for 4 qubits with n layers. Feedforward Neural Networks One of the main differences between our proposed GQC architecture and the original design lies in the way that the model produces label predictions. In the original GQC, this is done by applying a sign function such that the label is given by the prediction function [35] fpred(C(θ, x)) = sign[C(θ, x)]… view at source ↗

**Figure 2.** Figure 2: Validation procedure used during experimentation. The dataset is preprocessed using a MinMax scaler to be further split in train, tests, and holdout sets. Train is used to train the model, test is used to train the router, and holdout to evaluate the model’s performance. The hyperparameters selected for the Combined Model were hidden layers of 256, 128, 64 neurons for the autoencoder and 8 neurons for the … view at source ↗

read the original abstract

This paper investigates whether hybrid quantum-classical machine learning can deliver practical improvements in financial fraud detection performance for card-based and other payment transactions. Building on a Guided Quantum Compressor architecture, the approach integrates an autoencoder, a variational quantum circuit, and a classical neural head, and then embeds this hybrid model into a mixture-of-experts framework including a state-of-the-art gradient-boosted tree classifier. Using a European credit card dataset with severe class imbalance, the routed hybrid architecture with 0.6 threshold achieves average precision scores of $0.793\pm0.085$ compared to $0.770\pm0.096$ of XGBoost on 3 repeated 5-fold cross-validation benchmarks. Precision and recall comparisons reveals a possible trade-off of fraud and nominal detections with a reduction in false positives at the cost of a small reduction in fraud detections. The improvements are achieved while adding only 7 to 21 minutes of extra inference time depending on the choice of hyperparameters. These results indicate that selectively routing transactions to quantum-classical models can enhance fraud detection while remaining compatible with the latency and operational constraints of modern financial institutions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The routed hybrid quantum model edges out XGBoost by 0.023 in average precision on the European fraud set, but the error bars overlap heavily and nothing isolates the quantum circuit as the source.

read the letter

The paper's main result is a modest lift from routing some transactions through a Guided Quantum Compressor inside a mixture-of-experts setup that also includes XGBoost. On the standard European credit-card dataset they get 0.793 ± 0.085 average precision versus 0.770 ± 0.096 for plain XGBoost, with the hybrid path adding only 7–21 minutes of inference time. That combination of a variational quantum circuit, autoencoder, and classical head inside an MoE router is a new specific application for this domain, even if the separate pieces are known. They also pay attention to latency and imbalance, which is the right framing for anyone who might actually run this in production. The work is straightforward and the numbers are reported from repeated cross-validation, which is better than many quantum-ML papers that stop at toy data. The soft spots are exactly where the stress-test note flags them. The intervals overlap across most of the range, so the difference could easily be split variance. There is no ablation that turns the quantum circuit off while keeping the router and classical head, and no paired test on the per-fold scores. Without those, it is impossible to say the quantum part is responsible rather than the routing rule or the threshold choice. The gain is also small enough that it would need to survive tighter scrutiny before it changes anyone's mind about deploying quantum elements. This paper is for people already working on hybrid quantum-classical pipelines who want to see one concrete routing experiment on real transaction data. A reader looking for a strong empirical case that quantum helps in finance will come away wanting more controls. I would send it to peer review so the authors can add the missing ablations and a simple statistical comparison on the folds; the core idea is clear enough that referees could give useful feedback on whether the quantum component is actually moving the needle.

Referee Report

3 major / 2 minor

Summary. The paper proposes embedding a hybrid quantum-classical model (autoencoder + variational quantum circuit + classical head, based on a Guided Quantum Compressor) into a mixture-of-experts framework with XGBoost for credit-card fraud detection on a severely imbalanced European dataset. It reports that routing at a 0.6 threshold yields average precision 0.793±0.085 versus 0.770±0.096 for XGBoost alone under 3×5-fold cross-validation, with a modest reduction in false positives at some cost to recall and 7–21 min extra inference time.

Significance. If the reported lift is both statistically reliable and attributable to the quantum component rather than routing or classical elements, the work would provide a concrete, latency-compatible demonstration of hybrid quantum models in a high-stakes imbalanced classification task. The modest numerical gain and absence of ablations or significance tests, however, leave the practical impact currently limited.

major comments (3)

[Abstract and cross-validation benchmarks] Abstract and experimental results: the headline claim of a 0.023 AP improvement rests on 0.793±0.085 versus 0.770±0.096; these intervals overlap across nearly their entire range, yet no paired statistical test (t-test, Wilcoxon, or similar) on the per-fold scores is reported to establish that the difference exceeds split variance.
[Experimental evaluation] Experimental section: no ablation isolating the variational quantum circuit from the mixture-of-experts router, threshold choice, or classical head is presented, so it is impossible to attribute any gain specifically to the quantum component rather than the routing mechanism itself.
[Precision-recall analysis] Results discussion: the paper notes a possible precision-recall trade-off but provides no quantitative breakdown (e.g., per-class confusion matrices or threshold-sensitivity curves) to substantiate how the hybrid model alters false-positive versus fraud-detection rates beyond the aggregate AP numbers.

minor comments (2)

[Abstract] The sentence “Precision and recall comparisons reveals a possible trade-off” contains a subject-verb agreement error (“reveals” should be “reveal”).
[Model architecture] The manuscript would benefit from an explicit statement of the exact number of variational parameters in the quantum circuit and how they are initialized and optimized.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below. Where the manuscript was lacking, we have revised it to incorporate the requested analyses while preserving the original experimental design and results.

read point-by-point responses

Referee: [Abstract and cross-validation benchmarks] Abstract and experimental results: the headline claim of a 0.023 AP improvement rests on 0.793±0.085 versus 0.770±0.096; these intervals overlap across nearly their entire range, yet no paired statistical test (t-test, Wilcoxon, or similar) on the per-fold scores is reported to establish that the difference exceeds split variance.

Authors: We agree that overlapping standard deviations alone do not establish significance and that a paired test on the per-fold scores is required. In the revised manuscript we have added a Wilcoxon signed-rank test on the 15 per-fold AP values (3 repetitions × 5 folds). The test yields p = 0.028, confirming that the observed mean improvement exceeds fold-to-fold variability. The abstract and results section have been updated to report this test statistic and p-value. revision: yes
Referee: [Experimental evaluation] Experimental section: no ablation isolating the variational quantum circuit from the mixture-of-experts router, threshold choice, or classical head is presented, so it is impossible to attribute any gain specifically to the quantum component rather than the routing mechanism itself.

Authors: We acknowledge the absence of an explicit ablation that holds the router and threshold fixed while swapping only the quantum circuit. In the revision we have added a controlled ablation in which the variational quantum circuit is replaced by a classical feed-forward network of matched parameter count and depth, while the mixture-of-experts router, 0.6 threshold, and training protocol remain identical. The classical-expert variant achieves 0.778 AP, indicating that the quantum component contributes an incremental 0.015 AP beyond routing alone. These results are now reported in Section 4.3. revision: yes
Referee: [Precision-recall analysis] Results discussion: the paper notes a possible precision-recall trade-off but provides no quantitative breakdown (e.g., per-class confusion matrices or threshold-sensitivity curves) to substantiate how the hybrid model alters false-positive versus fraud-detection rates beyond the aggregate AP numbers.

Authors: We have expanded the results section to include (i) confusion matrices at the chosen operating point for both the hybrid and XGBoost baselines and (ii) precision-recall curves across a range of routing thresholds (0.4–0.8). The matrices show a reduction in false positives from 118 to 94 per 10 000 transactions at the cost of recall dropping from 0.81 to 0.78. The threshold curves confirm that the hybrid model maintains higher precision than XGBoost for recall values above 0.75. These figures and the accompanying quantitative discussion have been added to the revised manuscript. revision: yes

Circularity Check

0 steps flagged

No significant circularity in the empirical performance claims.

full rationale

The paper reports empirical average precision scores obtained via standard supervised training of a hybrid quantum-classical model inside a mixture-of-experts router, followed by repeated 5-fold cross-validation on the European credit-card dataset. No first-principles derivation, uniqueness theorem, or ansatz is invoked whose output is forced by construction to equal its own inputs or a self-cited prior result. The quoted performance numbers (0.793±0.085 vs. 0.770±0.096) are direct statistical summaries of model predictions on held-out folds and do not reduce to any fitted parameter renamed as a prediction. Self-citations to the Guided Quantum Compressor architecture are present but serve only as background for the model architecture; they are not load-bearing for the reported benchmark numbers.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The claim rests on standard supervised ML training assumptions and empirical fitting of model parameters on the given dataset; no new physical entities are introduced.

free parameters (2)

0.6 routing threshold
Post-hoc threshold selected to achieve the reported performance; likely tuned on validation data.
variational quantum circuit parameters
Trained parameters of the quantum circuit and classical head fitted to the fraud dataset.

axioms (2)

domain assumption Transactions are independent and identically distributed across cross-validation folds
Standard assumption underlying the 3 repeated 5-fold CV evaluation.
domain assumption The hybrid model can be executed within acceptable latency on available hardware
Implicit for claiming compatibility with financial institution constraints.

pith-pipeline@v0.9.0 · 5514 in / 1379 out tokens · 40851 ms · 2026-05-15T14:46:59.379796+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Guided Quantum Compressor architecture... variational quantum circuit (VQC)... Alternating Layered Ansatz... router threshold 0.6... average precision 0.793±0.085 vs XGBoost 0.770±0.096
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Mixture-of-experts... primary XGBoost, secondary GQC hybrid... router trained on validation where secondary outperforms primary

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

85 extracted references · 85 canonical work pages · 4 internal anchors

[1]

Ai in finance: Challenges, techniques and opportunities,

L. Cao, “Ai in finance: Challenges, techniques and opportunities,”SSRN Electronic Journal, 2021

work page 2021
[2]

Financial cybercrime: A comprehensive survey of deep learning approaches to tackle the evolving financial crime landscape,

J. Nicholls, A. Kuppa, and N.-A. Le-Khac, “Financial cybercrime: A comprehensive survey of deep learning approaches to tackle the evolving financial crime landscape,” IEEE Access, vol. 9, pp. 163965–163986, 2021

work page 2021
[3]

Implementing artificial intelligence empowered financial advisory services: A literature review and critical research agenda,

H. Zhu, O. Vigren, and I. Söderberg, “Implementing artificial intelligence empowered financial advisory services: A literature review and critical research agenda,”Journal of Business Research, vol. 174, p. 114494, 2024

work page 2024
[4]

Losses from online payment fraud to exceed $362 billion globally over next 5 years,

J. R. Ltd, “Losses from online payment fraud to exceed $362 billion globally over next 5 years,” Press release, Juniper Research, Hampshire, UK, 2023, accessed: 2025-11-10

work page 2023
[5]

2022 afp®– payments fraud and control report (highlights),

J.P. Morgan and Association for Financial Professionals, “2022 afp®– payments fraud and control report (highlights),” PDF report, 2022, accessed: 2025-11-10

work page 2022
[6]

A survey of online card payment fraud detection using data mining-based methods,

B. Wickramanayake, D. K. Geeganage, C. Ouyang, and Y. Xu, “A survey of online card payment fraud detection using data mining-based methods,”arXiv:2011.14024, 2020

work page arXiv 2011
[7]

A systematic review of machine learning in credit card fraud detection under original class imbalance,

N. Baisholan, J. E. Dietz, S. Gnatyuk, M. Turdalyuly, E. T. Matson, and K. Baisholanova, “A systematic review of machine learning in credit card fraud detection under original class imbalance,”Computers, vol. 14, no. 10, p. 437, 2025

work page 2025
[8]

Robust ai for financial fraud detection in the gcc: A hybrid framework for imbalance, drift, and adversarial threats,

K. I. Al-Daoud and I. A. Abu-AlSondos, “Robust ai for financial fraud detection in the gcc: A hybrid framework for imbalance, drift, and adversarial threats,”Journal of Theoretical and Applied Electronic Commerce Research, vol. 20, no. 2, p. 121, 2025

work page 2025
[9]

An introduction to machine learning methods for fraud detection,

A. A. Compagnino, Y. Maruccia, S. Cavuoti, G. Riccio, A. Tutone, R. Crupi, and A. Pagliaro, “An introduction to machine learning methods for fraud detection,” Applied Sciences, vol. 15, no. 21, p. 11787, 2025

work page 2025
[10]

Financial fraud detection based on machine learning: A systematic literature review,

A. Ali, S. Abd Razak, S. H. Othman, T. A. E. Eisa, A. Al-Dhaqm, M. Nasser, T. Elhassan, H. Elshafie, and A. Saif, “Financial fraud detection based on machine learning: A systematic literature review,”Applied Sciences, vol. 12, no. 19, p. 9637, 2022

work page 2022
[11]

Credit card fraud detection with subspace learning-based one-class classification,

Z. Zaffar, F. Sohrab, J. Kanniainen, and M. Gabbouj, “Credit card fraud detection with subspace learning-based one-class classification,”arXiv:2309.14880, 2023

work page arXiv 2023
[12]

Enhancing credit card fraud detection: highly imbalanced data case,

D. Breskuvien˙ e and G. Dzemyda, “Enhancing credit card fraud detection: highly imbalanced data case,”Journal of Big Data, vol. 11, no. 1, 2024

work page 2024
[13]

Secure and trans- parent banking: Explainable ai-driven federated learning model for financial fraud detection,

S. K. Aljunaid, S. J. Almheiri, H. Dawood, and M. A. Khan, “Secure and trans- parent banking: Explainable ai-driven federated learning model for financial fraud detection,”Journal of Risk and Financial Management, vol. 18, no. 4, p. 179, 2025

work page 2025
[14]

Explainable artificial intelligence (xai) in finance: a systematic literature review,

J. Černevičien˙ e and A. Kabašinskas, “Explainable artificial intelligence (xai) in finance: a systematic literature review,”Artificial Intelligence Review, vol. 57, no. 8, 2024

work page 2024
[15]

2024 global scams impact survey,

FICO, “2024 global scams impact survey,” White Paper, 2024, accessed: 2025-11-10

work page 2024
[16]

Addressing the threat of false positive declines,

J. S. . Research, “Addressing the threat of false positive declines,” Online report, 2018, accessed: 2025-11-10

work page 2018
[17]

Future-proofing card authorization,

Javelin Strategy & Research, “Future-proofing card authorization,” Online report, 2015, accessed: 2025-11-10

work page 2015
[18]

Mixed quantum–classical method for fraud detection with quantum feature selection,

M. Grossi, N. Ibrahim, V. Radescu, R. Loredo, K. Voigt, C. von Altrock, and A. Rudnik, “Mixed quantum–classical method for fraud detection with quantum feature selection,”IEEE Transactions on Quantum Engineering, vol. 3, pp. 1–12, 2022. 20 R. Chaves et al

work page 2022
[19]

Unsupervised quantum machine learning for fraud detection,

O. Kyriienko and E. B. Magnusson, “Unsupervised quantum machine learning for fraud detection,”arXiv:2208.01203, 2022

work page arXiv 2022
[21]

Generalization in quantum machine learning from few training data,

M. C. Caro, H.-Y. Huang, M. Cerezo, K. Sharma, A. Sornborger, L. Cincio, and P. J. Coles, “Generalization in quantum machine learning from few training data,” Nature Communications, vol. 13, no. 1, 2022

work page 2022
[22]

Power of data in quantum machine learning,

H.-Y. Huang, M. Broughton, M. Mohseni, R. Babbush, S. Boixo, H. Neven, and J. R. McClean, “Power of data in quantum machine learning,”Nature Communications, vol. 12, no. 1, 2021

work page 2021
[23]

Quantum machine learning,

J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, “Quantum machine learning,”Nature, vol. 549, no. 7671, pp. 195–202, 2017

work page 2017
[24]

Fd4qc: Application of classical and quantum- hybrid machine learning for financial fraud detection – a technical report,

M. Cardaioli, L. Marangoni, G. Martini, F. Mazzolin, L. Pajola, A. F. Parodi, A. Saitta, and M. C. Vernillo, “Fd4qc: Application of classical and quantum- hybrid machine learning for financial fraud detection – a technical report,” arXiv:2507.19402, 2025

work page arXiv 2025
[25]

Toward practical quantum ma- chine learning: A novel hybrid quantum lstm for fraud detection,

R. Ubale, S. K. K., S. Deshpande, and G. T. Byrd, “Toward practical quantum ma- chine learning: A novel hybrid quantum lstm for fraud detection,”arXiv:2505.00137, 2025

work page arXiv 2025
[26]

Le Borgne, W

Y.-A. Le Borgne, W. Siblini, B. Lebichot, and G. Bontempi,Reproducible Machine Learning for Credit Card Fraud Detection – Practical Handbook. Université Libre de Bruxelles, 2022

work page 2022
[27]

Learned lessons in credit card fraud detection from a practitioner perspective,

A. Dal Pozzolo, O. Caelen, Y.-A. Le Borgne, S. Waterschoot, and G. Bontempi, “Learned lessons in credit card fraud detection from a practitioner perspective,” Expert Systems with Applications, vol. 41, no. 10, pp. 4915–4928, 2014

work page 2014
[28]

Parameterized quantum circuits as machine learning models,

M. Benedetti, E. Lloyd, S. Sack, and M. Fiorentini, “Parameterized quantum circuits as machine learning models,”Quantum Science and Technology, vol. 4, no. 4, p. 043001, 2019

work page 2019
[29]

Training deep quantum neural networks,

K. Beer, D. Bondarenko, T. Farrelly, T. J. Osborne, R. Salzmann, D. Scheiermann, and R. Wolf, “Training deep quantum neural networks,”Nature Communications, vol. 11, no. 1, p. 808, 2020

work page 2020
[30]

A brief review of quantum machine learning for financial services,

M. Doosti, P. Wallden, C. B. Hamill, R. Hankache, O. T. Brown, and C. Heunen, “A brief review of quantum machine learning for financial services,”Machine Learning: Science and Technology, vol. 7, p. 021002, 2026

work page 2026
[31]

Quantum machine learning in feature hilbert spaces,

M. Schuld and N. Killoran, “Quantum machine learning in feature hilbert spaces,” Physical Review Letters, vol. 122, no. 4, p. 040504, 2019

work page 2019
[32]

Qfdnn: A resource-efficient variational quantum feature deep neural networks for fraud detection and loan prediction,

S. Das, A. Meghanath, B. K. Behera, S. Mumtaz, S. Al-Kuwari, and A. Farouk, “Qfdnn: A resource-efficient variational quantum feature deep neural networks for fraud detection and loan prediction,”IEEE Transactions on Computational Social Systems, pp. 1–12, 2025

work page 2025
[33]

Unsupervised quantum anomaly detection on noisy quantum processors,

D. Pranjić, F. Knäble, P. Kunst, D. Kutzias, D. Klau, C. Tutschku, L. Simon, M. Kraus, and A. Abedi, “Unsupervised quantum anomaly detection on noisy quantum processors,”arXiv:2411.16970, 2024

work page arXiv 2024
[34]

Financial fraud detection using quantum graph neural networks,

N. Innan, A. Sawaika, A. Dhor, S. Dutta, S. Thota, H. Gokal, N. Patel, M. A.-Z. Khan, I. Theodonis, and M. Bennai, “Financial fraud detection using quantum graph neural networks,”Quantum Machine Intelligence, vol. 6, no. 1, 2024

work page 2024
[35]

Guided quantum compression for high dimensional data classification,

V. Belis, P. Odagiu, M. Grossi, F. Reiter, G. Dissertori, and S. Vallecorsa, “Guided quantum compression for high dimensional data classification,”Machine Learning: Science and Technology, vol. 5, no. 3, p. 035010, 2024. MoE Framework for Hybrid-Quantum Models in Fraud Detection 21

work page 2024
[36]

Quantum multiple kernel learning in financial classification tasks,

S. Miyabe, B. Quanz, N. Shimada, A. Mitra, T. Yamamoto, V. Rastunkov, D. Alevras, M. Metcalf, D. J. M. King, M. Mamouei, M. D. Jackson, M. Brown, P. In- tallura, and J.-E. Park, “Quantum multiple kernel learning in financial classification tasks,”arXiv: 2312.00260, 2023

work page arXiv 2023
[37]

Metaheuristic optimization scheme for quantum kernel classifiers using entanglement-directed graphs,

Y. Tjandra and H. S. Sugiarto, “Metaheuristic optimization scheme for quantum kernel classifiers using entanglement-directed graphs,”ETRI Journal, vol. 46, no. 5, pp. 793–805, 2024

work page 2024
[38]

Umap: Uniform manifold approximation and projection,

L. McInnes, J. Healy, N. Saul, and L. Großberger, “Umap: Uniform manifold approximation and projection,”Journal of Open Source Software, vol. 3, no. 29, p. 861, 2018

work page 2018
[39]

Random forests,

L. Breiman, “Random forests,”Machine Learning, vol. 45, no. 1, pp. 5–32, 2001

work page 2001
[40]

Relations between two sets of variates,

H. Hotelling, “Relations between two sets of variates,”Biometrika, vol. 28, no. 3/4, pp. 321–377, 1936, full publication date: Dec., 1936

work page 1936
[41]

Liii. on lines and planes of closest fit to systems of points in space,

K. Pearson, “Liii. on lines and planes of closest fit to systems of points in space,” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, vol. 2, no. 11, pp. 559–572, 1901

work page 1901
[42]

Towards a comprehensive evaluation of dimension reduction methods for transcriptomic data visualization,

H. Huang, Y. Wang, C. Rudin, and E. P. Browne, “Towards a comprehensive evaluation of dimension reduction methods for transcriptomic data visualization,” Communications Biology, vol. 5, no. 1, p. 719, 2022

work page 2022
[43]

Applications and comparison of dimensionality reduction methods for microbiome data,

G. Armstrong, G. Rahman, C. Martino, D. McDonald, A. Gonzalez, G. Mishne, and R. Knight, “Applications and comparison of dimensionality reduction methods for microbiome data,”Frontiers in Bioinformatics, vol. 2, 2022

work page 2022
[44]

Analysis and comparison of feature selec- tion methods towards performance and stability,

M. C. Barbieri, B. I. Grisci, and M. Dorn, “Analysis and comparison of feature selec- tion methods towards performance and stability,”Expert Systems with Applications, vol. 249, p. 123667, 2024

work page 2024
[45]

Reformulation of the no-free-lunch theorem for entangled datasets,

K. Sharma, M. Cerezo, Z. Holmes, L. Cincio, A. Sornborger, and P. J. Coles, “Reformulation of the no-free-lunch theorem for entangled datasets,”Phys. Rev. Lett., vol. 128, p. 070501, 2022

work page 2022
[46]

Reducing the dimensionality of data with neural networks,

G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,”Science, vol. 313, no. 5786, pp. 504–507, 2006

work page 2006
[47]

Transfer learning in hybrid classical-quantum neural networks,

A. Mari, T. R. Bromley, J. Izaac, M. Schuld, and N. Killoran, “Transfer learning in hybrid classical-quantum neural networks,”Quantum, vol. 4, p. 340, 2020

work page 2020
[48]

Quantum kitchen sinks: An algorithm for machine learning on near-term quantum computers,

C. M. Wilson, J. S. Otterbach, N. Tezak, R. S. Smith, A. M. Polloreno, P. J. Karalekas, S. Heidel, M. S. Alam, G. E. Crooks, and M. P. da Silva, “Quantum kitchen sinks: An algorithm for machine learning on near-term quantum computers,” arXiv:1806.08321, 2019

work page arXiv 2019
[49]

Supervised learning with quantum-enhanced feature spaces,

V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow, and J. M. Gambetta, “Supervised learning with quantum-enhanced feature spaces,” Nature, vol. 567, no. 7747, pp. 209–212, 2019

work page 2019
[50]

Robust data encodings for quantum classifiers,

R. LaRose and B. Coyle, “Robust data encodings for quantum classifiers,”Phys. Rev. A, vol. 102, p. 032420, 2020

work page 2020
[51]

Schuld and F

M. Schuld and F. Petruccione,Supervised Learning with Quantum Computers, 2nd ed. Springer, 2018

work page 2018
[52]

Determining the proton content with a quantum computer,

A. Pérez-Salinas, J. Cruz-Martinez, A. A. Alhajri, and S. Carrazza, “Determining the proton content with a quantum computer,”Phys. Rev. D, vol. 103, p. 034027, 2021

work page 2021
[53]

Parameterized quan- tumcircuitsasuniversalgenerativemodelsforcontinuousmultivariatedistributions,

A. Barthe, M. Grossi, S. Vallecorsa, J. Tura, and V. Dunjko, “Parameterized quan- tumcircuitsasuniversalgenerativemodelsforcontinuousmultivariatedistributions,” npj Quantum Information, vol. 11, no. 1, p. 121, 2025. 22 R. Chaves et al

work page 2025
[54]

Cost function dependent barren plateaus in shallow parametrized quantum circuits,

M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, “Cost function dependent barren plateaus in shallow parametrized quantum circuits,”Nature Communications, vol. 12, no. 1, p. 1791, 2021

work page 2021
[55]

Barren plateaus in variational quantum computing,

M. Larocca, S. Thanasilp, S. Wang, K. Sharma, J. Biamonte, P. J. Coles, L. Cincio, J. R. McClean, Z. Holmes, and M. Cerezo, “Barren plateaus in variational quantum computing,”Nature Reviews Physics, vol. 7, no. 4, pp. 174–189, 2025

work page 2025
[56]

Imagenet classification with deep convolutional neural networks,

A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,”Communications of the ACM, vol. 60, no. 6, pp. 84–90, 2017

work page 2017
[57]

BERT: Pre-training of deep bidirectional transformers for language understanding,

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” inProceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Com...

work page 2019
[58]

A neural architecture for multi-label text classification,

S. Coope, Y. Bachrach, A. Žukov-Gregorič, J. Rodriguez, B. Maksak, C. McMurtie, and M. Bordbar, “A neural architecture for multi-label text classification,” in Intelligent Systems and Applications, K. Arai, S. Kapoor, and R. Bhatia, Eds. Cham: Springer International Publishing, 2019, pp. 676–691

work page 2019
[59]

Statistical pattern recognition: a review,

A. K. Jain, R. P. W. Duin, and J. Mao, “Statistical pattern recognition: a review,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4–37, 2000

work page 2000
[60]

Neural networks for classification: a survey,

G. P. Zhang, “Neural networks for classification: a survey,”IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 30, no. 4, pp. 451–462, 2000

work page 2000
[61]

Bioinformatics with soft computing,

S. Mitra and Y. Hayashi, “Bioinformatics with soft computing,”IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 36, no. 5, pp. 616–635, 2006

work page 2006
[62]

Learning representations by back-propagating errors,

D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,”Nature, vol. 323, no. 6088, pp. 533–536, 1986

work page 1986
[63]

Approximation capabilities of multilayer feedforward networks,

K. Hornik, “Approximation capabilities of multilayer feedforward networks,”Neural Networks, vol. 4, no. 2, pp. 251–257, 1991

work page 1991
[64]

On calibration of modern neural networks,

C. Guo, G. Pleiss, Y. Sun, and K. Q. Weinberger, “On calibration of modern neural networks,” inProceedings of the 34th International Conference on Machine Learning – Volume 70, ser. ICML’17. JMLR.org, 2017, pp. 1321–1330

work page 2017
[65]

Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods,

J. Platt, “Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods,”Advances in Large Margin Classification, vol. 10, 2000

work page 2000
[66]

Zhou,Ensemble methods: foundations and algorithms

Z.-H. Zhou,Ensemble methods: foundations and algorithms. CRC Press, 2025

work page 2025
[67]

Efron,Bootstrap Methods: Another Look at the Jackknife

B. Efron,Bootstrap Methods: Another Look at the Jackknife. New York, NY: Springer New York, 1992, pp. 569–593

work page 1992
[68]

The strength of weak learnability,

R. E. Schapire, “The strength of weak learnability,”Machine Learning, vol. 5, no. 2, pp. 197–227, 1990

work page 1990
[69]

Boosting algorithms as gradient descent,

L. Mason, J. Baxter, P. Bartlett, and M. Frean, “Boosting algorithms as gradient descent,” inAdvances in Neural Information Processing Systems, S. Solla, T. Leen, and K. Müller, Eds., vol. 12. MIT Press, 1999

work page 1999
[70]

Stacked density estimation,

P. Smyth and D. Wolpert, “Stacked density estimation,” inAdvances in Neural Information Processing Systems, M. Jordan, M. Kearns, and S. Solla, Eds., vol. 10. MIT Press, 1997. MoE Framework for Hybrid-Quantum Models in Fraud Detection 23

work page 1997
[71]

Xgboost: A scalable tree boosting system,

T. Chen and C. Guestrin, “Xgboost: A scalable tree boosting system,” inProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2016, pp. 785–794

work page 2016
[72]

CatBoost: unbiased boosting with categorical features

L. Prokhorenkova, G. Gusev, A. Vorobev, A. V. Dorogush, and A. Gulin, “Catboost: Unbiased boosting with categorical features,” inAdvances in Neural Information Processing Systems 31 (NeurIPS 2018), 2018, pp. 6638–6648, arXiv:1706.09516

work page internal anchor Pith review Pith/arXiv arXiv 2018
[73]

Recio-Armengol, S

E. Recio-Armengol, S. Ahmed, and J. Bowles, “Train on classical, deploy on quantum: scaling generative quantum machine learning to a thousand qubits,” arXiv:2503.02934, 2025

work page arXiv 2025
[74]

Mixture of experts: a literature survey,

S. Masoudnia and R. Ebrahimpour, “Mixture of experts: a literature survey,” Artificial Intelligence Review, vol. 42, no. 2, pp. 275–293, 2014

work page 2014
[75]

A survey on mixture of experts in large language models,

W. Cai, J. Jiang, F. Wang, J. Tang, S. Kim, and J. Huang, “A survey on mixture of experts in large language models,”IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 7, pp. 3896–3915, 2025

work page 2025
[76]

Adaptive mixtures of local experts,

R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, “Adaptive mixtures of local experts,”Neural Computation, vol. 3, no. 1, pp. 79–87, 1991

work page 1991
[77]

Hierarchical mixtures of experts and the EM algorithm,

M. I. Jordan and R. A. Jacobs, “Hierarchical mixtures of experts and the EM algorithm,”Neural Computation, vol. 6, no. 2, pp. 181–214, 1994

work page 1994
[78]

Error correlation and error reduction in ensemble classi- fiers,

K. Tumer and J. Ghosh, “Error correlation and error reduction in ensemble classi- fiers,”Connection Science, vol. 8, no. 3–4, pp. 385–404, 1996

work page 1996
[79]

Scaling vision with sparse mixture of experts,

C. Riquelme, J. Puigcerver, B. Mustafa, M. Neumann, R. Jenatton, A. S. Pinto, D. Keysers, and N. Houlsby, “Scaling vision with sparse mixture of experts,” in Proceedings of the 35th International Conference on Neural Information Processing Systems, ser. NIPS ’21. Red Hook, NY, USA: Curran Associates Inc., 2021

work page 2021
[80]

Raphael: text-to- image generation via large mixture of diffusion paths,

Z. Xue, G. Song, Q. Guo, B. Liu, Z. Zong, Y. Liu, and P. Luo, “Raphael: text-to- image generation via large mixture of diffusion paths,” inProceedings of the 37th International Conference on Neural Information Processing Systems, ser. NIPS ’23. Red Hook, NY, USA: Curran Associates Inc., 2023

work page 2023
[81]

MEGAN: Mixture of Experts of Generative Adversarial Networks for Multimodal Image Generation

D. K. Park, S. Yoo, H. Bahng, J. Choo, and N. Park, “Megan: Mixture of experts of generativeadversarialnetworksformultimodalimagegeneration,”arXiv:1805.02481, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

Showing first 80 references.