Asymmetry PRISM: A CPU/GPU Portfolio Optimization Engine for Deadline-Bounded Institutional Rebalancing
Pith reviewed 2026-06-26 05:48 UTC · model grok-4.3
The pith
Asymmetry PRISM completes 500 institutional rebalances over a 10,000-instrument universe in 109.5 seconds on GPU while meeting a 25-minute deadline.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Asymmetry PRISM is a portfolio optimization engine with CPU and GPU routes, and the paper makes three claims for it. On completed multi-solver rows from N=100 to N=2,000, the CPU route is 4.5x to 24.1x faster than the fastest completed reference row in the same lane. On a production queue of 500 accounts over a 10,000-instrument universe, the GPU route finishes all 500 solves in 109.5 seconds, inside a declared 25-minute operating window and with an audit record for every solve, while the recorded OSQP baseline finishes only 4 of 500. On an operationally constrained real-data suite, the engine clears constrained solves 3.4x to 126.7x faster than the best completing incumbent at certified-equal objectives, and the GPU route's advantage over the CPU route widens to 8.8x at N=384,800.
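As a quick sanity check on the production-queue figures, the arithmetic below works out the average time per account and the share of the operating window consumed. It uses only numbers quoted in the claim; the variable names are illustrative, and the per-account value is a batch average, not a measured single-solve latency.

```python
# Figures quoted in the core claim (production queue study).
accounts = 500
gpu_total_s = 109.5
window_s = 25 * 60  # declared 25-minute operating window, in seconds

avg_s_per_account = gpu_total_s / accounts   # batch average, not a per-solve latency
window_fraction = gpu_total_s / window_s     # share of the window consumed
baseline_completion = 4 / accounts           # recorded OSQP queue baseline, 4 of 500

print(f"{avg_s_per_account:.3f} s per account on average")
print(f"{window_fraction:.1%} of the window used")
print(f"{baseline_completion:.1%} of accounts completed by the baseline")
```

This works out to about 0.219 s per account, roughly 7.3% of the window, and a 0.8% completion rate for the baseline.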
What carries the argument
Asymmetry PRISM, a CPU/GPU portfolio optimization engine that ingests problem data and returns weights, status codes, timings, memory class, feasibility diagnostics, and audit records for batched institutional rebalancing.
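One way to picture the per-solve output described above is as a single immutable record. The sketch below is an assumption made for illustration: the class name, field names, status values, and the fully invested budget check (weights summing to 1) are hypothetical and are not taken from the paper or from any PRISM interface.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Hypothetical status codes; the paper's actual codes are not listed here.
    SOLVED = "solved"
    INFEASIBLE = "infeasible"
    TIMED_OUT = "timed_out"
    FAILED = "failed"


@dataclass(frozen=True)
class SolveRecord:
    """Hypothetical per-solve audit record mirroring the outputs named above."""
    account_id: str
    status: Status
    weights: tuple[float, ...]        # returned portfolio weights
    solve_seconds: float              # wall-clock timing for this solve
    memory_class: str                 # coarse memory bucket, e.g. "small"
    max_constraint_violation: float   # external feasibility diagnostic

    def budget_residual(self) -> float:
        """Deviation from a fully invested budget (assumes weights sum to 1)."""
        return abs(sum(self.weights) - 1.0)

    def is_clean(self, tol: float = 1e-8) -> bool:
        """True when the solve succeeded and passes both feasibility checks."""
        return (
            self.status is Status.SOLVED
            and self.budget_residual() <= tol
            and self.max_constraint_violation <= tol
        )
```

Freezing the record is a deliberate choice in this sketch: an audit trail is easier to trust when entries cannot be mutated after the solve returns.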
If this is right
- Institutions can process hundreds of accounts with full constraint sets inside fixed operating windows without missed deadlines.
- The GPU route supplies an 8.8x additional speedup over the CPU route at the largest tested scale of N=384,800.
- Every solve produces a complete audit record of feasibility, timing, memory, and failure status.
- Speedups of 3.4x to 126.7x over the best completing incumbent are achieved at certified-equal objective values on tax-motivated transition penalties and restriction caps.
Where Pith is reading between the lines
- The design could support more frequent rebalancing cycles without lengthening the operating window.
- The same engine structure might transfer to other finance workloads that combine batch quadratic programs with hard deadlines, such as intraday risk hedging.
- At still larger account counts the memory-class and parallel scaling behavior reported for N=384,800 would determine the practical ceiling.
Load-bearing premise
The reference solvers including OSQP represent the relevant state-of-the-art baselines and the chosen problem instances with tax penalties, turnover controls, and restriction caps are representative of real institutional workloads.
What would settle it
Running the same 500-account production queue with the 10,000-instrument universe on identical hardware and observing whether any standard solver such as OSQP completes more than 4 accounts inside the 25-minute window.
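A replication of this test needs a harness that counts only solves finishing inside the window. The following is a minimal sketch under stated assumptions: `run_queue`, its parameters, and the rule that a solve ending after the deadline does not count are hypothetical choices for illustration, not the paper's protocol. The `solve` callable stands in for any solver wrapper that returns `True` on success.

```python
import time
from typing import Callable, Iterable


def run_queue(
    accounts: Iterable[str],
    solve: Callable[[str], bool],
    deadline_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Solve accounts in order and count those completed inside the window."""
    accounts = list(accounts)
    start = clock()
    completed = 0
    for account in accounts:
        if clock() - start >= deadline_s:
            break  # window closed; remaining accounts are not attempted
        solved = solve(account)
        # Assumption: a solve that ends after the deadline does not count.
        if solved and clock() - start <= deadline_s:
            completed += 1
    return {
        "total": len(accounts),
        "completed": completed,
        "not_completed": len(accounts) - completed,
        "elapsed_s": clock() - start,
    }
```

The clock is injectable so the harness can be checked with a simulated clock before it is pointed at real solvers, where `deadline_s` would be 1500 for the 25-minute window.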
Original abstract
Institutional rebalancing is a batched optimization workload with a hard operating deadline: hundreds of accounts need new weights under budget, turnover, exposure, exclusion, and tax-aware controls before trading can proceed. This paper evaluates Asymmetry PRISM, a CPU/GPU portfolio optimization engine, through a public evaluation boundary; problem data in, and returned weights, status codes, timings, memory class, external feasibility diagnostics, eligible objective comparisons, and audit records out. Within that boundary, the evaluation protocol fixes hardware and software versions, declares timing lanes, separates cold single calls from repeated workloads, and admits objective-gap claims only where an eligible reference solver completed. On completed multi-solver rows from N=100 to N=2,000, Asymmetry PRISM-CPU is 4.5x to 24.1x faster than the fastest completed reference row in the same lane. In the production queue study, Asymmetry PRISM-GPU completes 500/500 accounts over a 10,000-instrument universe in 109.5 s within a declared 25-minute operating window, with zero missed deadlines and an audit record for every solve; the recorded OSQP queue baseline completes 4/500. On an operationally constrained real-data suite (tax-motivated transition penalties, restriction caps, turnover controls, batches), Asymmetry PRISM clears constrained solves 3.4x to 126.7x faster than the best completing incumbent at certified-equal objectives, and the GPU route widens to 8.8x over the CPU route at N=384,800. Rows without a completed reference are reported as feasibility, timing, memory, and failure-status evidence.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents Asymmetry PRISM, a CPU/GPU portfolio optimization engine for institutional rebalancing under hard deadlines and constraints including tax penalties, turnover controls, exposure limits, and restrictions. It reports empirical timing and completion results on batched solves for N=100 to N=2,000 instruments and on a production queue of 500 accounts over a 10,000-instrument universe. The claims are 4.5x–24.1x speedups versus the fastest completed reference solver (including OSQP) on multi-solver rows; 500/500 completions in 109.5 s (versus 4/500 for the OSQP baseline) within a 25-minute window; and clears up to 126.7x faster at certified-equal objectives. All results are conditioned on a declared public evaluation boundary that admits objective comparisons only where a reference solver completed.
Significance. If the evaluation protocol, baseline configurations, and instance representativeness hold, the work would demonstrate practical feasibility for deadline-bounded, tax-aware rebalancing at institutional scale on commodity CPU/GPU hardware, with the emphasis on audit records, feasibility diagnostics, and separate reporting of non-completed rows providing a useful template for reproducible systems evaluation in computational finance.
major comments (3)
- [Abstract] Abstract: The headline claims of 4.5x–24.1x speedups and 500/500 vs. 4/500 completions are restricted to “completed multi-solver rows” and “eligible reference solver completed,” yet the manuscript provides no count or characterization of excluded rows, nor any analysis of whether those rows correspond to the operationally hardest instances; this selection criterion is load-bearing for the generalization of the performance advantage.
- [Abstract] Abstract (production queue study): The comparison of Asymmetry PRISM-GPU completing all 500 accounts versus the OSQP queue baseline completing only 4/500 lacks any description of the reference solver’s configuration (tolerances, iteration limits, warm-starting), tuning effort, or resource allocation, making it impossible to verify that the baseline represents a fair or state-of-the-art comparator for the tax-penalty/turnover/restriction instances.
- [Abstract] Abstract: The evaluation protocol is described only at the level of “fixes hardware and software versions, declares timing lanes, separates cold single calls,” with no methods section, data-generation procedure, or verification details supplied; this absence directly undermines the soundness of all reported timings, memory classes, and “certified-equal objectives” assertions.
minor comments (1)
- The abstract is information-dense; separating the CPU versus GPU results and the single-call versus repeated-workload lanes into distinct sentences would improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on the abstract and evaluation protocol. We address each major comment below and will revise the manuscript accordingly to improve transparency.
- Referee: [Abstract] Abstract: The headline claims of 4.5x–24.1x speedups and 500/500 vs. 4/500 completions are restricted to “completed multi-solver rows” and “eligible reference solver completed,” yet the manuscript provides no count or characterization of excluded rows, nor any analysis of whether those rows correspond to the operationally hardest instances; this selection criterion is load-bearing for the generalization of the performance advantage.
  Authors: The manuscript conditions all speedup and completion claims on completed multi-solver rows and separately reports non-completed rows with feasibility, timing, memory, and failure-status evidence. We agree the abstract would be strengthened by explicit counts and characterization of excluded rows. We will revise the abstract and add a table summarizing the fraction and properties of completed versus non-completed instances to allow readers to evaluate selection effects.
  revision: yes
- Referee: [Abstract] Abstract (production queue study): The comparison of Asymmetry PRISM-GPU completing all 500 accounts versus the OSQP queue baseline completing only 4/500 lacks any description of the reference solver’s configuration (tolerances, iteration limits, warm-starting), tuning effort, or resource allocation, making it impossible to verify that the baseline represents a fair or state-of-the-art comparator for the tax-penalty/turnover/restriction instances.
  Authors: The abstract omits these configuration details. While the evaluation protocol section of the manuscript specifies reference solver settings, we will expand the abstract with a concise description of the OSQP configuration parameters (tolerances, iteration limits, warm-starting), tuning effort, and resource allocation to enable independent verification of the baseline.
  revision: yes
- Referee: [Abstract] Abstract: The evaluation protocol is described only at the level of “fixes hardware and software versions, declares timing lanes, separates cold single calls,” with no methods section, data-generation procedure, or verification details supplied; this absence directly undermines the soundness of all reported timings, memory classes, and “certified-equal objectives” assertions.
  Authors: The abstract summarizes the protocol at a high level. We agree a dedicated methods description is required. We will add a Methods section detailing the data-generation procedure, verification steps, hardware/software versions, timing lane definitions, and criteria for certified-equal objectives to support the reported timings and claims.
  revision: yes
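The summary table promised in the first response could be derived along the following lines. This is a sketch only: the `Row` type, the `summarize` helper, and the three example rows are hypothetical, and the example values are made-up placeholders, not results from the paper. It shows how a speedup range restricted to completed rows can be reported next to the count and size of the rows it excludes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    n: int                        # problem size (number of instruments)
    engine_s: float               # engine wall-clock time, seconds
    reference_s: Optional[float]  # fastest reference time; None if none completed


def summarize(rows: list[Row]) -> dict:
    """Report speedups on completed rows alongside what was excluded."""
    completed = [r for r in rows if r.reference_s is not None]
    excluded = [r for r in rows if r.reference_s is None]
    speedups = [r.reference_s / r.engine_s for r in completed]
    return {
        "completed_rows": len(completed),
        "excluded_rows": len(excluded),
        "excluded_fraction": len(excluded) / len(rows) if rows else 0.0,
        "min_speedup": min(speedups) if speedups else None,
        "max_speedup": max(speedups) if speedups else None,
        "largest_excluded_n": max((r.n for r in excluded), default=None),
    }


# Made-up placeholder rows for illustration only; NOT data from the paper.
rows = [Row(100, 1.0, 5.0), Row(500, 2.0, 30.0), Row(2000, 4.0, None)]
summary = summarize(rows)
print(summary)
```

Reporting `largest_excluded_n` beside the speedup range makes it visible whether the excluded rows cluster at the largest, and plausibly hardest, problem sizes, which is the selection effect the referee raises.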
Circularity Check
No circularity; purely empirical timing benchmarks with no derivation chain
The paper contains no mathematical derivations, first-principles results, fitted parameters, or ansatzes. All claims consist of direct wall-clock timing measurements on fixed problem instances against external reference solvers (OSQP and others). The evaluation protocol explicitly conditions objective-gap claims on completed reference solves and reports non-completed rows separately as feasibility/timing evidence. No self-citation is used to justify any core claim, and no result reduces to its own inputs by construction. This is a standard empirical performance study whose validity rests on the representativeness of the test suite and baselines rather than any definitional or self-referential loop.
Axiom & Free-Parameter Ledger