
arxiv: 2605.31005 · v1 · pith:VTYRWWZZnew · submitted 2026-05-29 · 💻 cs.LG

Learning Multi-Agent Coordination via Sheaf-ADMM

Pith reviewed 2026-06-28 23:19 UTC · model grok-4.3

classification 💻 cs.LG
keywords multi-agent coordination · ADMM · cellular sheaves · differentiable optimization · local views · Sudoku · MNIST robustness · consensus constraints

The pith

Multi-agent systems learn coordination by solving neural-parameterized convex subproblems linked through cellular sheaf constraints in unrolled ADMM.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a differentiable optimization approach in which an input is split into overlapping local views, each handled by an agent that solves a convex subproblem encoded by a neural network. These agents reach agreement by running ADMM iterations whose constraints are defined by a cellular sheaf specifying which solution components must match between neighbors. The optimization is unrolled so that backpropagation trains the encoders and the sheaf structure jointly. Evaluations on maze pathfinding, MNIST classification, and Sudoku show that agents produce correct global outputs despite incomplete local information, with the MNIST case exhibiting greater robustness to distribution shifts than a standard CNN and the Sudoku case showing higher solve rates than matched MPNN models. The exposed ADMM state variables also permit direct inspection of the coordination process.
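The iteration described above can be written out in one generic form. This is a sketch under stated assumptions, not the paper's own equations: it uses the scaled-dual form of consensus ADMM, an exact projection for the consensus step (Figure 2 instead describes the z-update as sheaf diffusion), and the symbols f_i (local objective), δ (sheaf coboundary), and Π (orthogonal projector) are introduced here for illustration. The variables x_i, z_i, u_i, the restriction maps F, and the penalty ρ follow the figure captions.

```latex
\begin{aligned}
&\min_{x,\,z}\ \sum_{i} f_i(x_i)
  \quad \text{s.t.}\quad x = z,\quad \delta z = 0,
  \qquad (\delta z)_e = F_{i \to e}\, z_i - F_{j \to e}\, z_j
  \ \text{ for each edge } e = (i, j), \\
&x_i^{k+1} = \arg\min_{x_i}\ f_i(x_i)
  + \tfrac{\rho}{2}\,\bigl\lVert x_i - z_i^{k} + u_i^{k} \bigr\rVert_2^2, \\
&z^{k+1} = \Pi_{\ker \delta}\!\left(x^{k+1} + u^{k}\right), \\
&u^{k+1} = u^{k} + x^{k+1} - z^{k+1}.
\end{aligned}
```

Under this reading, the x-update is purely local to each agent, the z-update is the only step that mixes information across neighbors, and u accumulates each agent's running disagreement with the consensus.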

Core claim

Decomposing an input into local views processed by agents solving convex subproblems and coordinating them via ADMM under cellular-sheaf agreement constraints allows the system to learn correct global outputs from partial information, yielding improved robustness to distribution shifts on MNIST relative to a standard CNN and markedly higher solve rates on Sudoku than parameter-matched MPNN baselines.

What carries the argument

Cellular sheaf constraints inside the ADMM iterations, which encode heterogeneous inter-agent agreement requirements and allow the full pipeline of neural encoders plus optimization steps to be trained end-to-end by differentiation through the unrolled solver.
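A toy forward pass may make the mechanism concrete. The sketch below is an illustration under assumptions, not the authors' implementation: the local objectives are fixed quadratics rather than encoder outputs, the consensus step is an exact projection onto the kernel of the coboundary, and the helper names `coboundary` and `sheaf_admm` are hypothetical. It uses NumPy only, so it shows the unrolled iterations but not the backpropagation through them.

```python
import numpy as np

def coboundary(n_agents, d_v, edges):
    """Stack restriction maps so that (delta @ x)_e = F_i x_i - F_j x_j."""
    rows = []
    for i, j, F_i, F_j in edges:
        row = np.zeros((F_i.shape[0], n_agents * d_v))
        row[:, i * d_v:(i + 1) * d_v] = F_i
        row[:, j * d_v:(j + 1) * d_v] = -F_j
        rows.append(row)
    return np.vstack(rows)

def sheaf_admm(Q, q, delta, rho=1.0, iters=100):
    """Run scaled-form ADMM for local objectives 0.5 x^T Q_i x + q_i^T x.

    Q: (n, d, d) positive semi-definite blocks, q: (n, d), delta: (m, n*d).
    """
    n, d = q.shape
    # Orthogonal projector onto ker(delta): the globally consistent states.
    P = np.eye(n * d) - np.linalg.pinv(delta) @ delta
    x = np.zeros((n, d))
    z = np.zeros((n, d))
    u = np.zeros((n, d))
    for _ in range(iters):
        # x-update: each agent solves (Q_i + rho I) x_i = rho (z_i - u_i) - q_i.
        rhs = rho * (z - u) - q
        x = np.stack([np.linalg.solve(Q[i] + rho * np.eye(d), rhs[i])
                      for i in range(n)])
        # z-update: project x + u onto the sheaf-consistent subspace.
        z = (P @ (x + u).ravel()).reshape(n, d)
        # u-update: accumulate the remaining disagreement.
        u = u + x - z
    return x, z, u

# Two agents in R^2. The single edge requires agent 0's first coordinate
# to equal agent 1's second coordinate; nothing else has to agree.
F_a = np.array([[1.0, 0.0]])
F_b = np.array([[0.0, 1.0]])
delta = coboundary(2, 2, [(0, 1, F_a, F_b)])
Q = np.stack([np.eye(2), np.eye(2)])
q = -np.array([[1.0, 2.0], [3.0, 5.0]])  # local minimizers (1, 2) and (3, 5)
x, z, u = sheaf_admm(Q, q, delta)
print(x)
```

The two agents end up agreeing only on the component the edge constrains, while their unconstrained coordinates stay at the local optimum. That is the heterogeneous consensus the sheaf is meant to express.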

If this is right

  • The exposed primal, consensus, and dual variables make coordination dynamics directly observable and intervenable.
  • The framework supports tasks in which different agents must agree on different aspects of the solution.
  • End-to-end training through the unrolled solver produces systems that outperform standard message-passing networks on constraint-satisfaction and robustness benchmarks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The dual variables could be monitored at inference time to detect when coordination is failing.
  • Automatic learning of the sheaf structure itself would reduce reliance on hand-specified constraints.
  • The same decomposition into local convex subproblems might apply to other distributed decision problems whose consistency requirements can be expressed as sheaves.
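The first extension above, monitoring the ADMM state at inference time, could look roughly like the following. It is a hypothetical sketch: the function names, the tolerance, and the choice to flag on the primal residual are assumptions made here, and the per-agent quantities simply mirror the ones the Figure 4 caption describes (primal residual ∥x_i − z_i∥, dual residual ρ∥Δz_i∥, each as a share of the across-agent total, plus their log-ratio).

```python
import numpy as np

def agent_diagnostics(x, z, z_prev, u, rho=1.0, tiny=1e-12):
    """Per-agent residual shares, log-ratio, and dual-variable norms.

    x, z, z_prev, u have shape (n_agents, d_v).
    """
    primal = np.linalg.norm(x - z, axis=1)          # ||x_i - z_i||
    dual = rho * np.linalg.norm(z - z_prev, axis=1)  # rho * ||delta z_i||
    primal_share = primal / (primal.sum() + tiny)
    dual_share = dual / (dual.sum() + tiny)
    log_ratio = np.log((primal + tiny) / (dual + tiny))
    dual_norm = np.linalg.norm(u, axis=1)            # accumulated disagreement
    return primal_share, dual_share, log_ratio, dual_norm

def flag_agents(x, z, tol=1e-2):
    """Indices of agents whose local solution still disagrees with consensus."""
    return np.flatnonzero(np.linalg.norm(x - z, axis=1) > tol)

# Made-up state for three agents: only the last one is far from consensus.
x = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
z = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 1.0]])
z_prev = np.array([[1.0, 0.0], [0.0, 0.5], [2.0, 1.0]])
u = np.zeros_like(x)

primal_share, dual_share, log_ratio, dual_norm = agent_diagnostics(x, z, z_prev, u)
flagged = flag_agents(x, z)
print(flagged)
```

Whether a fixed threshold is a useful failure signal on the paper's tasks is an open question; the sketch only shows that the quantities are cheap to compute from state the solver already holds.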

Load-bearing premise

Cellular sheaves can be specified or learned that exactly capture the agreement constraints needed for the task, and the neural-parameterized subproblems stay convex enough for ADMM to converge reliably.
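For quadratic subproblems, the convexity half of this premise can be enforced by construction. The sketch below shows two common options, a factor parameterization and a projection onto the positive semi-definite cone. Both are generic techniques offered here as assumptions about what such a mechanism might look like; the function names are hypothetical and the source does not confirm which approach, if either, the paper uses.

```python
import numpy as np

def psd_from_factor(A, eps=1e-3):
    """Return Q = A A^T + eps * I, symmetric positive definite for eps > 0."""
    return A @ A.T + eps * np.eye(A.shape[0])

def project_psd(M):
    """Frobenius-norm projection onto the PSD cone.

    Symmetrize, then clip negative eigenvalues to zero.
    """
    S = 0.5 * (M + M.T)
    w, V = np.linalg.eigh(S)
    return (V * np.clip(w, 0.0, None)) @ V.T

rng = np.random.default_rng(0)
raw = rng.normal(size=(4, 4))  # stand-in for an unconstrained encoder output

Q_factor = psd_from_factor(raw)
Q_proj = project_psd(raw)
min_eig_factor = np.linalg.eigvalsh(Q_factor).min()
min_eig_proj = np.linalg.eigvalsh(Q_proj).min()
print(min_eig_factor > 0.0, min_eig_proj > -1e-10)
```

The factor form keeps every intermediate Q valid and is differentiable everywhere, while the projection leaves the encoder unconstrained and corrects its output afterwards. Either way, each local objective 0.5 xᵀQx + qᵀx stays convex regardless of the encoder weights.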

What would settle it

The claim would be undercut if the Sudoku solve rates achieved by the sheaf-ADMM agents turned out not to be markedly higher than those of parameter-matched MPNN baselines, or if robustness to distribution shifts on MNIST did not exceed that of a standard CNN.

Figures

Figures reproduced from arXiv: 2605.31005 by Bartłomiej Cupiał, Jeffrey Seely, Llion Jones.

Figure 1
Figure 1: Each agent (circle) holds latent variables x_i, z_i, u_i ∈ ℝ^{d_v} (corresponding to primal, consensus, and dual variables, respectively), and selectively communicates with neighbors via linear maps F_direction ∈ ℝ^{d_e × d_v} to reach consensus on shared edge spaces (ellipses). Sheaf-ADMM learns to coordinate such systems to solve tasks like maze pathfinding that require collective coordination. Each agent observes…
Figure 2
Figure 2: The Sheaf-ADMM Architecture for the maze task. Input is decomposed into local patches and processed by a shared encoder to produce optimization parameters (e.g., Q_i, q_i). These parameterize the ADMM layer, unrolled for K iterations. Agents alternate between local optimization (x-update) and global coordination via sheaf diffusion (z-update), while dual variables u track disagreements. A decoder generates l…
Figure 3
Figure 3: Visualization of intermediate predictions by Sheaf-ADMM on benchmark tasks. Top: MNIST, where red/green pixels indicate the incorrect/correct predictions for each agent. Middle: Maze, where blue cells indicate the predicted path. Bottom: Sudoku, where bold digits represent initial givens, red highlights indicate cells currently violating Sudoku constraints, and blue shading indicates updates from the previous timestep.
Figure 4
Figure 4: Coordination dynamics on a 2× out-of-distribution maze. Top row: aggregated path prediction at iterations k ∈ {1, 3, 5, 10, 30}. Middle rows: per-agent primal residual ∥x_i − z_i∥ and per-iteration dual residual ρ∥Δz_i∥, each normalized as a fraction of the across-agent total per iteration. Bottom row: their log-ratio. Right: mean and max across agents of the same two quantities over iterations k. At late iter…
Figure 5
Figure 5: Class emergence. Pixel = argmax class across overlapping agents; intensity = confidence.
Figure 6
Figure 6: Size generalization in maze pathfinding. Train: the model is trained only on 19 × 19 mazes. Qualitative: aggregated path-belief heatmaps on a 37 × 37 test maze as a function of the number of ADMM iterations K, illustrating how additional coordination steps sharpen a globally consistent shortest-path prediction. Quantitative: solved rate (left) and mean iterations-to-solve on solved instances (right) over m…
Figure 7
Figure 7: Local views and initial predictions. A visualization of the inputs and initial states for all 81 agents. The colored overlay indicates the agent's initial prediction confidence prior to any communication. The path predictions are fragmented and lack global connectivity.
Figure 8
Figure 8: Visualizing coordination mechanisms. The "tug-of-war" in optimization space. Each subplot in the grid corresponds to the agent at that spatial position. Blue lines track the local decision variable x_i^k, red lines the consensus variable z_i^k, and stars mark the final converged state. Across all 81 agents, the trajectories reveal the negotiation between local preferences and global consensus: agents at …
Original abstract

We present a differentiable optimization framework for multi-agent coordination. An input is decomposed into overlapping local views, each processed by an agent that solves a convex subproblem parameterized by a neural encoder. Agents coordinate through the Alternating Direction Method of Multipliers (ADMM) with inter-agent constraints specified by a cellular sheaf. The sheaf specifies which aspects of neighboring solutions must agree, allowing for heterogeneous notions of global consensus. Backpropagating through the unrolled optimization jointly trains all components of the multi-agent system. We evaluate on maze pathfinding, image classification, and Sudoku, where agents with individually insufficient local views learn to coordinate to produce correct global outputs. On MNIST, the local-view decomposition yields improved robustness to distribution shifts relative to a standard CNN. On Sudoku, the optimization-derived structure yields markedly higher solve rates than parameter-matched MPNN baselines. Finally, the ADMM structure exposes distinct primal, consensus, and dual state variables, opening the coordination dynamics to direct analysis and intervention -- a property unavailable in standard message-passing architectures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces Sheaf-ADMM, a differentiable multi-agent coordination framework in which an input is decomposed into overlapping local views. Each agent solves a convex subproblem whose objective and constraints are parameterized by a neural encoder; inter-agent agreement constraints are encoded by a cellular sheaf and enforced via ADMM. The entire system is trained end-to-end by back-propagation through unrolled ADMM iterations. Empirical results are reported on maze pathfinding, MNIST classification under distribution shift, and Sudoku, with claims that the optimization-derived coordination yields improved robustness and higher solve rates than parameter-matched MPNN baselines, while also exposing interpretable primal, consensus, and dual states.

Significance. If the convexity and convergence claims hold, the framework would offer a principled way to embed optimization structure into multi-agent neural systems, enabling heterogeneous consensus via sheaves and direct inspection of coordination dynamics unavailable in standard message-passing models. The explicit separation of primal, consensus, and dual variables is a concrete strength that could support future analysis and intervention work.

major comments (2)
  1. [Abstract] Abstract: the statement that each agent 'solves a convex subproblem parameterized by a neural encoder' is load-bearing for the claimed ADMM convergence and for attributing performance gains to 'optimization-derived structure.' No mechanism (e.g., quadratic forms, PSD constraints, or restricted activations) is described that would guarantee convexity once the encoder weights are free parameters; standard ADMM theory does not apply to arbitrary non-convex objectives.
  2. [Abstract] Abstract and evaluation sections: the reported MNIST robustness and Sudoku solve-rate improvements are presented as consequences of the sheaf-ADMM coordination. Without verification that the unrolled iterations reach stationary points under the learned neural parameterization, these gains cannot be confidently distinguished from generic message-passing effects.
minor comments (2)
  1. The manuscript should report error bars, data splits, and ablation controls for all three tasks to allow verification of the empirical claims.
  2. Notation for the cellular sheaf and the precise form of the ADMM updates should be introduced with explicit equations rather than high-level description.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on the convexity assumptions and the need to verify convergence of the unrolled iterations. We address each point below and will revise the manuscript accordingly to strengthen the theoretical grounding and empirical support.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the statement that each agent 'solves a convex subproblem parameterized by a neural encoder' is load-bearing for the claimed ADMM convergence and for attributing performance gains to 'optimization-derived structure.' No mechanism (e.g., quadratic forms, PSD constraints, or restricted activations) is described that would guarantee convexity once the encoder weights are free parameters; standard ADMM theory does not apply to arbitrary non-convex objectives.

    Authors: We agree that the manuscript does not describe an explicit mechanism to guarantee convexity of the subproblems once the neural encoder weights become free parameters. This limits the direct applicability of standard ADMM convergence theory. In the revised manuscript we will add a subsection specifying that subproblem objectives are restricted to quadratic forms whose Hessians are constrained to be positive semi-definite (via parameterization or projection steps), and we will state the resulting conditions under which ADMM convergence guarantees continue to hold. revision: yes

  2. Referee: [Abstract] Abstract and evaluation sections: the reported MNIST robustness and Sudoku solve-rate improvements are presented as consequences of the sheaf-ADMM coordination. Without verification that the unrolled iterations reach stationary points under the learned neural parameterization, these gains cannot be confidently distinguished from generic message-passing effects.

    Authors: We concur that without explicit checks that the unrolled ADMM iterations reach stationary points, it is difficult to isolate the contribution of the optimization-derived coordination. In the revision we will include additional figures and tables reporting primal and dual residual norms across training and test iterations for the MNIST and Sudoku experiments, together with a short analysis confirming that the learned parameterizations produce convergent behavior on the reported tasks. revision: yes
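The convergence check promised in the second response could be made concrete with the standard residual-based stopping test from Boyd et al. (2011), specialized to the constraint x = z. The sketch below is an assumption about how such a report might be computed, not something the source specifies: the function name and the default tolerances are illustrative, and the state is passed as flat or stacked arrays.

```python
import numpy as np

def admm_converged(x, z, z_prev, u, rho=1.0, eps_abs=1e-4, eps_rel=1e-3):
    """Residual-based stopping test for scaled-form ADMM with constraint x = z.

    Returns (converged, primal_residual, dual_residual).
    """
    n = x.size
    r = np.linalg.norm(x - z)             # primal residual norm
    s = rho * np.linalg.norm(z - z_prev)  # dual residual norm
    # Tolerances combine an absolute part with a part relative to the iterates.
    eps_pri = np.sqrt(n) * eps_abs + eps_rel * max(np.linalg.norm(x),
                                                   np.linalg.norm(z))
    eps_dual = np.sqrt(n) * eps_abs + eps_rel * rho * np.linalg.norm(u)
    return bool(r <= eps_pri and s <= eps_dual), float(r), float(s)

# Made-up states: one fully settled, one where x sits far from consensus.
z = np.ones((3, 2))
z_prev = np.ones((3, 2))
u = np.zeros((3, 2))

ok, r_ok, s_ok = admm_converged(z.copy(), z, z_prev, u)
bad, r_bad, s_bad = admm_converged(z + 1.0, z, z_prev, u)
print(ok, bad)
```

Reporting the fraction of test instances that pass such a test after K unrolled iterations would be one way to separate behavior that reflects a solved optimization problem from behavior that merely resembles message passing.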

Circularity Check

0 steps flagged

No circularity: derivation is self-contained against external benchmarks

Full rationale

The paper introduces a differentiable Sheaf-ADMM framework in which agents solve neural-parameterized convex subproblems coordinated by cellular sheaves, with backpropagation through unrolled iterations. The abstract and described evaluations on MNIST robustness and Sudoku solve rates are presented as empirical outcomes of this architecture relative to standard CNN and MPNN baselines. No equations, claims, or performance metrics are shown to reduce by construction to fitted inputs, self-definitions, or load-bearing self-citations; the coordination dynamics and optimization-derived structure are treated as independent contributions. The framework is therefore self-contained and externally falsifiable on the reported tasks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 1 invented entity

The central claim rests on convexity of neural-parameterized subproblems and the existence of task-appropriate cellular sheaves; both are domain assumptions rather than derived quantities.

free parameters (1)
  • Neural encoder weights
    Parameters of the encoders that map local views to subproblem coefficients are fitted to task data.
axioms (2)
  • [domain assumption] Each agent's subproblem remains convex after neural parameterization
    Required for standard ADMM convergence guarantees invoked by the framework.
  • [domain assumption] Cellular sheaves can be chosen to encode the necessary inter-agent constraints
    The sheaf is the mechanism that turns local solutions into coordinated global behavior.
invented entities (1)
  • Cellular sheaf for heterogeneous consensus [no independent evidence]
    purpose: To define which solution components must agree between neighboring agents
    New modeling object introduced to allow flexible, non-uniform global consistency rules.

pith-pipeline@v0.9.1-grok · 5711 in / 1379 out tokens · 26678 ms · 2026-06-28T23:19:32.774528+00:00 · methodology

