pith. machine review for the scientific record.

arxiv: 2604.06914 · v1 · submitted 2026-04-08 · 💻 cs.LG

Recognition: 2 theorem links

· Lean Theorem

Equivariant Multi-agent Reinforcement Learning for Multimodal Vehicle-to-Infrastructure Systems

Pith reviewed 2026-05-10 18:11 UTC · model grok-4.3

classification 💻 cs.LG
keywords multi-agent reinforcement learning · equivariant policies · vehicle-to-infrastructure · multimodal sensing · self-supervised learning · graph neural networks · decentralized resource allocation

The pith

By aligning multimodal features through self-supervised learning and training equivariant GNN policies with MARL, roadside units can maximize rates in V2I systems while respecting the rotation symmetries of vehicle locations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a decentralized approach to rate maximization in vehicle-to-infrastructure networks, where base stations collect wireless and visual data from moving vehicles. It frames the problem as multi-agent reinforcement learning that builds in the rotational symmetries of vehicle positions, and it uses self-supervised feature alignment to derive local position estimates from multimodal observations. An equivariant graph neural network policy then lets each agent compute its action locally, while a signaling scheme coordinates the agents and preserves the symmetry of the global policy. The paper reports gains in both sensing accuracy and network performance. A reader would care because the method shows how to make decentralized AI control more data-efficient and symmetry-aware in dynamic physical environments without relying on a central coordinator.

Core claim

The authors establish that a self-supervised multimodal sensing framework extracts vehicle positions by aligning latent features, which then feed an equivariant GNN-based MARL policy with message passing and a signaling scheme for coordination; under simulation with ray-tracing and graphics data, this delivers more than twofold accuracy gains over baselines in position estimation and more than 50 percent performance gains over standard MARL in rate maximization.

What carries the argument

Equivariant policy network using a graph neural network with message-passing layers, preceded by self-supervised alignment of multimodal latent features to extract vehicle positions.

If this is right

  • The self-supervised multimodal sensing generalizes and produces more than twofold accuracy gains in vehicle position extraction compared with baselines.
  • Equivariant MARL training produces more than 50 percent performance gains in decentralized rate maximization over non-equivariant approaches.
  • Each agent computes its policy locally while a signaling scheme overcomes partial observability and maintains global policy equivariance.
  • Rotation symmetries of vehicle locations are incorporated directly into the policy structure via the GNN architecture.
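The equivariance property named in the last bullet can be illustrated with a toy example. The sketch below is not the paper's GNN policy. It assumes the cyclic group C4 of 90-degree rotations and a hypothetical `lift` layer (both chosen here for illustration), and it checks that rotating the input positions only shifts the output along the group axis.

```python
import numpy as np

def rot(k):
    """2x2 rotation matrix for k quarter-turns (k * 90 degrees)."""
    c, s = [(1, 0), (0, 1), (-1, 0), (0, -1)][k % 4]
    return np.array([[c, -s], [s, c]], dtype=float)

def lift(positions, weight):
    """Hypothetical C4 lifting layer (illustrative, not from the paper).

    positions: (n_vehicles, 2) array of 2D positions.
    weight:    (2, d) weight matrix shared by all group elements.
    Returns an array of shape (4, n_vehicles, d): channel g applies the
    shared weights to the positions rotated by the inverse of g.
    """
    return np.stack(
        [np.tanh(positions @ rot(-g).T @ weight) for g in range(4)]
    )

rng = np.random.default_rng(0)
positions = rng.normal(size=(5, 2))   # synthetic vehicle positions
weight = rng.normal(size=(2, 3))      # synthetic layer weights

out = lift(positions, weight)
out_rotated = lift(positions @ rot(1).T, weight)  # rotate inputs by 90 degrees

# Equivariance: the rotated input yields the same features, shifted by one
# step along the group axis.
equivariant = np.allclose(out_rotated, np.roll(out, 1, axis=0))
# Summing over the group axis gives a rotation-invariant feature.
invariant = np.allclose(out_rotated.sum(axis=0), out.sum(axis=0))
print(equivariant, invariant)
```

The weight sharing across the four rotated copies is what forces the symmetry; a generic dense layer with independent weights per channel would not pass either check.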

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the simulation symmetries persist in field deployments, the method could reduce labeled-data requirements for training large-scale V2I networks.
  • The same combination of self-supervised alignment and equivariant message passing could transfer to other geometrically symmetric multi-agent settings such as traffic signal control or drone coordination.
  • The signaling coordination layer might be adapted to handle additional uncertainties like sensor failures or changing vehicle densities.

Load-bearing premise

That the rotational symmetries of vehicle locations can be faithfully captured by the GNN policy and that the self-supervised feature alignment reliably extracts accurate positions from multimodal observations under the simulation conditions used.

What would settle it

A simulation or real-world trial in which vehicle positions break the assumed rotational symmetries or multimodal data introduces alignment errors that erase the reported accuracy and performance improvements.

Figures

Figures reproduced from arXiv: 2604.06914 by Charbel Bou Chaaya, Mehdi Bennis.

Figure 1: System model showcasing multiple BS agents, each equipped with a camera, deployed in a symmetric V2I environment. Each …
Figure 2: A global symmetry: when the vehicles’ positions rotate, the optimal policy is permuted between and within agents.
Figure 3: Two equivalent states: a global state transformation is equivalent to a rotation of local states, followed by a permutation of local …
Figure 1: First in Section IV, we devise a self-supervised learning framework for multimodal sensing where …
Figure 4: Proposed self-supervised multimodal sensing framework.
Figure 5: Proposed image-based localization. … the location $p_k^t$ of terminal $k$. We start by invoking the following lemma.
Lemma 1. The physical location $p = [p_x, p_y]$ relative to camera $c$ of pixel coordinates $[l_w, l_h]$ can be estimated as:
$$p_x = h_c \tan\left(\phi_c^{\mathrm{LoS}} + \tan^{-1}\left(\frac{2 l_h - H_c}{H_c} \tan\frac{\phi_c^{\max}}{2}\right)\right) \cos\left(\varphi_c^{\mathrm{LoS}} + \tan^{-1}\left(\frac{2 l_w - W_c}{W_c} \tan\frac{\varphi_c^{\max}}{2}\right)\right), \tag{10}$$
$$p_y = h_c \tan\left(\phi_c^{\mathrm{LoS}} + \tan^{-1}\left(\frac{2 l_h - H_c}{H_c} \tan\frac{\phi_c^{\max}}{2}\right)\right) \ldots$$
Figure 6: Equivariant layer: when the input is transformed by a group action, the output permutes over the group channels.
Figure 7: Proposed GNN for MARL equivariant policy training. Each agent collects multimodal observations which are processed as detailed …
Figure 8: Simulation setting showing the top view of the considered V2I network. A Blender scene is processed by Sionna to render wireless …
Figure 9: Comparison between different sensing frameworks.
Figure 10: Localization performance of the proposed method with partial alignment.
Figure 11: Empirical CDF of localization error for different …
Figure 13: Convergence of our proposed multimodal alignment procedure for different data set sizes. Each curve is also labeled by its …
Figure 15: Proposed crossmodal imputation on an environment …
Figure 16: Out-of-distribution average sensing accuracy for …
Figure 18: Convergence of MARL algorithms for different system parameters affecting the state and action sets.
Figure 19: Impact of different MARL training schemes using our proposed equivariant policy network and a standard non-equivariant …
Figure 20: Comparison between different MARL algorithms under partial symmetry ( …
Figure 21: Impact of localization error on the performance of different MARL algorithms ( …
Figure 22: Energy efficiency of our multimodal system design compared to a separate design.
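Lemma 1, quoted alongside the Figure 5 caption, maps pixel coordinates to a ground position. The sketch below is one possible reading of that formula, not the authors' code. The expression for `p_y` is cut off in the extracted text, so using the sine of the same horizontal angle is an assumption made here. The argument names (`tilt_los`, `tilt_fov`, `pan_los`, `pan_fov`) are hypothetical labels for the two line-of-sight angles and the two maximum angles in the lemma, and all angles are assumed to be in radians.

```python
import math

def pixel_to_ground(l_w, l_h, h_c, W_c, H_c, tilt_los, tilt_fov, pan_los, pan_fov):
    """Estimate a ground position from pixel coordinates (sketch of Lemma 1).

    l_w, l_h : pixel column and row.
    h_c      : camera height.
    W_c, H_c : image width and height in pixels.
    tilt_*   : line-of-sight angle and maximum angle in the tan() term.
    pan_*    : line-of-sight angle and maximum angle in the cos() term.
    """
    # Angular offset of the pixel row, added to the camera's tilt.
    tilt = tilt_los + math.atan((2 * l_h - H_c) / H_c * math.tan(tilt_fov / 2))
    # Angular offset of the pixel column, added to the camera's pan.
    pan = pan_los + math.atan((2 * l_w - W_c) / W_c * math.tan(pan_fov / 2))
    ground_range = h_c * math.tan(tilt)
    p_x = ground_range * math.cos(pan)   # matches the p_x expression in Lemma 1
    p_y = ground_range * math.sin(pan)   # ASSUMPTION: source text is truncated here
    return p_x, p_y

# Center pixel of a 1920x1080 image: both offsets vanish, so the result
# depends only on the camera height and line-of-sight angles.
p_x, p_y = pixel_to_ground(
    l_w=960, l_h=540, h_c=10.0, W_c=1920, H_c=1080,
    tilt_los=math.pi / 4, tilt_fov=math.radians(60),
    pan_los=0.0, pan_fov=math.radians(90),
)
print(p_x, p_y)
```

The camera parameters in the usage example are synthetic values picked so the center-pixel case is easy to check by hand.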
Original abstract

In this paper, we study a vehicle-to-infrastructure (V2I) system where distributed base stations (BSs) acting as road-side units (RSUs) collect multimodal (wireless and visual) data from moving vehicles. We consider a decentralized rate maximization problem, where each RSU relies on its local observations to optimize its resources, while all RSUs must collaborate to guarantee favorable network performance. We recast this problem as a distributed multi-agent reinforcement learning (MARL) problem, by incorporating rotation symmetries in terms of vehicles' locations. To exploit these symmetries, we propose a novel self-supervised learning framework where each BS agent aligns the latent features of its multimodal observation to extract the positions of the vehicles in its local region. Equipped with this sensing data at each RSU, we train an equivariant policy network using a graph neural network (GNN) with message passing layers, such that each agent computes its policy locally, while all agents coordinate their policies via a signaling scheme that overcomes partial observability and guarantees the equivariance of the global policy. We present numerical results carried out in a simulation environment, where ray-tracing and computer graphics are used to collect wireless and visual data. Results show the generalizability of our self-supervised and multimodal sensing approach, achieving more than two-fold accuracy gains over baselines, and the efficiency of our equivariant MARL training, attaining more than 50% performance gains over standard approaches.
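The abstract does not state the alignment objective. As a rough illustration of what "aligning latent features" across two modalities can mean, the sketch below uses a generic symmetric contrastive (InfoNCE-style) loss on paired embeddings. This is a swapped-in, commonly used technique and an assumption made here, not the paper's method. The function name `info_nce` and the temperature value are illustrative.

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.1):
    """Symmetric contrastive loss between two sets of paired embeddings.

    z_a, z_b: (n, d) arrays; row i of z_a is assumed to pair with row i of z_b.
    Lower values mean the paired rows are closer than the unpaired ones.
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature  # cosine similarities, scaled

    def cross_entropy_on_diagonal(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))

rng = np.random.default_rng(0)
wireless_latents = rng.normal(size=(8, 16))      # synthetic embeddings
visual_aligned = wireless_latents.copy()         # perfectly aligned pairs
visual_shuffled = np.roll(wireless_latents, 1, axis=0)  # mismatched pairs

loss_aligned = info_nce(wireless_latents, visual_aligned)
loss_shuffled = info_nce(wireless_latents, visual_shuffled)
print(loss_aligned < loss_shuffled)
```

A paired loss like this needs matched samples from both modalities; whether the paper's framework relies on such pairing is not stated in the abstract.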

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes a self-supervised multimodal (wireless and visual) sensing framework to extract vehicle positions in a V2I system, which is then used to train an equivariant GNN-based policy for decentralized MARL that maximizes rates while exploiting rotational symmetries of vehicle locations via message passing and a signaling scheme to handle partial observability.

Significance. If the performance claims hold under rigorous validation, the work could advance decentralized MARL for wireless networks by showing how self-supervision on multimodal data and symmetry-aware GNN policies improve generalization and efficiency in V2I rate optimization. The reported gains suggest practical value for intelligent transportation systems, though the simulation-only evaluation limits broader claims.

major comments (3)
  1. [Abstract] Abstract: the claim of 'more than two-fold accuracy gains over baselines' for the self-supervised multimodal sensing provides no description of the baselines, the position accuracy metric (e.g., RMSE or IoU), ablation studies separating multimodal alignment from single-modality inputs, or statistical significance tests; this is load-bearing for the generalizability assertion.
  2. [Abstract] Abstract / Numerical Results: the claim of 'more than 50% performance gains over standard approaches' in the equivariant MARL lacks specification of the standard approaches (e.g., non-equivariant MARL or centralized RL), sensitivity analysis to simulation parameters such as vehicle density or ray-tracing settings, and any ablation on the signaling scheme; without these the efficiency claim cannot be assessed.
  3. [Proposed framework] Proposed framework (self-supervised alignment): the latent-feature matching objective is asserted to recover accurate metric vehicle positions without explicit labels, yet no analysis or test is provided showing robustness outside the specific ray-tracing/graphics simulator (e.g., under changed antenna patterns, lighting, or densities); this directly affects whether the downstream MARL gains are artifacts of the simulation.
minor comments (1)
  1. [Abstract] The abstract would benefit from a brief parenthetical definition of 'equivariant policy' and 'signaling scheme' for readers outside MARL.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. We agree that clarifying the abstract claims and providing additional robustness analysis will strengthen the manuscript. We respond to each major comment below and will make the indicated revisions.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the claim of 'more than two-fold accuracy gains over baselines' for the self-supervised multimodal sensing provides no description of the baselines, the position accuracy metric (e.g., RMSE or IoU), ablation studies separating multimodal alignment from single-modality inputs, or statistical significance tests; this is load-bearing for the generalizability assertion.

    Authors: We acknowledge that the abstract is concise and omits these details. The manuscript body describes the baselines as single-modality sensing and supervised methods, uses RMSE as the position accuracy metric, presents ablations separating multimodal alignment from single-modality inputs, and reports statistical significance via repeated runs with confidence intervals. In the revised manuscript we will expand the abstract to specify the baselines and the metric, reference the ablations, and note the significance testing. revision: yes

  2. Referee: [Abstract] Abstract / Numerical Results: the claim of 'more than 50% performance gains over standard approaches' in the equivariant MARL lacks specification of the standard approaches (e.g., non-equivariant MARL or centralized RL), sensitivity analysis to simulation parameters such as vehicle density or ray-tracing settings, and any ablation on the signaling scheme; without these the efficiency claim cannot be assessed.

    Authors: We agree that explicit specification is needed. The standard approaches are non-equivariant GNN-based MARL and centralized RL, as evaluated in the numerical results. We will revise the abstract to name them. We will also add sensitivity analysis varying vehicle density and ray-tracing parameters together with an ablation isolating the signaling scheme in the updated numerical results section. revision: yes

  3. Referee: [Proposed framework] Proposed framework (self-supervised alignment): the latent-feature matching objective is asserted to recover accurate metric vehicle positions without explicit labels, yet no analysis or test is provided showing robustness outside the specific ray-tracing/graphics simulator (e.g., under changed antenna patterns, lighting, or densities); this directly affects whether the downstream MARL gains are artifacts of the simulation.

    Authors: The evaluation uses a ray-tracing and graphics simulator that permits controlled variation of parameters. We will add experiments in the revised manuscript that vary vehicle densities, antenna array configurations, and lighting conditions within the simulator to test robustness of the latent-feature matching. These additions will show that the position recovery and downstream MARL gains are not tied to one fixed simulator configuration. revision: partial
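The first response names RMSE as the position accuracy metric. The sketch below shows one way a "two-fold accuracy gain" could be computed from it, assuming the gain is the ratio of baseline RMSE to proposed RMSE. That definition is an assumption made here, and the positions are synthetic, not results from the paper.

```python
import numpy as np

def rmse(estimated, true):
    """Root-mean-square Euclidean position error over all vehicles."""
    err = np.linalg.norm(np.asarray(estimated) - np.asarray(true), axis=1)
    return float(np.sqrt(np.mean(err ** 2)))

# Synthetic ground-truth positions and two sets of estimates.
true = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
baseline = true + np.array([[3.0, 4.0], [3.0, 4.0], [3.0, 4.0]])    # error 5 each
proposed = true + np.array([[0.0, 2.0], [2.0, 0.0], [0.0, -2.0]])   # error 2 each

# Assumed definition: gain = baseline RMSE / proposed RMSE.
gain = rmse(baseline, true) / rmse(proposed, true)
print(gain)
```

Under this definition a gain above 2 corresponds to the proposed estimator having less than half the baseline's RMSE.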

Circularity Check

0 steps flagged

No significant circularity; empirical results from training, not tautological derivations

full rationale

The paper frames its contributions as a self-supervised multimodal alignment step feeding into an equivariant GNN-MARL policy, with all headline gains (>2× accuracy, >50% performance) reported as simulation outcomes under ray-tracing and graphics models. No equations, fitted parameters, or uniqueness theorems are presented that reduce predictions to inputs by construction. The derivation relies on standard MARL value functions, GNN message passing for equivariance, and a latent-feature matching objective; these are trained end-to-end rather than defined circularly. No self-citation chains or ansatzes are load-bearing for the central claims. The reader's assessment of score 2 is consistent with minor normal self-citation at most, but the chain remains independent and falsifiable via external simulation benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard reinforcement-learning modeling assumptions and the existence of exploitable rotational symmetry in vehicle locations; no new free parameters, physical constants, or invented entities are introduced beyond conventional GNN and MARL components.

axioms (2)
  • domain assumption: The V2I resource allocation problem can be modeled as a partially observable Markov decision process amenable to distributed MARL.
    The problem is explicitly recast as a distributed multi-agent reinforcement learning problem.
  • domain assumption: Rotational symmetries in vehicle locations can be incorporated into policy networks without loss of optimality.
    The method is built around incorporating rotation symmetries in terms of vehicles' locations.
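The second assumption can be written out using the standard formulation of symmetric MDPs. The notation below is introduced here for illustration ($G$ for the rotation group, $L_g$ and $K_g$ for its actions on states and actions) and is not taken from the paper; it is stated for the fully observed case, whereas the paper works under partial observability.

```latex
% Assumed notation: group G, state transformation L_g, action transformation K_g.
% The MDP is symmetric under G if rewards and transitions are preserved:
\begin{align}
  R(L_g s, K_g a) &= R(s, a), \\
  T(L_g s' \mid L_g s, K_g a) &= T(s' \mid s, a),
  \qquad \forall g \in G .
\end{align}
% Under these conditions the optimal action-value function is invariant,
% and there exists an optimal policy that is equivariant:
\begin{align}
  Q^*(L_g s, K_g a) &= Q^*(s, a), \\
  \pi^*(K_g a \mid L_g s) &= \pi^*(a \mid s),
  \qquad \forall g \in G .
\end{align}
```

Restricting the policy class to equivariant networks therefore loses no optimal solution when the symmetry conditions hold exactly; when the symmetry is only approximate, that guarantee no longer applies.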

pith-pipeline@v0.9.0 · 5561 in / 1436 out tokens · 50587 ms · 2026-05-10T18:11:04.340136+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

54 extracted references · 6 canonical work pages · 2 internal anchors
