Recognition: 2 theorem links
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
Pith reviewed 2026-05-13 02:33 UTC · model grok-4.3
The pith
Geometric principles provide a unified framework for CNNs, RNNs, GNNs, and Transformers while enabling the design of new architectures.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that a geometric unification in the spirit of the Erlangen program furnishes a common mathematical framework for studying successful neural network architectures including CNNs, RNNs, GNNs, and Transformers, while simultaneously offering a constructive procedure to embed prior physical knowledge into neural architectures and to create future ones in a principled manner.
What carries the argument
The geometric structures of grids, groups, graphs, geodesics, and gauges, which encode the symmetries and regularities of data domains and thereby determine the appropriate neural network operations.
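The common operation these structures license is an equivariant layer. In standard notation (a sketch in the usual symbols, not text quoted from the paper), a layer f respects a symmetry group G when transforming the input and then applying the layer equals applying the layer and then transforming the output:

```latex
% G-equivariance of a layer f, with input and output
% group representations \rho_{\mathrm{in}} and \rho_{\mathrm{out}}:
f\bigl(\rho_{\mathrm{in}}(g)\,x\bigr) \;=\; \rho_{\mathrm{out}}(g)\,f(x)
\qquad \forall\, g \in G.
% Invariance is the special case \rho_{\mathrm{out}}(g) = \mathrm{id};
% translation equivariance of convolution and permutation
% equivariance of message passing are both instances.
```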
If this is right
- CNNs on image grids are special cases of group-equivariant networks on the appropriate symmetry group.
- Graph neural networks arise naturally when the data domain is a graph and the symmetry group is the permutation group of its nodes.
- Transformers can be viewed as operating on sets or sequences with permutation or other symmetries.
- New models for data on manifolds or with gauge symmetries can be derived systematically rather than by trial and error.
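Two of the special cases above can be checked numerically in a few lines. This is an illustrative sketch (NumPy, with toy dimensions chosen here), not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# CNNs as group-equivariant networks: cyclic 1D cross-correlation
# commutes with cyclic translation (np.roll).
def conv1d(x, w):
    n, k = len(x), len(w)
    return np.array([sum(w[j] * x[(i + j) % n] for j in range(k))
                     for i in range(n)])

x, w = rng.normal(size=8), rng.normal(size=3)
assert np.allclose(conv1d(np.roll(x, 2), w), np.roll(conv1d(x, w), 2))

# GNNs: a sum-aggregating message-passing layer commutes with any
# relabelling of the nodes (permutation equivariance).
def gnn_layer(A, X, W):
    return np.tanh(A @ X @ W)  # aggregate neighbours, then transform

A = (rng.random((5, 5)) < 0.4).astype(float)
A = np.maximum(A, A.T)                 # undirected adjacency
X, W = rng.normal(size=(5, 4)), rng.normal(size=(4, 4))
P = np.eye(5)[rng.permutation(5)]      # random permutation matrix
assert np.allclose(gnn_layer(P @ A @ P.T, P @ X, W),
                   P @ gnn_layer(A, X, W))
```

Both assertions hold for any shift, permutation, and weights, which is the sense in which CNN and GNN layers are instances of a single equivariance blueprint.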
Where Pith is reading between the lines
- Applying this lens could help identify which tasks are currently under-served by existing architectures due to mismatched geometric assumptions.
- It suggests that improvements in one domain, such as better group convolutions, might transfer to others through the shared framework.
- Scientific applications in physics and biology might benefit most, as their data often has explicit geometric structure.
- Over time, this could shift machine learning from architecture search to geometry-informed design.
Load-bearing premise
The majority of interesting learning tasks possess essential pre-defined regularities that originate from the low-dimensional structure of the physical world and that can be captured by geometric principles.
What would settle it
Demonstrating a task with strong physical structure where no geometric neural network architecture matches or exceeds the performance of a generic black-box model would challenge the claim that geometric unification is broadly useful.
read the original abstract
The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. Indeed, many high-dimensional learning tasks previously thought to be beyond reach -- such as computer vision, playing Go, or protein folding -- are in fact feasible with appropriate computational scale. Remarkably, the essence of deep learning is built from two simple algorithmic principles: first, the notion of representation or feature learning, whereby adapted, often hierarchical, features capture the appropriate notion of regularity for each task, and second, learning by local gradient-descent type methods, typically implemented as backpropagation. While learning generic functions in high dimensions is a cursed estimation problem, most tasks of interest are not generic, and come with essential pre-defined regularities arising from the underlying low-dimensionality and structure of the physical world. This text is concerned with exposing these regularities through unified geometric principles that can be applied throughout a wide spectrum of applications. Such a 'geometric unification' endeavour, in the spirit of Felix Klein's Erlangen Program, serves a dual purpose: on one hand, it provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers. On the other hand, it gives a constructive procedure to incorporate prior physical knowledge into neural architectures and provide principled way to build future architectures yet to be invented.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a survey articulating a geometric unification of deep learning architectures (CNNs on grids, GNNs on graphs, Transformers on sets, gauge-equivariant networks on manifolds, etc.) in the spirit of Klein's Erlangen Program. It argues that successful models exploit pre-defined regularities arising from the low-dimensional structure of the physical world, providing both a retrospective common mathematical framework for existing architectures and a constructive procedure for incorporating prior physical knowledge into new designs.
Significance. If the unifying geometric lens holds, the survey offers a significant organizing principle for the field by linking disparate architectures through group theory, differential geometry, and symmetry considerations. It synthesizes established literature without new empirical claims, supplies design heuristics grounded in physical priors, and could guide future architecture development; the absence of free parameters, invented entities, or circular derivations strengthens its value as a reference.
Simulated Author's Rebuttal
We thank the referee for the positive and insightful review, which accurately summarizes the manuscript's goals of providing a geometric unification of deep learning architectures in the spirit of Klein's Erlangen Program. We appreciate the recognition of its value as a reference and organizing principle for the field.
Circularity Check
No significant circularity; survey organizes external results
full rationale
The manuscript is a survey that retrospectively organizes CNNs, GNNs, Transformers, and related architectures under an Erlangen-style geometric lens drawn from standard group theory and differential geometry. It states its motivating assumption about physical regularities explicitly and offers design heuristics rather than new theorems or fitted predictions. No load-bearing step reduces by construction to a quantity defined inside the paper or to a self-citation chain; all cited results come from independent external literature, so the argument stands on that literature rather than on itself.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Most tasks of interest come with essential pre-defined regularities arising from the underlying low-dimensionality and structure of the physical world.
- domain assumption Geometric principles (grids, groups, graphs, geodesics, gauges) can be applied throughout a wide spectrum of applications to expose these regularities.
Lean theorems connected to this paper
-
AlexanderDuality (for D=3 linking and invariance): alexander_duality_circle_linking. Echoes: Symmetries, Representations, and Invariance; Isomorphisms and Automorphisms; Deformation Stability; Scale Separation.
Forward citations
Cited by 29 Pith papers
-
The WidthWall: A Strict Expressivity Hierarchy for Hypergraph Neural Networks
Hypergraph neural networks obey a strict expressivity hierarchy indexed by hypertree width, creating a Width Wall that no fixed-depth model, hidden dimension, or training procedure can cross for wider patterns.
-
Deep Learning as Neural Low-Degree Filtering: A Spectral Theory of Hierarchical Feature Learning
Neural LoFi models deep learning as layer-wise spectral filtering that selects maximal low-degree correlations, yielding a tractable surrogate for hierarchical representation learning beyond the lazy regime.
-
Gradient-Based Program Synthesis with Neurally Interpreted Languages
NLI autonomously discovers a vocabulary of primitive operations and interprets variable-length programs via a neural executor, allowing end-to-end training and gradient-based test-time adaptation that outperforms prio...
-
The Cartesian Shortcut: Re-evaluate Vision Reasoning in Polar Coordinate Space
MLLMs scoring 70-83% on Cartesian visual tasks drop to 31-39% on logically equivalent polar versions, exposing reliance on grid discretization shortcuts instead of topology-invariant reasoning.
-
TokaMind for Power Grid: Cross-Domain Transfer from Fusion Plasma
TokaMind, pre-trained on MAST tokamak data, transfers to power grid PMU data for severe event classification with F1 0.837, where difficulty depends on grid topology and CSD indicators boost early-warning performance ...
-
Every Feedforward Neural Network Definable in an o-Minimal Structure Has Finite Sample Complexity
Every fixed finite feedforward neural network definable in an o-minimal structure has finite sample complexity in the agnostic PAC setting.
-
Operator-Guided Invariance Learning for Continuous Reinforcement Learning
VPSD-RL discovers exact and approximate value-preserving Lie-group operators in continuous RL to stabilize learning via transition augmentation and consistency regularization.
-
Reentrant value fields as delayed coupled reaction-diffusion systems on finite graphs
A field theory of synthetic cognition is cast as a retarded functional differential equation on graphs, with proofs of well-posedness, compact global attractor existence, delay-independent stability under a coupling-s...
-
Reentrant value fields as delayed coupled reaction-diffusion systems on finite graphs
Establishes well-posedness, compact global attractors, and delay-independent global stability for retarded functional differential equations modeling reentrant value fields as coupled reaction-diffusion systems on fin...
-
Cardiac Mesh Flow: One-Step Generation of 3D+t Cardiac Four-Chamber Meshes via Flow Matching
Cardiac Mesh Flow generates 3D+t four-chamber cardiac meshes with anatomical correspondence and volume conditioning via one-step flow matching on multi-scale deformation fields.
-
Data-driven discovery of polynomial ODEs with provably bounded solutions
SILAS jointly optimizes polynomial ODE vector fields and polynomial Lyapunov functions from data to produce models with provably bounded trajectories via compact absorbing sets.
-
Complex-Valued GNNs for Distributed Basis-Invariant Control of Planar Systems
Complex-valued GNNs using phase-equivariant activations achieve global basis invariance for distributed planar control, outperforming real-valued baselines in data efficiency, tracking, and generalization on flocking.
-
Consistent Geometric Deep Learning via Hilbert Bundles and Cellular Sheaves
HilbNets discretize Hilbert bundle convolutions through Hilbert Cellular Sheaves whose Laplacians converge to the continuous connection Laplacian, enabling consistent learning across samplings.
-
LINC: Decoupling Local Consequence Scoring from Hidden Matching in Constructive Neural Routing
LINC decouples local consequence scoring from hidden matching in constructive neural routing solvers, cutting CVRPTW gaps for PolyNet from 13.83%/38.15% to 7.26%/14.71% on Solomon/Homberger benchmarks.
-
Temporal Reasoning Is Not the Bottleneck: A Probabilistic Inconsistency Framework for Neuro-Symbolic QA
Temporal reasoning is not the core bottleneck for LLMs on time-based QA; the real issue is unstructured text-to-event mapping, addressed by a neuro-symbolic system with PIS that reaches 100% accuracy on benchmarks whe...
-
Symmetry-Protected Lyapunov Neutral Modes in Equivariant Recurrent Networks
Exact equivariance under a Lie group guarantees at least dim(G/H) zero Lyapunov exponents tangent to the group orbit on compact invariant sets with nondegenerate orbit bundles.
-
Geometric Quantum Physics Informed Neural Network
GQPINNs add symmetry awareness to quantum PINNs via equivariant circuits, yielding lower mean absolute error and fewer parameters than standard QPINNs on linear and nonlinear PDE benchmarks.
-
Leveraging Data Symmetries to Select an Optimal Subset of Training Data under Label Noise
Exploiting data symmetries boosts k-NN to select near-optimal low-noise subsets from noisy datasets, approaching Bayes-optimal performance in high dimensions, with learned representations aiding partial symmetry knowledge.
-
Scale-Aware Adversarial Analysis: A Diagnostic for Generative AI in Multiscale Complex Systems
A new scale-aware diagnostic framework shows that unconstrained diffusion generative models exhibit structural freezing and instability instead of smooth physical responses under multiscale perturbations.
-
Stability Enhanced Gaussian Process Variational Autoencoders
SEGP-VAE learns stable low-dimensional LTI systems from video data by deriving GP mean and covariance from LTI equations and using a complete unconstrained parametrization of semi-contracting systems.
-
Toward a universal foundation model for graph-structured data
A pretrained graph model using feature-agnostic structural prompts matches or exceeds supervised baselines and shows strong zero-shot and few-shot transfer on held-out biomedical graphs, with a 21.8% ROC-AUC gain on SagePPI.
-
LAG-XAI: A Lie-Inspired Affine Geometric Framework for Interpretable Paraphrasing in Transformer Latent Spaces
LAG-XAI treats paraphrasing as affine flows in semantic manifolds using Lie-inspired approximations, achieving AUC 0.7713 on paraphrase detection and 95.3% hallucination detection on HaluEval.
-
Metriplector: From Field Theory to Neural Architecture
Metriplector treats neural computation as coupled metriplectic field dynamics whose stress-energy tensor readout achieves competitive results on vision, control, Sudoku, language modeling, and pathfinding with small p...
-
PhysEDA: Physics-Aware Learning Framework for Efficient EDA With Manhattan Distance Decay
PhysEDA folds separable Manhattan-distance exponential decay into linear attention and potential-based rewards, cutting complexity to linear while improving zero-shot transfer and sparse-reward performance on decoupli...
-
The Role of Node Features in Graph Pooling
Pooling improves graph classification only when node features align well with topology, and the authors provide a quantitative measure of this alignment quality.
-
Coupled Arnol'd cat maps on circulant graphs
Coupled Arnol'd cat maps on circulant graphs produce entropy independent of connectivity because translational symmetry cancels the expected increase from added links.
-
Exploring Time Conditioning in Diffusion Generative Models from Disjoint Noisy Data Manifolds
Aligning the DDIM forward diffusion process with flow-matching manifold evolution enables high-quality generation without time conditioning, and class-conditional synthesis is possible with an unconditional denoiser b...
-
Bridging the Dimensionality Gap: A Taxonomy and Survey of 2D Vision Model Adaptation for 3D Analysis
The paper offers a taxonomy of 2D-to-3D adaptation strategies divided into data-centric projection, architecture-centric 3D networks, and hybrid methods that combine both.
-
The Topology of Multimodal Fusion: Why Current Architectures Fail at Creative Cognition
Multimodal AI architectures share a geometric prior of modal separability termed contact topology that prevents creative cognition, addressable via a cruciform philosophical-mathematical framework using fiber bundles ...
Reference graph
Works this paper leans on
-
[1]
Uri Alon and Eran Yahav. On the bottleneck of graph neural networks and its practical implications. arXiv:2006.05205.
-
[2]
Brandon Anderson, Truong-Son Hy, and Risi Kondor. Cormorant: Covariant molecular neural networks. arXiv:1906.04015.
-
[3]
Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton. Layer normalization. arXiv:1607.06450.
-
[4]
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. arXiv:1409.0473.
-
[5]
Bassam Bamieh. Discovering transforms: A tutorial on circulant matrices, circular convolution, and the discrete Fourier transform. arXiv:1805.05533.
-
[6]
Peter W Battaglia, Razvan Pascanu, Matthew Lai, Danilo Rezende, and Koray Kavukcuoglu. Interaction networks for learning about objects, relations and physics. arXiv:1612.00222.
-
[7]
Peter W Battaglia, Jessica B Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, et al. Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261.
-
[8]
Dominique Beaini, Saro Passaro, Vincent Létourneau, William L Hamilton, Gabriele Corso, and Pietro Liò. Directional graph networks. arXiv:2010.02863.
-
[9]
Beatrice Bevilacqua, Yangze Zhou, and Bruno Ribeiro. Size-invariant graph representations for graph classification extrapolations. arXiv:2103.05045.
-
[10]
Cristian Bodnar, Fabrizio Frasca, Yu Guang Wang, Nina Otter, Guido Montúfar, Pietro Liò, and Michael Bronstein. Weisfeiler and Lehman go topological: Message passing simplicial networks. arXiv:2103.03212.
-
[11]
Davide Boscaini, Jonathan Masci, Emanuele Rodolà, and Michael Bronstein. Learning shape correspondence with anisotropic convolutional neural networks. In NIPS, 2016. Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Michael M Bronstein, and Daniel Cremers. Anisotropic diffusion descriptors. Computer Graphics Forum, 35(2):431–441, 2016. Sébastien Bougleux, ...
-
[12]
Giorgos Bouritsas, Fabrizio Frasca, Stefanos Zafeiriou, and Michael M Bronstein. Improving graph neural network expressivity via subgraph isomorphism counting. arXiv:2006.09252.
-
[13]
Tom B Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. arXiv:2005.14165.
-
[14]
Quentin Cappart, Didier Chételat, Elias Khalil, Andrea Lodi, Christopher Morris, and Petar Veličković. Combinatorial optimization and reasoning with graph neural networks. arXiv:2102.09544.
-
[15]
Ricky T Q Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural ordinary differential equations. arXiv:1806.07366.
-
[16]
Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv:1406.1078.
-
[17]
Taco S Cohen, Mario Geiger, Jonas Köhler, and Max Welling. Spherical CNNs. arXiv:1801.10130.
-
[18]
Tim Cooijmans, Nicolas Ballas, César Laurent, Çağlar Gülçehre, and Aaron Courville. Recurrent batch normalization. arXiv:1603.09025.
-
[19]
Gabriele Corso, Luca Cavalleri, Dominique Beaini, Pietro Liò, and Petar Veličković. Principal neighbourhood aggregation for graph nets. arXiv:2004.05718.
-
[20]
Miles Cranmer, Sam Greydanus, Stephan Hoyer, Peter Battaglia, David Spergel, and Shirley Ho. Lagrangian neural networks. arXiv:2003.04630.
-
[21]
Miles D Cranmer, Rui Xu, Peter Battaglia, and Shirley Ho. Learning symbolic physics with graph networks. arXiv:1909.05862.
-
[22]
Andreea Deac, Petar Veličković, Ognjen Milinković, Pierre-Luc Bacon, Jian Tang, and Mladen Nikolić. XLVIN: Executed latent value iteration nets. arXiv:2010.13146.
-
[23]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805.
-
[24]
Vijay Prakash Dwivedi and Xavier Bresson. A generalization of transformer networks to graphs. arXiv:2012.09699.
-
[25]
Carlos Esteves, Ameesh Makadia, and Kostas Daniilidis. Spin-weighted spherical CNNs. arXiv:2006.10731.
-
[26]
Matthias Fey, Jan-Gin Yuen, and Frank Weichert. Hierarchical inter-message passing for learning on molecular graphs. arXiv:2006.12179.
-
[27]
Kārlis Freivalds, Emīls Ozoliņš, and Agris Šostaks. Neural shuffle-exchange networks: Sequence processing in O(n log n) time. arXiv:1907.07897.
-
[28]
Fabian B Fuchs, Daniel E Worrall, Volker Fischer, and Max Welling. SE(3)-Transformers: 3D roto-translation equivariant attention networks. arXiv:2006.10503.
-
[29]
Alberto García-Durán and Mathias Niepert. Learning graph representations with embedding propagation. arXiv:1710.03059.
-
[30]
Leon A Gatys, Alexander S Ecker, and Matthias Bethge. Texture synthesis using convolutional neural networks. arXiv:1505.07376. Thomas Gaudelet, Ben Day, Arian R Jamasb, Jyothish Soman, Cristian Regep, Gertrude Liu, Jeremy BR Hayter, Richard Vickers, Charles Roberts, Jian Tang, et al. Utilising graph machine learning within drug discovery and develop...
-
[31]
Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. Neural message passing for quantum chemistry. arXiv:1704.01212.
-
[32]
Ian J Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. arXiv:1406.2661.
-
[33]
Alex Graves. Generating sequences with recurrent neural networks. arXiv:1308.0850.
-
[34]
Alex Graves, Greg Wayne, and Ivo Danihelka. Neural Turing machines. arXiv:1410.5401.
-
[35]
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, et al. Bootstrap your own latent: A new approach to self-supervised learning. arXiv:2006.07733.
-
[36]
Deisy Morselli Gysi, Ítalo Do Valle, Marinka Zitnik, Asher Ameli, Xiao Gan, Onur Varol, Helia Sanchez, Rebecca Marlene Baron, Dina Ghiassian, Joseph Loscalzo, et al. Network medicine framework for identifying drug repurposing opportunities for COVID-19. arXiv:2004.07229.
-
[37]
Moritz Hardt and Tengyu Ma. Identity matters in deep learning. arXiv:1611.04231.
-
[38]
Yedid Hoshen. VAIN: Attentional multi-agent predictive modeling. arXiv:1706.06122.
-
[39]
Michael Hutchinson, Charline Le Lan, Sheheryar Zaidi, Emilien Dupont, Yee Whye Teh, and Hyunjik Kim. LieTransformer: Equivariant self-attention for Lie groups. arXiv:2012.10885.
-
[40]
Sarah Itani and Dorina Thanou. Combining anatomical and functional networks for neuropathology identification: A case study on autism spectrum disorder. Medical Image Analysis, 69:101986.
-
[41]
Łukasz Kaiser and Ilya Sutskever. Neural GPUs learn algorithms. arXiv:1511.08228.
-
[42]
Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, and Koray Kavukcuoglu. Neural machine translation in linear time. arXiv:1610.10099.
-
[43]
Anees Kazi, Luca Cosmo, Nassir Navab, and Michael Bronstein. Differentiable graph module (DGM) graph convolutional networks. arXiv:2002.04999.
-
[44]
Henry Kenlay, Dorina Thanou, and Xiaowen Dong. Interpretable stability bounds for spectral graph filters. arXiv:2102.09587.
-
[45]
Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv:1412.6980.
-
[46]
Diederik P Kingma and Max Welling. Auto-encoding variational Bayes. arXiv:1312.6114.
-
[47]
Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. arXiv:1609.02907. Thomas N Kipf and Max Welling. Variational graph auto-encoders. arXiv:1611.07308. Dmitry B Kireev. ChemNet: A novel neural network based method for graph/property mapping. J. Chemical Information and Computer Sciences, 35(...
-
[48]
Johannes Klicpera, Janek Groß, and Stephan Günnemann. Directional message passing for molecular graphs. arXiv:2003.03123.
-
[49]
Patrick T Komiske, Eric M Metodiev, and Jesse Thaler. Energy flow networks: Deep sets for particle jets. Journal of High Energy Physics, 2019(1):121.
-
[50]
Karol Kurach, Marcin Andrychowicz, and Ilya Sutskever. Neural random-access machines. arXiv:1511.06392.
-
[51]
Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel. Gated graph sequence neural networks. arXiv:1511.05493.
-
[52]
Andreas Madsen and Alexander Rosenberg Johansen. Neural arithmetic units. arXiv:2001.05016.
-
[53]
Soha Sadat Mahdi, Nele Nauwelaers, Philip Joris, Giorgos Bouritsas, Shunwang Gong, Sergiy Bokhnyak, Susan Walsh, Mark Shriver, Michael Bronstein, and Peter Claes. 3D facial matching by spiral convolutional metric learning and a biometric fusion-net of demographic properties. arXiv:2009.04746.
-
[54]
Brandon Malone, Alberto Garcia-Duran, and Mathias Niepert. Learning representations of missing data for predicting patient outcomes. arXiv:1811.04752.
-
[55]
Haggai Maron, Heli Ben-Hamu, Nadav Shamir, and Yaron Lipman. Invariant and equivariant graph networks. arXiv:1812.09902.
-
[56]
Haggai Maron, Heli Ben-Hamu, Hadar Serviansky, and Yaron Lipman. Provably powerful graph networks. arXiv:1905.11136.
-
[57]
Jason D McEwen, Christopher GR Wallis, and Augustine N Mavor-Parker. Scattering networks on the sphere for scalable and rotationally equivariant spherical CNNs. arXiv:2102.02828.
-
[58]
Song Mei, Theodor Misiakiewicz, and Andrea Montanari. Learning with invariances in random features and kernel models. arXiv:2102.13219.
-
[59]
Jovana Mitrovic, Brian McWilliams, Jacob Walker, Lars Buesing, and Charles Blundell. Representation learning via invariant causal mechanisms. arXiv:2010.07922.
-
[60]
Federico Monti, Fabrizio Frasca, Davide Eynard, Damon Mannion, and Michael M Bronstein. Fake news detection on social media using geometric deep learning. arXiv:1902.06673.
-
[61]
Kevin Murphy, Yair Weiss, and Michael I Jordan. Loopy belief propagation for approximate inference: An empirical study. arXiv:1301.6725.
-
[62]
Ryan L Murphy, Balasubramaniam Srinivasan, Vinayak Rao, and Bruno Ribeiro. Janossy pooling: Learning deep permutation-invariant functions for variable-size inputs. arXiv:1811.01900.
-
[63]
Giuseppe Patanè. Fourier-based and rational graph filters for spectral processing. arXiv:2011.04055.
-
[64]
Tobias Pfaff, Meire Fortunato, Alvaro Sanchez-Gonzalez, and Peter W Battaglia. Learning mesh-based simulation with graph networks. arXiv:2010.03409.
- [65]
-
[66]
Noam Razin and Nadav Cohen. Implicit regularization in deep learning may not be explainable by norms. arXiv:2005.06398.
-
[67]
Scott Reed and Nando De Freitas. Neural programmer-interpreters. arXiv:1511.06279.
-
[68]
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. arXiv:1506.01497.
-
[69]
Emma Rocheteau, Pietro Liò, and Stephanie Hyland. Temporal pointwise convolutional networks for length of stay prediction in the intensive care unit. arXiv:2007.09483.
-
[70]
Emma Rocheteau, Catherine Tong, Petar Veličković, Nicholas Lane, and Pietro Liò. Predicting patient outcomes with graph representation learning. arXiv:2101.03940.
-
[71]
Frank Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6):386, 1958. Emanuele Rossi, Ben Chamberlain, Fabrizio Frasca, Davide Eynard, Federico Monti, and Michael Bronstein. Temporal graph networks for deep learning on dynamic graphs. arXiv:2006.10637.
-
[72]
Tim Salimans and Diederik P Kingma. Weight normalization: A simple reparameterization to accelerate training of deep neural networks. arXiv:1602.07868.
-
[73]
Alvaro Sanchez-Gonzalez, Victor Bapst, Kyle Cranmer, and Peter Battaglia. Hamiltonian graph networks with ODE integrators. arXiv:1909.12790.
-
[74]
Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, and Timothy Lillicrap. Relational recurrent neural networks. arXiv:1806.01822.
-
[75]
Shibani Santurkar, Dimitris Tsipras, Andrew Ilyas, and Aleksander Madry. How does batch normalization help optimization? arXiv:1805.11604.
-
[76]
Ryoma Sato, Makoto Yamada, and Hisashi Kashima. Random features strengthen graph neural networks. arXiv:2002.03155.
-
[77]
Victor Garcia Satorras, Emiel Hoogeboom, and Max Welling. E(n) equivariant graph neural networks. arXiv:2102.09844.
-
[78]
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv:1707.06347.
-
[79]
Ohad Shamir and Gal Vardi. Implicit regularization in ReLU networks with the square loss. arXiv:2012.05156.
-
[80]
Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556.