
arxiv: 2606.28380 · v1 · pith:L7H324ANnew · submitted 2026-06-20 · 💻 cs.NE · cs.AI

Distilling a Modular Reservoir Through a Genomic Bottleneck

Pith reviewed 2026-06-30 11:29 UTC · model grok-4.3

classification 💻 cs.NE cs.AI
keywords hypernetworks · modular reservoir computing · curriculum meta-learning · sparse recurrent networks · temporal tasks · genomic bottleneck · connectivity generation

The pith

Hypernetworks learn a compressed generative process that produces sparse modular reservoirs capable of solving difficult temporal tasks with minimal training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that a hypernetwork, trained via curriculum-based meta-learning, can act as a generative model to decode connectivity for a modular reservoir from a compressed representation. This draws from the biological analogy of a genome guiding initial neural structure before experience-based refinement. If successful, the resulting sparse recurrent networks handle temporal processing tasks efficiently without needing extensive additional optimization or losing robustness. A sympathetic reader would care because the method offers a way to initialize structured networks that combine evolutionary-style compression with developmental plasticity.

Core claim

A hypernetwork trained through curriculum-based meta-learning can generate the connectivity of a modular reservoir from a compressed blueprint, yielding sparse recurrent networks that solve difficult temporal tasks with minimal training and without concessions to robustness.

What carries the argument

The hypernetwork as a compressed generative process that produces modular and sparse reservoir connectivity.
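
To make the mechanism concrete, here is a minimal NumPy sketch of a shared decoder that emits one sparse weight matrix per module. It is an illustration under loud assumptions, not the paper's method: this page does not describe the g-net architecture, the module embedding, or the sparsification rule, so `GNet`, `embed`, `decode_module`, the sinusoidal embedding, and the top-k magnitude mask are all hypothetical choices. The weights are random rather than meta-learned; the point is only the parameter bookkeeping.

```python
import numpy as np


class GNet:
    """Hypothetical hypernetwork: maps a module index to an inter-module weight matrix."""

    def __init__(self, n_units, embed_dim=4, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.n_units = n_units
        self.embed_dim = embed_dim
        # The "genome": a small MLP shared by every module (assumed architecture).
        self.w1 = rng.normal(0.0, 1.0 / np.sqrt(embed_dim), (embed_dim, hidden))
        self.w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, n_units * n_units))

    def embed(self, m):
        # Assumed sinusoidal embedding of the module index m.
        freqs = 2.0 ** np.arange(self.embed_dim // 2)
        phase = 0.1 * m * freqs
        return np.concatenate([np.sin(phase), np.cos(phase)])

    def decode_module(self, m, density=0.2):
        """Decode the weights feeding module m, keeping only the largest entries."""
        h = np.tanh(self.embed(m) @ self.w1)
        w = (h @ self.w2).reshape(self.n_units, self.n_units)
        k = max(1, int(density * w.size))  # number of connections kept
        threshold = np.sort(np.abs(w), axis=None)[-k]
        return np.where(np.abs(w) >= threshold, w, 0.0)

    def n_params(self):
        # Size of the compressed blueprint (all shared decoder weights).
        return self.w1.size + self.w2.size


gnet = GNet(n_units=8)
modules = [gnet.decode_module(m) for m in range(40)]
direct_params = sum(w.size for w in modules)  # one free weight per connection
print(gnet.n_params(), direct_params)
```

In this toy configuration the shared decoder holds 1,088 parameters, while parameterizing the 40 matrices directly would take 2,560. The ratio is an artifact of the chosen sizes and says nothing about the paper's reported efficiency.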

If this is right

  • The generated networks require only minimal training on new temporal tasks.
  • Sparsity and modularity in the produced connectivity preserve task performance and robustness.
  • Curriculum meta-learning enables the hypernetwork to scale the generative process across varying task difficulties.
  • The approach bridges compressed blueprint generation with subsequent plasticity for efficient recurrent computation.
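
The robustness point in the list above can be probed with a weight-perturbation sweep of the kind the Figure 4 caption describes. The sketch below is hedged: the caption is truncated on this page, so the noise model (additive Gaussian noise scaled by the standard deviation of the nonzero weights, with the sparsity pattern held fixed) and the names `accuracy_under_perturbation` and `toy_evaluate` are assumptions. `toy_evaluate` is a stand-in for actually running a task.

```python
import numpy as np


def accuracy_under_perturbation(weights, evaluate, sigmas, n_trials=20, seed=0):
    """Mean accuracy of `evaluate` after adding Gaussian noise to `weights`.

    Assumptions: `evaluate(w)` returns an accuracy in [0, 1]; noise is additive
    and scaled by the standard deviation of the nonzero weights.
    """
    rng = np.random.default_rng(seed)
    mask = weights != 0  # keep the sparsity pattern fixed
    scale = weights[mask].std()
    results = {}
    for sigma in sigmas:
        scores = []
        for _ in range(n_trials):
            noise = rng.normal(0.0, sigma * scale, size=weights.shape)
            scores.append(evaluate(weights + noise * mask))
        results[sigma] = float(np.mean(scores))
    return results


w = np.array([[0.0, 1.0], [-1.0, 0.0]])


def toy_evaluate(w_perturbed):
    # Stand-in for running a task: fraction of nonzero weights that keep their sign.
    mask = w != 0
    return float(np.mean(np.sign(w_perturbed[mask]) == np.sign(w[mask])))


curve = accuracy_under_perturbation(w, toy_evaluate, sigmas=[0.0, 0.5, 2.0])
print(curve)
```

Swapping `toy_evaluate` for a function that runs the trained network on a task would give an accuracy-versus-noise curve for directly and indirectly trained weights alike.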

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be extended to generate initial connectivity for other recurrent architectures beyond reservoirs.
  • If the hypernetwork generalizes across domains, it might reduce the need for task-specific architecture search in sequential data problems.
  • Testing the generated networks on real-world time series benchmarks would clarify practical utility beyond synthetic tasks.

Load-bearing premise

A hypernetwork trained via curriculum meta-learning can reliably produce functional modular and sparse connectivity that generalizes to difficult temporal tasks.

What would settle it

A direct test in which hypernetwork-generated reservoirs consistently fail to solve the target temporal tasks or require substantial further training to reach performance would falsify the central claim.

Figures

Figures reproduced from arXiv: 2606.28380 by Charley M. Wu, Emmanouil Giannakakis, Mani Hamidi, Sina Khajehabdollahi.

Figure 1. [PITH_FULL_IMAGE:figures/full_fig_p002_1.png]
Figure 2. Implementing an indirect learning scheme using a g-net. Learning takes place at two levels: in the “evolutionary” or genomic level, the hypernetworks (“g-nets”), are trained to generate just the inter-module weights of the RNN (the “p-net”). In the “lifelong learning” phase, the remaining input and output parameters of the p-net are further trained. b) following past (Khajehabdollahi et al., 2024; Hamidi…
Figure 3. Performance and parameter efficiency. A. Parameter count (|Θ|) scaling: directly trained parameters (solid) grow with N and M², while indirectly trained parameters (dashed) scale much more favorably. B. Learning efficiency (maximum N solved per trainable parameter, N_max/|Θ|) over training epochs. Solid lines = direct (both tasks), dashed = indirect (parity), dotted = indirect (DMS). The indirect adva…
Figure 4. A,B shows how hierarchical networks’ performance is affected by perturbations of their connectivity weights for both parity and DMS, where we measured the average accuracy with which the perturbed networks continued to solve the tasks. We perturbed the learned W^F_m connections of both directly and indirectly trained networks that had learned to solve up to and including N = 40. Different magnitudes of pert…
Figure 5. Compressibility of W^F across modules. Panels A–D show representative parity networks. A. Directly trained connections are fully uncorrelated between modules, while indirectly learned connections (B) are highly conserved. C. Directly trained weights change rapidly across modules; D. indirectly learned weights show minimal variation. E. SVD rank-90 (number of components for 90% reconstruction fidelity) for b…
Figure 6. Within-module compressibility of W^F_m. A. Hierarchically clustered weight heatmaps at module m=20 for a representative parity network (dashed border = direct, solid = indirect). Indirectly learned weights exhibit more regular block structure. B. Within-module SVD rank-90 across depth, split by training method: Direct (left) and Indirect (right, note different y-scale). Solid lines = parity, dash-dot = DM…
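
The "SVD rank-90" measure named in the Figure 5 and Figure 6 captions can be sketched as follows. The captions define it only as the number of components needed for 90% reconstruction fidelity, so this sketch assumes fidelity means the fraction of squared Frobenius norm retained; the function name `svd_rank` and the synthetic "independent" and "conserved" weight stacks are hypothetical and are not the paper's data.

```python
import numpy as np


def svd_rank(matrix, fidelity=0.90):
    """Smallest number of singular components whose cumulative energy reaches `fidelity`.

    Assumption: "fidelity" is the fraction of squared Frobenius norm retained.
    """
    s = np.linalg.svd(matrix, compute_uv=False)
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    # searchsorted returns the first index where energy >= fidelity.
    return int(np.searchsorted(energy, fidelity) + 1)


rng = np.random.default_rng(0)
n_modules, n_weights = 20, 64

# Each row is one module's flattened weight matrix.
independent = rng.normal(size=(n_modules, n_weights))  # unrelated modules
template = rng.normal(size=n_weights)
conserved = np.tile(template, (n_modules, 1)) + 0.01 * rng.normal(
    size=(n_modules, n_weights)
)  # one shared pattern plus small variation

print(svd_rank(independent), svd_rank(conserved))
```

A stack of near-identical modules needs a single component, while unrelated modules need many, which is the sense in which a low rank-90 indicates compressible connectivity.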
Original abstract

The intricate structures of biological neural networks largely emerge during development, guided by a comparatively compressed blueprint encoded in the genome. The connectivity that emerges from this decoding process is rich in structure, and already equips the organism with functional modules upon birth. This initial structure serves as a scaffold that can be gradually refined and fine-tuned through lifelong experience, via a variety of plasticity mechanisms. Drawing inspiration from this interaction between evolutionary and developmental modes of learning, we use hypernetworks to learn a compressed generative process that generates the connectivity of a modular reservoir. We show that this marriage between curriculum-based meta-learning and modular reservoir computing can generate sparse recurrent networks that solve difficult temporal tasks with minimal training and without concessions to robustness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes using hypernetworks trained via curriculum-based meta-learning to learn a compressed generative process ('genomic bottleneck') that produces the connectivity of a modular reservoir. The central claim is that this combination generates sparse recurrent networks capable of solving difficult temporal tasks with minimal training and without concessions to robustness, drawing an analogy to biological development where a compressed genome guides initial structured connectivity that is later refined.

Significance. If the empirical claims hold with proper validation, the approach could provide a biologically inspired route to generating structured, sparse reservoirs that generalize efficiently, potentially advancing meta-learning applications in reservoir computing by reducing the need for extensive per-task training while preserving robustness properties.

major comments (2)
  1. [Abstract] The central performance claim ('generate sparse recurrent networks that solve difficult temporal tasks with minimal training and without concessions to robustness') is asserted without any quantitative results, baselines, ablation studies, error bars, task descriptions, curriculum schedule, sparsity/modularity metrics, training budget comparisons, or robustness measures (e.g., noise tolerance). This makes it impossible to assess whether the generated connectivity transfers to held-out tasks while preserving the stated efficiency and robustness.
  2. [Abstract / Introduction] The assumption that the hypernetwork reliably produces functional modular/sparse connectivity that generalizes is stated as the core contribution but lacks any description of the temporal tasks, how the curriculum meta-learning schedule is constructed, or quantitative evidence of generalization beyond meta-training.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive comments. We address the major points below and outline revisions to improve clarity and accessibility of the empirical claims.

Point-by-point responses
  1. Referee: [Abstract] The central performance claim ('generate sparse recurrent networks that solve difficult temporal tasks with minimal training and without concessions to robustness') is asserted without any quantitative results, baselines, ablation studies, error bars, task descriptions, curriculum schedule, sparsity/modularity metrics, training budget comparisons, or robustness measures (e.g., noise tolerance). This makes it impossible to assess whether the generated connectivity transfers to held-out tasks while preserving the stated efficiency and robustness.

    Authors: We agree that the abstract, as a high-level summary, does not include the requested quantitative details, metrics, or evidence. The full manuscript reports these in the Experiments section, including task performance numbers, baseline comparisons, ablations on the genomic bottleneck, error bars, curriculum details, sparsity and modularity metrics, training budgets, and robustness measures such as noise tolerance. To address the concern directly, we will revise the abstract to incorporate key quantitative highlights supporting the central claim.

    Revision: yes

  2. Referee: [Abstract / Introduction] The assumption that the hypernetwork reliably produces functional modular/sparse connectivity that generalizes is stated as the core contribution but lacks any description of the temporal tasks, how the curriculum meta-learning schedule is constructed, or quantitative evidence of generalization beyond meta-training.

    Authors: The manuscript describes the temporal tasks, curriculum meta-learning schedule construction, and quantitative generalization evidence in the Methods and Results sections. However, we acknowledge that the abstract and introduction do not sufficiently preview these elements to support the core claim upfront. We will add a concise overview of the tasks, schedule, and generalization metrics to the introduction.

    Revision: yes

Circularity Check

0 steps flagged

No circularity: conceptual framework with no fitted predictions or self-referential reductions

Full rationale

The provided abstract and description contain no equations, parameter fits, or derivation steps. The central claim is an empirical assertion that a hypernetwork trained via curriculum meta-learning generates functional modular reservoirs; this is presented as a demonstration rather than a mathematical reduction of outputs to inputs by construction. No self-citations, ansatzes, or uniqueness theorems are invoked in the given text. The derivation chain is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no equations or implementation details, so free parameters, axioms, and invented entities cannot be enumerated.



Reference graph

Works this paper leans on

87 extracted references · 10 canonical work pages · 4 internal anchors

  1. [1]

    Duplication of modules facilitates the evolution of functional specialization

    Calabretta, R and Nolfi, S and Parisi, D and Wagner, G P. Duplication of modules facilitates the evolution of functional specialization. Artif. Life

  2. [2]

    From DNA to diversity: molecular genetics and the evolution of animal design

    From DNA to diversity: molecular genetics and the evolution of animal design. 2013.

  3. [3]

    Teacher-student compression with generative adversarial networks

    Liu, Ruishan and Fusi, Nicolo and Mackey, Lester. Teacher-student compression with generative adversarial networks. arXiv [cs.LG]

  4. [4]

    Teacher-class network: A neural network compression mechanism

    Malik, Shaiq Munir and Haider, Muhammad Umair and Tharani, Mohbat and Rasheed, Musab and Taj, Murtaza. Teacher-class network: A neural network compression mechanism. arXiv [cs.LG]

  5. [5]

    Superposition of many models into one

    Cheung, Brian and Terekhov, Alex and Chen, Yubei and Agrawal, Pulkit and Olshausen, Bruno. Superposition of many models into one. arXiv:1902.05522

  6. [6]

    Multi-Rate VAE : Train Once, Get the Full Rate-Distortion Curve

    Bae, Juhan and Zhang, Michael R and Ruan, Michael and Wang, Eric and Hasegawa, So and Ba, Jimmy and Grosse, Roger. Multi-Rate VAE : Train Once, Get the Full Rate-Distortion Curve. arXiv [cs.LG]

  7. [7]

    Teasing Apart Architecture and Initial Weights as Sources of Inductive Bias in Neural Networks

    Teasing Apart Architecture and Initial Weights as Sources of Inductive Bias in Neural Networks. arXiv preprint arXiv:2502.20237.

  8. [8]

    Generalization in reinforcement learning with selective noise injection and information bottleneck

    Generalization in reinforcement learning with selective noise injection and information bottleneck. Advances in Neural Information Processing Systems.

  9. [9]

    Meta-Learning an Evolvable Developmental Encoding

    Meta-Learning an Evolvable Developmental Encoding. ALIFE 2024: Proceedings of the 2024 Artificial Life Conference.

  10. [10]

    Structurally Flexible Neural Networks: Evolving the Building Blocks for General Agents

    Structurally Flexible Neural Networks: Evolving the Building Blocks for General Agents. Proceedings of the Genetic and Evolutionary Computation Conference.

  11. [11]

    Growing artificial neural networks for control: the role of neuronal diversity

    Growing artificial neural networks for control: the role of neuronal diversity. Proceedings of the Genetic and Evolutionary Computation Conference Companion.

  12. [12]

    Towards self-assembling artificial neural networks through neural developmental programs

    Towards self-assembling artificial neural networks through neural developmental programs. Artificial Life Conference Proceedings 35, 2023.

  13. [13]

    Najarro, Sudhakaran, and Risi (2023)

    Najarro, Elias and Sudhakaran, Shyam and Risi, Sebastian. 2023. doi:10.1162/isal_a_00697

  14. [14]

    Recent advances in physical reservoir computing: A review

    Tanaka, Gouhei and Yamane, Toshiyuki and Héroux, Jean Benoit and Nakane, Ryosho and Kanazawa, Naoki and Takeda, Seiji and Numata, Hidetoshi and Nakano, Daiju and Hirose, Akira. Recent advances in physical reservoir computing: A review. Neural Netw

  15. [15]

    Understanding axon guidance: are we nearly there yet?

    Understanding axon guidance: are we nearly there yet? Development, 2018.

  16. [16]

    On the existence of information bottlenecks in living and non-living systems

    Crosscombe, Michael and Sato, Hiroki. On the existence of information bottlenecks in living and non-living systems. The 2023 Conference on Artificial Life

  17. [17]

    The basal ganglia over 500 million years

    Grillner, Sten and Robertson, Brita. The basal ganglia over 500 million years. Curr. Biol

  18. [18]

    Evolutionary conservation of the basal ganglia as a common vertebrate mechanism for action selection

    Stephenson-Jones, Marcus and Samuelsson, Ebba and Ericsson, Jesper and Robertson, Brita and Grillner, Sten. Evolutionary conservation of the basal ganglia as a common vertebrate mechanism for action selection. Curr. Biol

  19. [19]

    Resynthesizing behavior through phylogenetic refinement

    Cisek, Paul. Resynthesizing behavior through phylogenetic refinement. Atten. Percept. Psychophys

  20. [20]

    Meta-learning by the Baldwin effect

    Fernando, Chrisantha Thomas and Sygnowski, Jakub and Osindero, Simon and Wang, Jane and Schaul, Tom and Teplyashin, Denis and Sprechmann, Pablo and Pritzel, Alexander and Rusu, Andrei A. Meta-learning by the Baldwin effect. arXiv [cs.NE]

  21. [21]

    Novelty and imitation within the brain: a Darwinian neurodynamic approach to combinatorial problems

    Czégel, Dániel and Giaffar, Hamza and Csillag, Márton and Futó, Bálint and Szathmáry, Eörs. Novelty and imitation within the brain: a Darwinian neurodynamic approach to combinatorial problems. Sci. Rep

  22. [22]

    A cortical information bottleneck during decision-making

    A cortical information bottleneck during decision-making. bioRxiv.

  23. [23]

    Circuits for integrating learned and innate valences in the insect brain

    Circuits for integrating learned and innate valences in the insect brain. eLife, 2021.

  24. [24]

    Reservoir computing approaches to recurrent neural network training

    Reservoir computing approaches to recurrent neural network training. Computer Science Review, 2009.

  25. [25]

    Deep Reservoir Computing

    Gallicchio, Claudio and Micheli, Alessio. Deep Reservoir Computing. Reservoir Computing: Theory, Physical Implementations, and Applications. 2021. doi:10.1007/978-981-13-1687-6_4

  26. [26]

    A role for relaxed selection in the evolution of the language capacity

    Deacon, Terrence W. A role for relaxed selection in the evolution of the language capacity. Proc. Natl. Acad. Sci. U. S. A

  27. [27]

    Inductive biases of neural network modularity in spatial navigation

    Inductive biases of neural network modularity in spatial navigation. Science Advances, 2024.

  28. [28]

    Curiosity driven exploration of learned disentangled goal spaces

    Laversanne-Finot, Adrien and Péré, Alexandre and Oudeyer, Pierre-Yves. Curiosity driven exploration of learned disentangled goal spaces. arXiv [cs.LG]

  29. [29]

    Representation learning in deep RL via discrete information bottleneck

    Islam, Riashat and Zang, Hongyu and Tomar, Manan and Didolkar, Aniket and Islam, Md Mofijul and Arnob, Samin Yeasar and Iqbal, Tariq and Li, Xin and Goyal, Anirudh and Heess, Nicolas and Lamb, Alex. Representation learning in deep RL via discrete information bottleneck. arXiv [cs.LG]

  30. [30]

    Molecular mechanisms of mammalian DNA repair and the DNA damage checkpoints

    Molecular mechanisms of mammalian DNA repair and the DNA damage checkpoints. Annual Review of Biochemistry, 2004.

  31. [31]

    Error-correcting codes and information in biology

    Error-correcting codes and information in biology. BioSystems, 2019.

  32. [32]

    Natural selection finds natural gradient

    Natural selection finds natural gradient. arXiv preprint arXiv:2001.08028.

  33. [33]

    Evolutionary Optimization of Model Merging Recipes

    Akiba, Takuya and Shing, Makoto and Tang, Yujin and Sun, Qi and Ha, David. Evolutionary Optimization of Model Merging Recipes. arXiv [cs.NE]

  34. [34]

    Designing neural networks through neuroevolution

    Designing neural networks through neuroevolution. Nature Machine Intelligence, 2019.

  35. [35]

    Drop-bottleneck: Learning discrete compressed representation for noise-robust exploration

    Kim, Jaekyeom and Kim, Minjung and Woo, Dongyeon and Kim, Gunhee. Drop-bottleneck: Learning discrete compressed representation for noise-robust exploration. arXiv [cs.LG]

  36. [36]

    Measuring compositionality in representation learning

    Andreas, Jacob. Measuring compositionality in representation learning. Int Conf Learn Represent

  37. [37]

    A fast and independent architecture of artificial neural network for permeability prediction

    A fast and independent architecture of artificial neural network for permeability prediction. Journal of Petroleum Science and Engineering, 2012.

  38. [38]

    Optimal modularity and memory capacity of neural reservoirs

    Optimal modularity and memory capacity of neural reservoirs. Network Neuroscience, 2019.

  39. [39]

    The modular organization of the cerebral cortex: Evolutionary significance and possible links to neurodevelopmental conditions

    The modular organization of the cerebral cortex: Evolutionary significance and possible links to neurodevelopmental conditions. Journal of Comparative Neurology, 2019.

  40. [40]

    The modular and integrative functional architecture of the human brain

    The modular and integrative functional architecture of the human brain. Proceedings of the National Academy of Sciences, 2015.

  41. [41]

    Design and evolution of modular neural network architectures

    Design and evolution of modular neural network architectures. Neural Networks, 1994.

  42. [42]

    Emergent mechanisms for long timescales depend on training curriculum and affect performance in memory tasks

    Emergent mechanisms for long timescales depend on training curriculum and affect performance in memory tasks. The Twelfth International Conference on Learning Representations.

  43. [43]

    Modular and hierarchically modular organization of brain networks

    Modular and hierarchically modular organization of brain networks. Frontiers in Neuroscience, 2010.

  44. [44]

    Principled Weight Initialization for Hypernetworks

    Principled Weight Initialization for Hypernetworks. 2023.

  45. [45]

    Breaking neural network scaling laws with modularity

    Boopathy, Akhilan and Jiang, Sunshine and Yue, William and Hwang, Jaedong and Iyer, Abhiram and Fiete, Ila. Breaking neural network scaling laws with modularity. arXiv [cs.LG]

  46. [46]

    Don't cut corners: Exact conditions for modularity in biologically inspired representations

    Dorrell, Will and Hsu, Kyle and Hollingsworth, Luke and Lee, Jin Hwa and Wu, Jiajun and Finn, Chelsea and Latham, Peter E and Behrens, Tim E J and Whittington, James C R. Don't cut corners: Exact conditions for modularity in biologically inspired representations. arXiv [q-bio.NC]

  47. [47]

    Inductive biases of neural network modularity in spatial navigation

    Zhang, Ruiyi and Pitkow, Xaq and Angelaki, Dora E. Inductive biases of neural network modularity in spatial navigation. Sci. Adv

  48. [48]

    A critique of pure learning and what artificial neural networks can learn from animal brains

    A critique of pure learning and what artificial neural networks can learn from animal brains. Nature Communications, 2019.

  49. [49]

    Encoding innate ability through a genomic bottleneck

    Encoding innate ability through a genomic bottleneck. Proceedings of the National Academy of Sciences, 2024.

  50. [50]

    HyperNetworks

    Ha, David and Dai, Andrew and Le, Quoc V. HyperNetworks. arXiv [cs.LG]

  51. [51]

    Programmed and self-organized flow of information during morphogenesis

    Collinet, Claudio and Lecuit, Thomas. Programmed and self-organized flow of information during morphogenesis. Nat. Rev. Mol. Cell Biol

  52. [52]

    The Genomic Code: the genome instantiates a generative model of the organism

    Mitchell, Kevin J and Cheney, Nick. The Genomic Code: the genome instantiates a generative model of the organism. Trends Genet

  53. [53]

    Compositional pattern producing networks: A novel abstraction of development

    Compositional pattern producing networks: A novel abstraction of development. Genetic Programming and Evolvable Machines, 2007.

  54. [54]

    Evolving Self-Assembling Neural Networks: From Spontaneous Activity to Experience-Dependent Learning

    Evolving Self-Assembling Neural Networks: From Spontaneous Activity to Experience-Dependent Learning. ALIFE 2024: Proceedings of the 2024 Artificial Life Conference.

  55. [55]

    Meta-Learning an Evolvable Developmental Encoding

    Meta-Learning an Evolvable Developmental Encoding. arXiv preprint arXiv:2406.09020.

  56. [56]

    A tale of two algorithms: Structured slots explain prefrontal sequence memory and are unified with hippocampal cognitive maps

    Whittington, James C R and Dorrell, William and Behrens, Timothy E J and Ganguli, Surya and El-Gaby, Mohamady. A tale of two algorithms: Structured slots explain prefrontal sequence memory and are unified with hippocampal cognitive maps. Neuron

  57. [57]

    Minimum Description Length recurrent neural networks

    Lan, Nur and Geyer, Michal and Chemla, Emmanuel and Katzir, Roni. Minimum Description Length recurrent neural networks. arXiv [cs.CL]

  58. [58]

    A brief review of hypernetworks in deep learning

    A brief review of hypernetworks in deep learning. Artificial Intelligence Review, 2024.

  59. [59]

    An enhanced hypercube-based encoding for evolving the placement, density, and connectivity of neurons

    An enhanced hypercube-based encoding for evolving the placement, density, and connectivity of neurons. Artificial Life, 2012.

  60. [60]

    The Neural Race Reduction: Dynamics of Abstraction in Gated Networks

    Saxe, Andrew M and Sodhani, Shagun and Lewallen, Sam. The Neural Race Reduction: Dynamics of Abstraction in Gated Networks. arXiv [cs.LG]

  61. [61]

    Proving the Lottery Ticket Hypothesis: Pruning is All You Need

    Malach, Eran and Yehudai, Gilad and Shalev-Schwartz, Shai and Shamir, Ohad. Proving the Lottery Ticket Hypothesis: Pruning is All You Need. International Conference on Machine Learning

  62. [62]

    Failures of gradient-based Deep Learning

    Shalev-Shwartz, Shai and Shamir, Ohad and Shammah, Shaked. Failures of gradient-based Deep Learning. arXiv [cs.LG]

  63. [63]

    The lottery ticket hypothesis: Finding sparse, trainable neural networks

    Frankle, Jonathan and Carbin, Michael. The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv [cs.LG]

  64. [64]

    Multiplicative interactions and where to find them

    Jayakumar, Siddhant M and Menick, Jacob and Czarnecki, Wojciech M and Schwarz, Jonathan and Rae, Jack W and Osindero, Simon and Teh, Y and Harley, Tim and Pascanu, Razvan. Multiplicative interactions and where to find them. Int Conf Learn Represent

  65. [65]

    The genesis and evolution of homeobox gene clusters

    Garcia-Fernàndez, Jordi. The genesis and evolution of homeobox gene clusters. Nat. Rev. Genet

  66. [66]

    Designing neural networks through neuroevolution

    Stanley, Kenneth O and Clune, Jeff and Lehman, Joel and Miikkulainen, Risto. Designing neural networks through neuroevolution. Nature Machine Intelligence

  67. [67]

    Complex computation from developmental priors

    Barabási, Dániel L and Beynon, Taliesin and Katona, Ádám and Perez-Nieves, Nicolas. Complex computation from developmental priors. Nat. Commun

  68. [68]

    From Nodes to Networks: Evolving Recurrent Neural Networks

    From Nodes to Networks: Evolving Recurrent Neural Networks. 2018.

  69. [69]

    Self-Tuning Networks: Bilevel optimization of hyperparameters using structured best-response functions

    MacKay, Matthew and Vicol, Paul and Lorraine, Jon and Duvenaud, David and Grosse, Roger. Self-Tuning Networks: Bilevel optimization of hyperparameters using structured best-response functions. arXiv [cs.LG]

  70. [70]

    Modular growth of hierarchical networks: Efficient, general, and robust curriculum learning

    Hamidi, Mani and Khajehabdollahi, Sina and Giannakakis, Emmanouil and Schäfer, Tim J and Levina, Anna and Wu, Charley M. Modular growth of hierarchical networks: Efficient, general, and robust curriculum learning. The 2024 Conference on Artificial Life

  71. [71]

    Stochastic Hyperparameter Optimization through Hypernetworks

    Lorraine, Jonathan and Duvenaud, David. Stochastic Hyperparameter Optimization through Hypernetworks. arXiv [cs.LG]

  72. [72]

    The evolutionary origins of modularity

    The evolutionary origins of modularity. Proceedings of the Royal Society B: Biological Sciences, 2013.

  73. [73]

    Fourier features let networks learn high frequency functions in low dimensional domains

    Tancik, Matthew and Srinivasan, Pratul P and Mildenhall, Ben and Fridovich-Keil, Sara and Raghavan, Nithin and Singhal, Utkarsh and Ramamoorthi, Ravi and Barron, Jonathan T and Ng, Ren. Fourier features let networks learn high frequency functions in low dimensional domains. arXiv [cs.CV]

  74. [74]

    Model-agnostic meta-learning for fast adaptation of deep networks

    Model-agnostic meta-learning for fast adaptation of deep networks. International Conference on Machine Learning, 2017.

  75. [75]

    DARTS: Differentiable Architecture Search

    DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055.

  76. [76]

    Meta architecture search

    Meta architecture search. Advances in Neural Information Processing Systems.

  77. [77]

    Meta-learning of neural architectures for few-shot learning

    Meta-learning of neural architectures for few-shot learning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

  78. [78]

    A reservoir of timescales in random neural networks

    A reservoir of timescales in random neural networks. arXiv:2110.09165 [cond-mat, physics:nlin, q-bio], 2021.

  79. [79]

    Training Compute-Optimal Large Language Models

    Training compute-optimal large language models. arXiv preprint arXiv:2203.15556.

  80. [80]

    LoRA: Low-rank adaptation of large language models

    LoRA: Low-rank adaptation of large language models. ICLR.

Showing first 80 references.