pith.machine: review for the scientific record
cs.NE
Neural and Evolutionary Computing
Covers neural networks, connectionism, genetic algorithms, artificial life, adaptive behavior. Roughly includes some material in ACM Subject Class C.1.3, I.2.6, I.5.
Spiking Neural Networks (SNNs) are a promising framework for event-driven temporal processing. Prior work has improved temporal modeling through richer neuron dynamics and network-level mechanisms such as recurrence and delays, but it remains unclear how individual spiking neurons should specialize within a network. In this work, we introduce FiTS, a spiking neuron that factorizes temporal computation within each neuron into Frequency Selectivity (FS) and Temporal Shaping (TS). The FS module parameterizes each neuron's target frequency as the maximizer of its subthreshold magnitude response, while the TS module reshapes when frequency components contribute to membrane voltage accumulation through group-delay modulation. On auditory benchmarks where frequency selectivity and timing are central to the input structure, FiTS consistently improves over a plain Leaky Integrate-and-Fire (LIF) baseline in simple feedforward SNNs without recurrence or network-level delays, while remaining competitive with strong temporal SNN baselines. Beyond accuracy, the learned target frequencies and group-delay shifts provide interpretable neuron-level summaries of the frequency and timing organization learned within the network.
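The abstract does not give the neuron's equations, so the following is a minimal DSP-style sketch of the two modules under my own assumptions: FS as a damped complex one-pole resonator whose subthreshold magnitude response peaks at a learnable angular frequency `omega`, and TS as a first-order all-pass stage whose coefficient `a` reshapes group delay while leaving the magnitude response flat. Names and constants are illustrative, not the authors' implementation.

```python
import numpy as np

# FS sketch: u[t] = lam * exp(i*omega) * u[t-1] + x[t].
# |H(e^{i*theta})| = 1 / |1 - lam * e^{i(omega - theta)}| is maximized at
# theta = omega, so omega plays the role of the neuron's target frequency.
def fs_filter(x, omega, lam=0.95):
    u = np.zeros(len(x), dtype=complex)
    pole = lam * np.exp(1j * omega)
    for t in range(1, len(x)):
        u[t] = pole * u[t - 1] + x[t]
    return u.real

# TS sketch: first-order all-pass y[t] = -a*x[t] + x[t-1] + a*y[t-1].
# Unit gain at every frequency; the coefficient a only shifts group delay,
# i.e. *when* frequency components contribute to the membrane accumulation.
def ts_allpass(x, a=0.3):
    y = np.zeros_like(x, dtype=float)
    for t in range(1, len(x)):
        y[t] = -a * x[t] + x[t - 1] + a * y[t - 1]
    return y
```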
The numerical optimization of continuous functions is a fundamental task in many scientific and engineering domains, ranging from mechanical design to training of artificial intelligence models. Among the most effective and widely used algorithms for this purpose is Differential Evolution (DE), known for its simplicity and strong performance. Recent research has shown that adapting AI models to operate over alternative number systems, such as complex numbers, quaternions, and geometric algebras, can improve model compactness and accuracy. However, such extensions remain underexplored in bio-inspired optimization algorithms. In particular, the use of quaternion algebra represents an emerging area in computational intelligence. This paper introduces a family of novel Quaternion-Valued Differential Evolution (QDE) algorithms that operate directly in the quaternion space. We propose several mutation strategies specifically designed to exploit the algebraic and geometric properties of quaternions. Results show that our QDE variants achieve faster convergence and superior performance on several function classes in the BBOB benchmark compared to the traditional real-valued DE algorithm.
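As a concrete illustration of what operating directly in quaternion space can mean, here is a minimal sketch of a DE/rand/1 mutation lifted to quaternion-valued individuals. The helper names (`q_mul`, `qde_mutate`) and the choice of a componentwise difference are mine, not the paper's proposed strategies; the Hamilton product is included because rotational variants would build on it.

```python
import numpy as np

def q_mul(a, b):
    """Hamilton product of quaternion arrays of shape (..., 4)."""
    w1, x1, y1, z1 = a[..., 0], a[..., 1], a[..., 2], a[..., 3]
    w2, x2, y2, z2 = b[..., 0], b[..., 1], b[..., 2], b[..., 3]
    return np.stack([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ], axis=-1)

def qde_mutate(pop, F=0.5, rng=np.random.default_rng()):
    """DE/rand/1 lifted to quaternion individuals of shape (dim, 4):
    v = a + F * (b - c). q_mul would enable rotational variants."""
    n = len(pop)
    i, j, k = rng.choice(n, size=3, replace=False)
    a, b, c = pop[i], pop[j], pop[k]
    return a + F * (b - c)
```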
Model merging has emerged as a cost-effective alternative to training large language models (LLMs) from scratch, enabling researchers to combine pre-trained models into more capable systems without full retraining. Evolutionary approaches to model merging have shown particular promise, automatically searching for optimal merging configurations across both parameter space (PS) and data flow space (DFS). However, the optimization challenges underlying these approaches -- particularly in DFS merging -- remain poorly understood and formally underspecified in the literature. This paper makes two contributions. First, we provide a structured survey of evolutionary model merging techniques, organizing them into three categories: parameter-space merging, data flow space merging, and hybrid approaches. Second, we formally characterize the DFS merging problem as a black-box optimization problem involving mixed binary-continuous variables, high-dimensional search spaces, and conditional dependencies between variable types -- challenges that standard optimization methods such as CMA-ES are not designed to handle. We provide preliminary empirical validation using real pre-trained language models, demonstrating that a structured approach respecting the binary-continuous conditional dependency outperforms an unstructured approach by 6.7% accuracy while reducing the effective search space by 51.4%. By connecting the model merging community with the broader evolutionary computation and black-box optimization literature, we identify concrete open problems and propose research directions to address them.
Spiking neural networks (SNNs) promise low-power event-driven computation for temporally rich tasks, but commonly used neuron models often trade off gradient-based trainability, dynamical richness, and high activity sparsity. These limitations are acute in regression, where approximation error, noise and spike discretization can severely degrade continuous-valued outputs. Indeed, many state-of-the-art (SOTA) SNNs rely on simple phenomenological dynamics trained with surrogate gradients and offer limited control over spiking diversity and sparsity. To overcome such limitations, we introduce multi-timescale conductance spiking networks, a gradient-trainable framework in which neural dynamics emerge from shaping the current-voltage (I-V) curve by tuning fast, slow and ultra-slow conductances. This parametrization allows systematic control over excitability, can be implemented efficiently in analog circuits, and yields rich firing regimes including tonic, phasic and bursting responses within a single model. We derive a discrete-time formulation of these differentiable dynamics, enabling direct backpropagation through time without surrogate-gradient approximations. To probe both trainability and accuracy, we evaluate feedforward networks of these neurons at the predictability limit of Mackey-Glass time-series regression and compare them to baseline LIF and SOTA AdLIF networks. Our model outperforms LIF and AdLIF networks, while exhibiting substantially sparser activity from both communication and computational perspectives. These results highlight multi-timescale conductance spiking neurons as a promising building block for energy-aware temporal processing and neuromorphic implementation.
Short-term plasticity (STP) is fundamental to temporal information processing in biological neural systems but remains difficult to realize efficiently in neuromorphic hardware. Memristive electrochemical random-access memory (ECRAM) devices naturally exhibit non-equilibrium ionic dynamics that produce transient conductance modulation; however, these behaviors are typically treated as undesirable variability or tolerated as side effects in memory-centric computing paradigms. In this work, we instead transform these volatile dynamics from a tolerated device artifact into a computational resource through a cross-layer device-circuit-system co-design framework. We introduce a delay-feedback leaky integrate-and-fire (LIF) neuron architecture co-designed with ECRAM synapses that exploits activity-dependent conductance modulation with negligible additional circuit overhead. The architecture integrates ECRAM-based synapses with a tunable delay-feedback spike-generation path, enabling transient device dynamics to directly modulate neuron excitability and synaptic efficacy. We used experimentally characterized ECRAM devices exhibiting transient conductance modulation (1.5 kΩ per spike) to develop a compact behavioral model suitable for circuit-level simulation. Circuit simulations demonstrate two key STP behaviors -- synaptic facilitation and intrinsic excitability modulation -- while consuming 2 pJ per spike, and the same device-driven mechanisms extend across multiple neuron topologies. Network-level analysis further demonstrates frequency-selective spike processing, allowing individual synapses to act as tunable temporal filters within spiking neural networks. This work demonstrates that non-equilibrium ECRAM dynamics can serve as a native hardware substrate for STP and temporal computation in neuromorphic circuits.
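A compact behavioral model of this kind can be sketched as a conductance that jumps on each spike and relaxes exponentially back to baseline; the functional form and the constants below are my assumptions for illustration, not the experimentally fitted model.

```python
import numpy as np

def ecram_conductance(spike_times, t_grid, g0=1e-4, dg=1e-6, tau=5e-3):
    """Conductance trace (S): baseline g0, additive jump dg per spike,
    exponential relaxation with time constant tau (volatile STP)."""
    g = np.full_like(t_grid, g0, dtype=float)
    for ts in spike_times:
        mask = t_grid >= ts
        g[mask] += dg * np.exp(-(t_grid[mask] - ts) / tau)
    return g

# A burst of presynaptic spikes produces facilitation: overlapping transients.
t = np.linspace(0.0, 0.05, 5000)
g = ecram_conductance([0.005, 0.007, 0.009], t)
```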
NSGA-III with crossover optimizes m-OJZJ asymptotically faster than without it for any number of objectives.
In recent years, theoretical understanding of how popular multi-objective evolutionary algorithms (MOEAs) optimize many-objective problems has advanced rapidly. However, the benefits of using crossover in many-objective optimization are theoretically understood only for benchmark functions specifically tuned to particular crossover operators, and theory still lags significantly behind practical use. In this paper, we build upon this line of research and present a theoretical runtime analysis of the widely used NSGA-III algorithm on the classical $m$-objective $m$-OneJumpZeroJump function ($m$-OJZJ for short). Our results demonstrate that NSGA-III with crossover optimizes $m$-OJZJ asymptotically faster than NSGA-III without crossover for any number $m$ of objectives, over wide parameter regimes. We complement our analysis by providing a lower runtime bound on $4$-OJZJ when crossover is turned off.
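For reference, the bi-objective OneJumpZeroJump benchmark underlying this analysis can be written directly from its standard definition; the block-wise lift to $m$ objectives below is my paraphrase of the usual construction and may differ in detail from the paper's.

```python
def ojzj(x, k):
    """Bi-objective OneJumpZeroJump on a bit list x: both objectives are
    Jump-like, one counting ones, the other counting zeros."""
    n, ones = len(x), sum(x)
    zeros = n - ones
    f1 = k + ones if (ones <= n - k or ones == n) else n - ones
    f2 = k + zeros if (zeros <= n - k or zeros == n) else n - zeros
    return f1, f2

def m_ojzj(x, k, m):
    """m-OJZJ (even m): apply the OJZJ pair to m/2 disjoint blocks of x."""
    block = len(x) // (m // 2)
    vals = []
    for i in range(m // 2):
        vals.extend(ojzj(x[i * block:(i + 1) * block], k))
    return tuple(vals)
```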
Spiking Neural Networks (SNNs) can reduce energy consumption compared to conventional Artificial Neural Networks (ANNs) when spiking activity is sparse and the neuron model is hardware-friendly. However, biologically faithful models are often too costly to implement on FPGAs, whereas very simple models (e.g., IR/LIF) sacrifice part of the neuronal dynamics. In this work, we present an FPGA accelerator for an SNN using Spiking Recurrent Cell (SRC) neurons, providing a trade-off between biological plausibility and hardware cost. We propose a set of mathematical simplifications that remove costly unary operators (tanh, exp) and avoid floating-point arithmetic through scaling and piecewise-defined approximations. The complete network is implemented in VHDL and validated using spiking traces derived from the MNIST dataset. The weight matrices, computed off-line, are stored directly in LUT-registers without any adaptation, demonstrating the robustness of SRC cells. Experiments were conducted on an Artix-7 XC7A200T clocked at 100 MHz. The reference implementation achieves 96.31% accuracy with a 220-image spiking trace and a processing time of 1.7424 ms per digit. We then investigate accuracy/energy trade-offs by reducing the spiking trace length and quantizing synaptic weights down to 4 bits, achieving 93.32% accuracy at 0.55 mJ per digit (55 images, 5-bit weights) and 92.89% at 0.45 mJ (44 images, 4-bit weights). These results show that SRC-based SNNs can deliver competitive performance with reduced energy consumption, while preserving richer neuronal dynamics than standard LIF/IR models.
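To make the flavor of such simplifications concrete, here is a hedged sketch of a piecewise-linear tanh evaluated in Q8 fixed-point integer arithmetic, the kind of transformation that removes both the transcendental unit and floating point; the breakpoints and slopes are illustrative, not the exact approximations used in the accelerator.

```python
SCALE = 1 << 8  # Q8 fixed point: 256 represents 1.0

def tanh_pwl_q8(x_q8: int) -> int:
    """Piecewise-linear tanh on Q8 integers; shifts and adds only."""
    x = max(-4 * SCALE, min(4 * SCALE, x_q8))      # saturate |x| <= 4.0
    ax, sign = abs(x), (1 if x >= 0 else -1)
    if ax < SCALE:                                  # |x| < 1: slope ~1
        y = ax
    elif ax < 2 * SCALE:                            # 1 <= |x| < 2: slope ~1/4
        y = SCALE * 3 // 4 + (ax - SCALE) // 4
    else:                                           # |x| >= 2: saturated
        y = SCALE
    return sign * min(y, SCALE)
```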
We propose a game-theoretic framework for adaptive multi-agent intelligent systems. Unlike classical game theory, which often treats strategies as primitive objects chosen by perfectly rational agents, the proposed framework provides a mathematical foundation for studying equilibrium in NeuroAI and can be viewed as an extension of game theory under relaxed assumptions, including partial observability, bounded computation, and uncertainty. At its core, Multilevel Interactive Equilibrium (MIE) generalizes the classical Nash equilibrium to intelligent systems with internal computation. Rather than being defined solely at the level of observable behavior, equilibrium emerges when neural learning dynamics, cognitive representations, and behavioral strategies mutually stabilize between interacting agents. This framework applies uniformly to interactions between two biological brains, two artificial agents, or hybrid human-AI systems. We discuss applications of multilevel game theory to human-autonomous vehicle driving, human-machine interaction, human-large language model (LLM) interaction, and computational psychiatry. We also outline experimental strategies and computational methods for estimating MIE and discuss challenges and prospects for future research.
Existing Meta-Black-Box Optimization (MetaBBO) methods focus on how to search when controlling optimizers, but largely overlook where to search. We propose MetaSG-SAEA, a bi-level MetaBBO framework for expensive constrained multi-objective optimization problems (ECMOPs), in which a meta-policy provides search guidance to the low-level Surrogate-Assisted Evolutionary Algorithm (SAEA). To achieve this, we introduce Max-Min Constraint-Calibrated Inequality (MM-CCI), a compact, problem-agnostic region abstraction that maps heterogeneous constraint evaluations to an ordered scalar level; we further provide a theoretical analysis of its fundamental properties. Building on this region abstraction, we adopt diffusion-based population initialization to translate the meta-policy's region-level guidance into solution-level priors for the SAEA. To make MetaSG-SAEA scalable, we construct an attention-based state representation across varying problem dimensions, population sizes, and numbers of objectives and constraints. Experimental results demonstrate that MetaSG-SAEA outperforms state-of-the-art baselines across diverse benchmarks and exhibits the ability to generalize across problem distributions.
Millimeter-wave (mmWave) sensing enables privacy-preserving, always-on edge perception, but its measurements are often sparse, temporally irregular, and corrupted by high-frequency noise. Existing mmWave pipelines predominantly rely on artificial neural networks (ANNs), which achieve robustness through extensive preprocessing or deep architectures, thereby limiting their efficiency on edge devices. In this work, we study spiking neural networks (SNNs) for mmWave sensing from a mechanism-data alignment perspective. By leveraging the low-pass filtering behavior of leaky integrate-and-fire (LIF) dynamics, we analyze how their implicit temporal filtering interacts with the frequency structure of mmWave signals. Our analysis shows that when discriminative information resides in low-to-mid frequencies, LIF dynamics can inherently suppress high-frequency noise, clarifying when and why SNNs outperform ANNs. Based on this insight, we derive a principled criterion for configuring the membrane decay factor by matching the effective bandwidth of LIF dynamics to the data's discriminative spectral content. Experimental results across four widely used mmWave datasets validate the proposed frequency-matching hypothesis, yielding an average test-accuracy improvement of 6.22% and a 3.64$\times$ reduction in theoretical energy consumption relative to ANN baselines, under a unified evaluation protocol.
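The configuration rule can be sketched by treating the membrane update $v[t] = \beta v[t-1] + (1-\beta)x[t]$ as a first-order low-pass filter and choosing $\beta$ so its continuous-time equivalent cutoff sits at the band edge of the data's discriminative spectral content; the paper's exact matching criterion may differ, so this is only a plausible reading.

```python
import numpy as np

def beta_for_cutoff(f_cut_hz, dt):
    """Decay factor whose equivalent first-order low-pass has its -3 dB
    cutoff at f_cut_hz: tau = 1/(2*pi*f_c), beta = exp(-dt/tau)."""
    tau = 1.0 / (2.0 * np.pi * f_cut_hz)
    return float(np.exp(-dt / tau))

# e.g. keep discriminative energy below ~50 Hz at a 1 ms simulation step
beta = beta_for_cutoff(f_cut_hz=50.0, dt=1e-3)   # ~0.73
```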
Large Language Models exhibit mode collapse, producing homogeneous outputs that fail to explore valid solution spaces. We present QD-LLM, a parameter-efficient neuroevolution framework that evolves prompt embeddings, compact neural interfaces (~32K parameters) that steer generation in frozen LLMs (70B+ parameters), within a Quality-Diversity (QD) optimization loop. Our contributions: (1) evolved prompt embeddings via gradient-free optimization enabling behavioral steering without model fine-tuning; (2) hybrid behavior characterization combining semantic and explicit features with formal coverage bounds (Theorem 1) under validated near-independence (NMI $= 0.08 \pm 0.02$); (3) co-evolutionary variation operators including targeted behavioral mutation via finite-difference gradient estimation. On HumanEval (164 problems), MBPP, and creative writing benchmarks, QD-LLM achieves 46.4% higher coverage and 41.4% higher QD-Score than QDAIF ($p<0.001$, 30 runs, Vargha-Delaney $A=0.94$). We demonstrate downstream utility: diverse archives improve test generation (34% more edge cases) and fine-tuning data quality (8.3% accuracy gain). We validate across open-source LLMs (Llama-3-70B, Mistral-Large) with full embedding access, establishing prompt embedding evolution as an effective paradigm bridging neuroevolution and modern LLMs.
Population methods raise preference coverage 18 percent and cut collapse 47 percent versus gradient descent at matched quality.
Gradient-based preference optimization methods for large language model (LLM) alignment suffer from preference collapse, converging to narrow behavioral modes while neglecting preference diversity. We introduce EvoPref, a multi-objective evolutionary algorithm that maintains populations of Low-Rank Adaptation (LoRA) adapters optimized across helpfulness, harmlessness, and honesty objectives using Non-dominated Sorting Genetic Algorithm II (NSGA-II) selection with archive-based diversity preservation.
Our primary contribution is demonstrating that population-based methods discover substantially more diverse alignments than gradient descent. On standard benchmarks, EvoPref improves preference coverage by 18% (median 82.5% vs. 70.0% for ORPO, $p<0.001$, Wilcoxon, $n=30$) and reduces collapse rates by 47% (11.0% vs. 20.6%, $p<0.001$), while achieving competitive alignment quality (median 75.5% RewardBench vs. 75.0% for ORPO, $p<0.05$). We provide theoretical motivation extending recent multi-objective evolutionary algorithm (MOEA) runtime analysis (Dang et al., 2025) suggesting why archive-based methods escape collapse more effectively than single-trajectory optimization.
Comprehensive comparisons against MOEA/D, SMS-EMOA, CMA-ES, and gradient baselines (DPO, IPO, KTO, ORPO) with rigorous statistical testing (Friedman with Holm correction, Vargha-Delaney effect sizes, median with IQR) confirm that multi-objective selection with diversity preservation is essential. This work establishes evolutionary optimization as a principled paradigm for diverse LLM alignment.
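A minimal sketch of the NSGA-II-style selection referenced above, assuming each LoRA adapter has already been scored on the three maximized objectives (the score matrix `F` stands in for reward-model evaluation); crowding-distance tie-breaking and the diversity archive are omitted for brevity.

```python
import numpy as np

def dominates(a, b):
    """Pareto dominance for maximized objective vectors."""
    return np.all(a >= b) and np.any(a > b)

def nondominated_sort(F):
    """Partition indices into Pareto fronts (simple O(n^2)-per-front version)."""
    fronts, remaining = [], list(range(len(F)))
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(F[j], F[i]) for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def select(adapters, F, mu):
    """Environmental selection: keep mu adapters, filling front by front."""
    keep = []
    for front in nondominated_sort(F):
        for i in front:
            if len(keep) < mu:
                keep.append(adapters[i])
    return keep
```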
Spike-based encodings are sparse and energy-efficient, but have largely been formulated probabilistically, disconnected from most signal processing literature. We recast spike encoders as time-causal wavelet frames with quantitative bandwidths and reconstruction error bounds. The proposed wavelets preserve the sparsity and locality of spiking representations, with reconstruction up to spike quantization and time discretization. We demonstrate reconstruction on ECG and audio datasets, achieving a normalized RMSE comparable to continuous wavelet transforms. The spiking wavelets map directly to neuromorphic hardware.
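A toy version of spike coding against a time-causal kernel, built from scratch here to make the idea tangible (this is my own construction, not the paper's frames): filter the signal with a causal band-pass kernel, quantize the coefficient to multiples of a step, and emit signed spikes on each level crossing. Decoding below recovers the coefficient track up to quantization; full signal reconstruction would additionally require the dual frame.

```python
import numpy as np

def causal_kernel(fs, f0, cycles=4):
    """Causal, exponentially damped oscillation at centre frequency f0 (Hz)."""
    t = np.arange(0, cycles / f0, 1 / fs)
    return np.cos(2 * np.pi * f0 * t) * np.exp(-t * f0 / cycles)

def encode(signal, kernel, delta):
    """Signed spike counts: increments of the quantized filter coefficient."""
    coef = np.convolve(signal, kernel)[: len(signal)]   # causal filtering
    levels = np.round(coef / delta).astype(int)
    return np.diff(levels, prepend=0)

def decode(spikes, delta):
    """Coefficient track recovered from spikes, up to the quantization step."""
    return delta * np.cumsum(spikes)
```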
LLM-guided evolutionary methods such as AlphaEvolve have proven effective in domains like math, systems research, and algorithmic discovery, but their reliance on frontier models makes each run expensive. We argue this is largely an artifact of how existing frameworks allocate search: archives that fail to preserve solution diversity force compensation through stronger mutation models; blind model use spends frontier dollars on local edits a smaller model could handle; and full-set evaluation wastes rollouts on redundant examples. We introduce LEVI, a harness-first evolutionary framework built on the bet that stronger search architectures can substitute for or even outperform larger LLMs in evolutionary search. LEVI improves on three core components of evolutionary search: a solution database that establishes diversity from the beginning, and then maintains it throughout the run; a smarter mutation router that plays into the strengths of large and small LLMs; and a rank-preserving proxy benchmark for rollout-heavy settings. Across systems-research benchmarks LEVI attains the highest score on a budget 3.3-6.7x smaller than the published frontier-model runs of existing frameworks like ShinkaEvolve, GEPA, and AdaEvolve; on one problem, LEVI matches the existing best at a 35x lower cost. On prompt optimization, LEVI matches or exceeds GEPA at less than half of its rollout budget on four different benchmarks. LEVI is available as an open-source framework at https://github.com/ttanv/levi.
Cauchy sampling, feasible archive, and stagnation kick after 180 generations match quality of prior methods while speeding convergence on 30
We extend RDEx-CSOP with three changes that target stagnation and late-stage variance, plus minor parameter tuning. The second scale factor in the standard branch is sampled independently from a truncated Cauchy distribution. A small feasible-only JADE-style archive (|A|_max = 50) is added and sampled with probability |A|/(|A|+|P|). A per-individual stagnation counter triggers, after 180 no-improvement generations, three local overrides on the standard branch: a pull toward the global best, a lift of the archive sampling floor to 0.65, and saturation of CR to 0.95 when the population success rate is below 0.10. The exploitation-biased branch and every other RDEx component are left untouched. On the CEC CSOP suite (D=30, 25 runs), RDEx-CASK is competitive with RDEx, UDE-III, and CL-SRDE in feasibility-aware quality and improves time-to-target on most problems.
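The three mechanisms stated in the abstract admit a direct sketch; RDEx internals are not reproduced here, and the helper names are mine.

```python
import numpy as np

rng = np.random.default_rng()

def truncated_cauchy(loc=0.5, scale=0.1, lo=0.0, hi=1.0):
    """Second scale factor: Cauchy draw, resampled until it lands in (lo, hi]."""
    while True:
        f = loc + scale * np.tan(np.pi * (rng.random() - 0.5))
        if lo < f <= hi:
            return f

def pick_base(population, archive):
    """Sample the feasible-only archive with probability |A|/(|A|+|P|)."""
    p_arch = len(archive) / (len(archive) + len(population))
    pool = archive if (archive and rng.random() < p_arch) else population
    return pool[rng.integers(len(pool))]

def stagnation_overrides(no_improve_gens, success_rate):
    """After 180 no-improvement generations: best-pull on, archive floor
    lifted to 0.65, CR saturated to 0.95 if success rate < 0.10."""
    if no_improve_gens < 180:
        return {}
    ov = {"pull_to_best": True, "archive_floor": 0.65}
    if success_rate < 0.10:
        ov["cr"] = 0.95
    return ov
```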
Reinforcement learning (RL) has enabled robust quadruped locomotion over complex terrain, but most learned controllers are trained offline with backpropagation in massively parallel simulation and deployed as fixed policies, limiting adaptation to terrain variation, payload changes, actuator wear, and other real-world conditions under onboard power constraints. Local learning provides a potential path toward energy-aware on-robot adaptation by replacing global backpropagation graphs with updates driven by local neural states, making the learning rule more compatible with neuromorphic and in-memory computing substrates. This work proposes an equilibrium-propagation (EP)-based proximal policy optimization (PPO) framework for uneven-terrain quadruped locomotion. The controller combines a bio-inspired central pattern generator (CPG) policy with a residual postural adjustment policy, while replacing conventional backpropagation-trained policy and value networks with EP-enabled local learning. To train stochastic continuous-control policies with EP, we derive an EP-compatible PPO output-nudging signal and introduce a two-sided ratio clipping mechanism that stabilizes policy updates during relaxation. Experiments on a 12-DoF A1 quadruped show that the proposed controller achieves stable policy convergence in a two-stage uneven terrain locomotion task. Its locomotion performance is comparable to a backpropagation-trained PPO baseline in success rate, velocity tracking, actuator power, and body stability, while improving GPU memory efficiency by 4.3\(\times\) compared with backpropagation through time (BPTT). These results suggest that local equilibrium-based learning can support high-dimensional embodied locomotion and provide an algorithmic foundation for low-power on-robot adaptation and fine-tuning.
We introduce Evolutionary Ensemble (EvE), a decentralized framework that organizes existing, highly capable coding agents into a live, co-evolving system for algorithmic discovery. Rather than reinventing the wheel within the "LLMs as optimizers" paradigm, EvE fixes the base agent substrate and focuses entirely on evolving the cumulative guidance and skills that dictate agent behaviors. By maintaining two co-evolving populations, namely functional code solvers and agent guidance states, the system evaluates agents through a synchronous race, updating their empirical Elo ratings based on the marginal gains they contribute to the current solver state. When applied to a research bottleneck in In-Context Operator Networks (ICON), EvE autonomously discovered a robust rescale-then-interpolate mechanism that enables reliable example-count generalization. Crucially, controlled ablations reveal the absolute necessity of stage-dependent agent adaptation to navigate the shifting search landscapes of complex codebases. Compared to variants driven by a fixed initial agent or even a frozen "best-evolved" agent, EvE uniquely avoids phase mismatch, demonstrating that organizing agents into a self-revising ensemble is the fundamental driver for breaking through static performance ceilings.
This paper proposes Drain-Vortex Optimization (DVO), a population-based metaheuristic for continuous optimization. DVO models each candidate solution as a particle moving in a multi-drain vortex field. Its update rule decomposes motion into radial attraction toward selected drain centres and tangential rotation governed by a regularized free-vortex law. A three-phase mechanism switches between far-field exploration, spiral inward motion, and localized core exploitation according to the normalized distance to the assigned drain. The method also uses adaptive spiral exploitation, population-level vortex basin assignment, and optional stochastic basin switching to support structured diversity. DVO is evaluated against PSO, GWO, WOA, SCA, AOA, EO, and SVOA using a calibration--validation protocol. CEC 2022 is used only to select the final DVO configuration, while CEC 2017, classical functions, and five constrained engineering design problems are used for out-of-sample validation. On CEC 2017, DVO achieves the best mean $\log_{10}$ error on 34 of 58 cases and the best Friedman average rank (1.67), and is significantly better than every baseline under Holm-corrected Wilcoxon tests. On CEC 2022, DVO obtains the best Friedman rank (2.13) and is significantly better than five of the seven baselines; the differences against PSO and SVOA are not significant. DVO is less competitive on simple scalable classical functions and on small constrained engineering designs, which clarifies its operating regime. The algorithm is implemented in a vectorized GPU form that executes independent runs in parallel.
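The motion decomposition described above can be sketched as a radial pull toward the assigned drain plus a tangential component whose speed follows a regularized free-vortex law $v_t \sim \Gamma/(r+\varepsilon)$; the step-size constants and the random orthogonalized tangent direction are my choices for illustration, not the published update rule.

```python
import numpy as np

rng = np.random.default_rng()

def dvo_step(x, drain, gamma=1.0, eps=1e-6, alpha=0.3, dt=0.1):
    """One particle update: radial attraction plus regularized vortex rotation."""
    radial = drain - x
    r = np.linalg.norm(radial) + eps
    u_r = radial / r                              # unit radial direction
    v = rng.standard_normal(x.shape)              # random direction ...
    u_t = v - (v @ u_r) * u_r                     # ... made orthogonal (tangent)
    u_t /= np.linalg.norm(u_t) + eps
    v_tan = gamma / (r + eps)                     # regularized free-vortex speed
    return x + dt * (alpha * r * u_r + v_tan * u_t)
```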
Spiking Neural Networks (SNNs) have been proposed as biologically plausible and energy-efficient alternatives to conventional Artificial Neural Networks (ANNs). However, the training of SNNs usually relies on surrogate gradients due to the non-differentiability of the spike function, introducing approximation errors that accumulate across layers. To address this challenge, we extend the work on convexification of parallel feedforward threshold networks to parallel recurrent threshold networks, which subsume parallel SNNs as a structured special case. Building on this theoretical framework, we propose a parameter reconstruction algorithm for SNN training that demonstrates consistent and significant advantages across various tasks, both as a standalone method and in combination with surrogate-gradient training. The ablations further demonstrate the data scalability and robustness to model configurations of our training algorithm, pointing toward its potential in large-scale SNN training.
Distributed computational substrates rely on two elementary operations: bundling, the act of populating a shared physical medium with independently retrievable components, and binding, the act of composing components into outputs whose identity depends on their relations. We study these two primitives on the simplest closed substrate carrying a continuous symmetry, a cycle graph of N nodes, in two parameter regimes of a single master equation of motion. The linear regime sorts a temporal input across the substrate's U(1)-organised eigenmodes, providing a feature representation that matches a windowed-FFT baseline at high signal-to-noise ratio and modestly outperforms it for transient signals at low SNR. The Duffing regime activates a cubic mode-mixing operation constrained by the substrate's symmetry into a sparse selection rule on integer wavenumbers, generating shape-dependent harmonic content that the linear regime cannot produce. We identify a single-number observable, $\phi_0$, that summarises the bound representation's response to input shape, and we analyse its symmetry structure: a $\pi$-periodicity in the shape parameter is exact, while a time-reversal symmetry that would render $\phi_0$ degenerate is broken by the substrate's dissipation. The asymmetric status of these two symmetries is what licenses $\phi_0$ as a meaningful single-number observable; its trajectory across the quotient domain encodes the joint response of binding and dissipation to the input shape. Numerical experiments confirm that $\phi_0$ retains its information content under additive band-limited noise, with seed-averaged means staying clearly above the symmetric-attractor value down to 0 dB input SNR. The framework is developed on synthetic signals only; extensions to richer substrates, more elaborate drives, and real biological signals are open questions for the work that follows.
Spiking Neural Networks (SNNs) have gained increasing attention due to their potential for low-power computation on neuromorphic hardware. A widely adopted training strategy for SNNs is direct coding, which enables backpropagation on neuron implementations using continuous-valued surrogate activations. However, recent studies have shown that direct-coded SNNs remain substantially less energy-efficient than their event-based counterparts, limiting their practical deployment in energy-sensitive scenarios. Still, promoting the reuse of the existing base of direct-coded pretrained SNNs motivates an important yet underexplored question: how can an SNN pretrained with direct coding be effectively converted into an event-based representation? In this research, we present the first systematic investigation into this transfer problem, analyze the key challenges that arise when transitioning from direct-coded to event-based computation, and propose a set of methods to enable energy-efficient transfer while preserving model performance.
A hallmark of life on Earth is the ability of agents to exert causal power and be drivers of subsequent events. This is key to cognition at all scales. Causal emergence, measuring the degree to which an agent exerts unique predictive power on its future, is one consequence of causal power. Indeed, recent discoveries have shown that biological agents, even minimal ones, increase their causal emergence after learning new memories. However, there is a major knowledge gap regarding how causally emergent artificial agents are. We focused on Reinforcement Learning (RL) of neural-network agents across an array of environmental conditions, encompassing different algorithms, agent architectures, and six environments arranged on a complexity spectrum. For consistency, we computed the causal emergence of their latent-space representations over their lifetimes. We used the recently proposed $\Phi$ID to estimate causal emergence and tested how it related to learning performance. Our results suggested a Causally Emergent Alignment Hypothesis: successful agents exhibited causal emergence that was consistently predictive of final reward early in training and whose representational dynamics aligned with reward improvement in most tasks. This idea suggests that causal emergence may be a previously unrecognized axis of reorganization of neural representations in RL agents, with the potential to establish causal relationships and interventions that will lead to better RL agents. Our work also highlights the alignment between causal emergence and learning as another way biological and artificial creatures compare.
Many real-world optimization problems consist of multiple tightly coupled subproblems whose solutions must be coordinated to achieve high overall performance. However, existing large-language-model-driven automated heuristic design approaches are limited to single-problem settings. In this paper, we propose CoupleEvo, which introduces three evolutionary coordination strategies to evolve heuristics for coupled optimization problems: the sequential strategy evolves heuristics for one subproblem after the other; the iterative strategy alternates the evolution of heuristics for different subproblems over successive generations; and the integrated strategy evolves heuristics for all subproblems simultaneously. The approach is evaluated on two representative coupled optimization problems. Experimental results show that the decomposition-based strategies (sequential and iterative) provide more stable convergence and higher solution quality, while the integrated strategy suffers from increased search complexity and variability. These findings highlight the importance of coordinating evolutionary search across interdependent subproblems and demonstrate the potential of LLM-driven heuristic design for complex coupled optimization problems. The code is available at https://github.com/tb-git-kit-research/CoupleEvo.
High-capacity associative memory models, such as Kernel Logistic Regression (KLR) Hopfield networks, have demonstrated strong storage capabilities but typically rely on computationally expensive synchronous updates. This reliance poses a bottleneck for deployment on energy-efficient, event-driven neuromorphic hardware. In this paper, we investigate the asynchronous retrieval dynamics of KLR Hopfield networks. We show empirically that, under appropriately tuned kernel parameters, asynchronous sequential updates exhibit trajectories that are statistically indistinguishable from those of synchronous dynamics, while maintaining high recall accuracy within the tested regime for random patterns. Furthermore, we find that the asynchronous network achieves empirical storage capacities approaching $P/N \approx 30$ in static random pattern regimes, exceeding classical limits. To evaluate computational efficiency, we analyze the total number of state transitions (bit flips) required for error correction. The results show that the network converges using a number of events close to the initial Hamming distance from the target pattern, without observable spurious oscillations. These findings suggest that the large-margin attractors induced by KLR learning create a smooth energy landscape suited for sparse, event-driven computation, providing a basis for scalable and low-power associative memory on neuromorphic architectures.
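Asynchronous retrieval in such a network can be sketched as follows, assuming each unit owns a trained kernel logistic model with coefficients over the stored patterns (training of `alpha` is omitted, and the RBF kernel is my assumption); states are updated one unit at a time in random sequential order, and the bit-flip count per sweep is the event measure the paper analyzes.

```python
import numpy as np

def rbf(s, X, gamma):
    """RBF kernel between state s (length N) and stored patterns X (P x N)."""
    return np.exp(-gamma * np.sum((X - s) ** 2, axis=1))

def async_retrieve(s, X, alpha, gamma, sweeps=50, rng=np.random.default_rng()):
    """Asynchronous sequential dynamics; alpha has shape (N, P)."""
    s = s.copy()
    for _ in range(sweeps):
        flips = 0
        for i in rng.permutation(len(s)):          # random sequential order
            logit = alpha[i] @ rbf(s, X, gamma)    # unit i's KLR score
            new = 1 if logit > 0 else -1
            flips += int(new != s[i])
            s[i] = new                              # immediate (async) update
        if flips == 0:
            break                                   # fixed point reached
    return s
```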
Evolutionary computation has long promised to deliver both high-performance optimization tools and rigorous scientific simulations of Darwinian evolution. However, modern algorithms frequently abandon evolutionary fidelity for physics-inspired heuristics or superficial biological metaphors. This paper derives a suite of advanced gradient-based optimization algorithms directly from evolutionary first principles. We introduce Darwinian Lineage Simulations (DLS) to prove that, in an asexual context, Fisher's and Wright's historically opposed views of evolution are formally equivalent: one can partition Fisher's deterministically evolving total population into Wright's randomly drifting sub-populations. We prove that proper bookkeeping requires introducing a specific kind of structured noise (the DLS noise relation). Crucially, any bookkeeping choices which satisfy this relation will yield a faithful simulation of evolution. Using this vast representational freedom, we prove that a broad family of battle-tested optimization algorithms are already perfectly compatible with evolutionary dynamics. These include Stochastic Gradient Descent as well as many regularizations and approximations of Newton's method and Natural Gradient Descent. By simply adding DLS noise (i.e., evolutionarily faithful genetic drift), these algorithms become scientifically valid in silico simulations of Darwinian evolution. Finally, we demonstrate that even the state-of-the-art Adam optimizer can be brought into evolutionary compliance through a minor mathematical surgery.
Per-instance algorithm selection (PIAS) takes advantage of complementarity between a set of algorithms by deciding which algorithm to run on a given instance. This decision is based on features of the instances, which, in the context of black-box optimization (BBO), require a part of the optimization budget to be computed. This raises two questions: (a) from which fraction of the budget spent on feature computation does PIAS become worthwhile for BBO, and (b) which fraction of the budget optimizes the tradeoff between feature accuracy and PIAS performance. To this end, we perform a broad study in which PIAS with varying sampling budgets for feature computation is compared to the single best algorithm on a broad range of algorithm selection scenarios. These scenarios comprise two portfolio sizes, three problem sets, four dimensionalities, and ten target budgets. We find that PIAS is viable for the majority of tested scenarios, even when as much as a quarter of the total budget is spent on feature computation. The fraction of the budget spent on feature computation that maximizes the benefit of PIAS is highly dependent on the specific algorithm selection scenario. Further, on average 20 percent of PIAS loss relative to the virtual best solver is explained by the budget spent on feature computation, highlighting the importance of properly accounting for the feature budget.
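The protocol under study reduces to a simple budget split; the object interfaces below (`problem.sample`, `selector.predict`, `algo.optimize`, `compute_features`) are placeholders for illustration rather than any specific library's API.

```python
def run_pias(problem, budget, p, compute_features, selector, portfolio):
    """Spend a fraction p of the evaluation budget on feature sampling,
    pick an algorithm from the portfolio, run it on the remainder."""
    feat_budget = int(p * budget)
    samples = problem.sample(feat_budget)      # evaluations spent on features
    features = compute_features(samples)       # e.g. landscape features
    algo = portfolio[selector.predict(features)]
    return algo.optimize(problem, budget - feat_budget)
```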
Understanding the neural mechanisms underlying visual computation has long been a central challenge in neuroscience. Recent alignment-based approaches have improved the accuracy of decoding visual stimuli from brain activity, yet they provide limited insight into the neural computations that give rise to these improvements. To address this gap, we propose Dual-Tower Image-Neural Alignment (DINA), an interpretable contrastive framework for analyzing population-level visual computations in primary visual cortex (V1). DINA jointly trains a biologically motivated dual-tower architecture that aligns visual stimuli and corresponding V1 population responses in a shared latent space at the level of intermediate feature maps, enabling both accurate decoding and direct access to interpretable feature maps. Evaluated on large-scale two-photon calcium imaging data from mouse V1, DINA achieves accurate neural-based decoding while revealing that decoding performance is primarily supported by coarse, low-level visual structure, rather than semantic category information or fine-grained details. Further analysis reveals that alignable feature maps emerge from multiple spatially distributed image regions, capturing both shape and texture cues, and are predominantly reconstructed by sparse subsets of strongly responsive neurons and their functional interactions. Together, these results confirm that, beyond enabling accurate decoding, DINA provides a principled framework for probing the computational mechanisms underlying visual processing in V1.
Understanding how biological and artificial neural networks implement computation from connectivity is a central problem in neuroscience and machine learning. In neural systems, structural and functional connectivity are known to diverge, motivating approaches that move beyond direct connections alone. Here, we show that the spatial and temporal function of recurrent neural networks (RNNs) trained on hierarchically modular tasks can be recovered by modelling the network as a graph and analysing the multi-hop pathways between input and output units. In particular, decomposing these pathways by hop length reveals how the network temporally routes information. This perspective reframes regularisation: if function is implemented through multi-hop communication, then standard penalties such as L1 regularisation, which act only on individual weights, constrain single-hop structure rather than the multi-hop pathways that support computation. Motivated by this view, we introduce resolvent-RNNs (R-RNNs), which constrain multi-hop pathways and thereby induce temporal sparsity beyond that achieved by standard L1 regularisation. Compared with L1 regularisation, R-RNNs achieve improved performance by inducing temporal sparsity that matches the task structure, even when the task signal is sparse. Moreover, R-RNNs exhibit stronger sparsity-function alignment, reflected in their increased robustness under strong regularisation. Together, our results identify multi-hop communication as a key principle linking structure to function in recurrent networks, and suggest that sparsity should be defined over functional pathways rather than individual parameters.
Recurrent networks that store position, phase, or other continuous variables need state-space directions that remain neutral over long horizons. We give a symmetry-based account of when such neutral directions are guaranteed rather than merely tuned. For a finite-dimensional autonomous \(C^1\) vector field equivariant under a Lie group \(G\), we prove that any compact invariant set carrying a uniformly nondegenerate group-orbit bundle with stabilizer type \(H\) has, at points where the Lyapunov spectrum is defined, at least \(\dim(G/H)\) zero Lyapunov exponents tangent to the group orbit. These symmetry-protected modes have zero group-tangent growth because of exact equivariance and orbit geometry. When this protection is explicitly broken, the formerly protected direction can acquire a pseudo-gap; in our controlled breaking experiments this pseudo-gap predicts finite memory lifetime. We verify the finite-dimensional consequences with normalized equivariance error, direct group-tangent exponents, principal-angle alignment, autonomous-flow-zero controls, and orbit-dimension scaling across \(S^1\), \(T^q\), \(SO(n)\), \(U(m)\), product-group, and coupled equivariant RNN-style systems. We also train an exactly equivariant recurrent cell on velocity-input \(S^1\) path integration across six seeds and compare it with matched GRU, LSTM, and orthogonal-RNN baselines. The learned equivariant cell preserves step equivariance to \(3.2\times10^{-8}\), has a near-zero group-tangent exponent under the zero-input autonomous restriction, and improves horizon, speed, and restricted-phase generalization in this matched protocol. The learned task results are consequence evidence; the theorem-level evidence remains exact equivariance, group-tangent exponents, orbit-dimension scaling, and tangent-subspace alignment.
The popular 2009 voxel-based videogame Minecraft contains several distinct disciplines, one of which is "parkour": gameplay that focuses on traversing a world's environment with maximum efficiency. The Minecraft online community has turned the game's physics engine into dynamic puzzles that require players to masterfully manipulate motion mechanics through frame-precise timing of keystrokes. Actions such as sprinting, sneaking, and mouse direction are combined to clear specific difficult jumps. In this project, we design a genetic algorithm that generates weights for a neural network, which autonomously evaluates inputs for block distances, terrain, and obstacles to determine optimal pathing.
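A bare-bones version of this neuroevolution setup might look like the following, with `fitness` standing in for running the controller on a parkour course and scoring its progress; layer sizes and GA hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng()

def policy(weights, obs, n_in=8, hidden=16, n_out=5):
    """Tiny two-layer controller: obs -> action scores (keys/mouse)."""
    W1 = weights[: n_in * hidden].reshape(n_in, hidden)
    W2 = weights[n_in * hidden:].reshape(hidden, n_out)
    return np.tanh(np.tanh(obs @ W1) @ W2)

def evolve(fitness, n_weights, pop=64, gens=100, sigma=0.1, elite=8):
    """Truncation-selection GA over flat weight vectors with Gaussian mutation."""
    P = rng.standard_normal((pop, n_weights))
    for _ in range(gens):
        scores = np.array([fitness(w) for w in P])
        parents = P[np.argsort(scores)[-elite:]]            # keep the best
        children = parents[rng.integers(elite, size=pop)]   # resample parents
        P = children + sigma * rng.standard_normal(children.shape)
    return P[np.argmax([fitness(w) for w in P])]
```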
Dynamic Vision Sensors (DVS) exhibit exceptional dynamic range and low power consumption, making them ideal for edge applications in the Internet of Video Things (IoVT). However, their output is often degraded by spurious Background Activity (BA) noise, leading to unnecessary computational overhead. This paper proposes SNNF, a near-sensor BA noise filter that integrates a compact Event-Based Binary Image (EBBI) representation, a parallel memory architecture, and a single-layer Spiking Neural Network (SNN) classifier. Trained on representative DVS data, the SNN distinguishes signal events from noise with an AUC of 0.89 on standard datasets. The binary-array-based EBBI eliminates timestamp dependency, significantly reducing memory footprint. Moreover, the SNN's spike-based computation replaces power-hungry multipliers with simple accumulation logic and minimizes inter-neuron data width, resulting in an extremely hardware-efficient design. FPGA implementation results show that SNNF reduces memory and logic resources to approximately 11% and 40% of state-of-the-art filters, respectively, while achieving a throughput of 29 Mega events per second (Meps). In a 65 nm CMOS ASIC implementation, SNNF achieves 44.4 Meps with an area and power consumption of only ~13% and <5% of the corresponding ANN-based designs. These results demonstrate that SNNF provides an excellent balance between filtering accuracy and hardware efficiency, making it highly suitable for resource-constrained, near-sensor deployment.
Spiking neural networks (SNNs) are promising for edge sensing due to their event-driven computation and temporal filtering capability. However, standard leaky integrate-and-fire (LIF) neurons communicate only through binary spikes, which severely limit representational capacity. Existing multi-level spiking neurons improve information transmission, but often rely on uniform quantization that mismatches membrane-potential distributions or introduces costly synaptic multiplications. In this paper, we propose ShiftLIF, a multi-level spiking neuron that maps membrane potentials to a logarithmically spaced power-of-two spike set. This design provides finer representation in the small-amplitude regime, where membrane potentials are densely concentrated, while enabling multiplier-free synaptic computation through bit-shift and accumulation operations. As a result, ShiftLIF improves spike-level expressiveness without sacrificing the hardware-friendly nature of standard SNN computation. We evaluate ShiftLIF on 10 datasets spanning wireless, acoustic, motion, and visual sensing tasks. Results show that ShiftLIF consistently matches or exceeds the accuracy of existing multi-level spiking neurons while maintaining synaptic energy consumption close to standard binary LIF. These results indicate that ShiftLIF provides a favorable accuracy-efficiency trade-off for cross-modal edge sensing.
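The output stage implied by the abstract can be sketched as mapping a super-threshold membrane potential to the nearest power-of-two level, so that each synaptic update becomes a shift-and-accumulate on integer weights; the level set and rounding rule below are illustrative choices, not the paper's exact quantizer.

```python
import numpy as np

def shiftlif_spike(v, v_th=1.0, max_exp=3):
    """Return exponent k of the emitted 2**k spike, or None below threshold.
    Logarithmic levels give finer resolution near threshold, where membrane
    potentials concentrate."""
    if v < v_th:
        return None
    return int(np.clip(np.round(np.log2(v / v_th)), 0, max_exp))

def synapse_accumulate(acc_q, w_q, k):
    """Multiplier-free synaptic update: accumulate w * 2**k via a bit shift."""
    return acc_q + (w_q << k)
```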
Associative memory or content-addressable memory is an important component function in computer science and information processing, and at the same time a key concept in cognitive and computational brain science. Many different neural network architectures and learning rules have been proposed to model the brain's associative memory while investigating key component functions like figure-ground segmentation, perceptual reconstruction and rivalry. A less investigated but equally important capability of associative memory is prototype extraction, where the training set comprises distorted prototype instances and the task is to recall the correct generating prototype given a new distorted instance. In this paper we benchmark associative memory function of seven different Hebbian learning rules employed in non-modular and modular recurrent networks with winner-take-all dynamics operating on moderately sparse binary patterns. We measure pattern storage and weight information capacity, prototype extraction capabilities, and sensitivity to correlations in data. The original additive Hebb rule comes out with the worst capacity, covariance learning proves robust but with moderate capacity, and the Bayesian-Hebbian learning rules show the highest capacity in almost all conditions tested.
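Two of the benchmarked rules in schematic form, with learning rates and normalization simplified relative to the paper (binary patterns, batch formulation):

```python
import numpy as np

def hebb_additive(patterns):
    """Original additive Hebb rule: W = sum of outer products of the patterns
    (the rule the benchmark finds weakest in capacity)."""
    X = np.asarray(patterns, dtype=float)
    return X.T @ X

def hebb_covariance(patterns):
    """Covariance rule: correlate deviations from mean activity, which the
    benchmark finds more robust to correlated data than the additive rule."""
    X = np.asarray(patterns, dtype=float)
    D = X - X.mean(axis=0)
    return D.T @ D
```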
Sequence learning reduces to similarity-based retrieval over a temporally indexed representation space, a constraint on any sequence model, not a property of a specific architecture. We show that a spiking Sparse Distributed Memory sequence machine (2007) and the transformer (2017) independently instantiate the same five functional operations (encoding, context maintenance, associative retrieval, storage, and decoding), with cosine similarity as the shared retrieval primitive in both. We formalise a Phase-Latency Isomorphism showing that sinusoidal positional phase and spike timing are linearly related, and prove that dot product attention is invariant to this mapping up to a global scale factor on the positional component (Lemma 1). Empirically, frequency-compressed positional encoding fails to converge on a positionally demanding copy task, while a learned rank-based embedding matches or exceeds sinusoidal encoding, indicating that the critical property for positional representation is distance discriminability under dot-product similarity, not sinusoidal form. Time, phase, and rank are three instantiations of the same computational primitive, an ordered index whose structure survives similarity-based retrieval.
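The shared retrieval primitive is easy to exhibit numerically: in a sinusoidal positional code the phase at frequency $w_k$ is $w_k \cdot \mathrm{pos}$, which a spiking code can realize as a spike latency linear in position, and the dot product between two codes depends only on the position offset, via $\sin a \sin b + \cos a \cos b = \cos(a-b)$. A quick check (dimension and base constants assumed):

```python
import numpy as np

def sinusoidal_pe(pos, d, base=10000.0):
    """Standard sinusoidal positional encoding of dimension d."""
    k = np.arange(d // 2)
    w = base ** (-2 * k / d)                 # per-pair angular frequencies
    return np.concatenate([np.sin(w * pos), np.cos(w * pos)])

p, q = sinusoidal_pe(10, 64), sinusoidal_pe(14, 64)
# p @ q = sum_k cos(w_k * (10 - 14)): a function of the offset only,
# which is the distance-discriminability property the paper isolates.
offset_sim = p @ q
```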
Spiking Neural Networks (SNNs) provide a promising framework for energy-efficient and biologically grounded computation; however, scalable learning in deep recurrent architectures with sparse connectivity remains a major challenge. In this work, we propose a structured multi-layer recurrent SNN architecture composed of locally dense recurrent layers augmented with sparse small-world long-range projections to a readout population. The long-range connectivity is largely fixed, preserving routing efficiency and hardware scalability, while synaptic adaptation is performed using strictly local plasticity mechanisms. To enable supervised learning without backpropagation or surrogate gradients, we introduce a biologically motivated learning framework that combines: (i) population-based winner-take-all (WTA) teaching signals at the output layer, (ii) fixed random broadcast alignment feedback pathways, and (iii) low-dimensional modulatory neuron populations that gate synaptic updates through three-factor learning rules with eligibility traces. This design supports deep recurrent computation with sparse global communication and purely local synaptic updates. We analyze the algorithmic properties, computational complexity, and hardware feasibility of the proposed approach, and demonstrate stable learning and competitive performance on benchmark classification tasks. The results highlight the potential of structured recurrence and neuromodulatory learning to enable scalable, hardware-compatible SNN training beyond gradient-based methods.
High-capacity associative memories based on Kernel Logistic Regression (KLR) exhibit strong storage capabilities, but the dynamical and geometric mechanisms underlying their stability remain poorly understood. This paper investigates the global geometry of attractor basins and the mechanisms governing the storage limit in KLR-trained Hopfield networks. We combine empirical evaluations using random sequences and real-world image embeddings (CIFAR-10) with morphing experiments and statistical Signal-to-Noise Ratio (SNR) analysis. Our experiments show that the network achieves a storage capacity for random sequences up to $P/N \approx 16$, while maintaining stable retrieval for structured data at effective loads near $P/N \approx 20$. Morphing analysis indicates that attractors on the "Ridge of Optimization" are separated by sharp, phase-transition-like boundaries, characterized by steep effective potential barriers and critical slowing down. Furthermore, by comparing an SNR analysis with a geometric reference point inspired by Cover's theorem, we show that the practical storage limit is governed primarily not by a lack of geometric separability in the feature space, but by the loss of dynamical stability against crosstalk noise. These findings suggest that KLR networks function as highly localized exemplar-based memories that operate near the onset of dynamical collapse, providing a useful perspective on the design of robust, large-scale retrieval systems.
In this paper an attractor Fuzzy Cognitive Map (FCM) is created, tested, and analyzed. This FCM is neither Hebbian-based, agentic, nor a hybrid; rather, it is a gradient-descent-based, physics-constrained, Jacobian variant of an FCM. The model has several distinctive features: it uses residual memory, backpropagation through time, and a recursively implemented fixed-point anchor to update its weights. The residuals update the recursive component without erasing the system's memory. The anchor enables convergence to a fixed point, which backpropagation through time unrolls to ensure that error minimization follows an accurate gradient. Furthermore, a new learning algorithm is employed: Newton's method locates the system's fixed-point attractor, while gradient descent adaptively reshapes the landscape; an adaptive term directly manipulates the weights through the attractor dynamics. As this term changes, the descent through the landscape continually adjusts according to sigmoid saturation, preventing premature convergence to a local minimum. Finally, the updates are filtered by a causal mask that encodes the physics and respects the initial expert-based opinions, allowing the model to reduce the error to the target efficiently.
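A minimal sketch of the Newton-based fixed-point step this abstract describes, under our own simplifying assumptions (a sigmoid FCM update x -> sigmoid(Wx) and a hypothetical 4-concept causal mask); the paper's actual formulation may differ:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fcm_fixed_point(W, x0, iters=50):
    """Newton's method on g(x) = sigmoid(W @ x) - x = 0, i.e. locating the
    map's fixed-point attractor; J is the Jacobian of g."""
    x = x0.copy()
    for _ in range(iters):
        s = sigmoid(W @ x)
        g = s - x
        J = (s * (1.0 - s))[:, None] * W - np.eye(len(x))  # diag(s(1-s)) W - I
        x = x - np.linalg.solve(J, g)
    return x

# Hypothetical 4-concept map: the causal mask encodes which expert-asserted
# influences are allowed to carry weight at all.
rng = np.random.default_rng(0)
mask = np.array([[0, 1, 0, 0],
                 [0, 0, 1, 0],
                 [0, 0, 0, 1],
                 [1, 0, 0, 0]], dtype=float)
W = mask * rng.normal(size=(4, 4))
x_star = fcm_fixed_point(W, np.full(4, 0.5))
print(np.allclose(sigmoid(W @ x_star), x_star))  # True: a fixed point
```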
Modeling invasive neural spike data is fundamental to advancing high-performance brain-computer interfaces (BCIs). However, existing approaches face critical challenges, including limited-scale heterogeneous data, cross-domain distribution shift, and the intrinsic spatiotemporal complexity of invasive neural signals. In this work, we propose UniBCI, a unified pretrained model for invasive BCIs. The model integrates three key components: (1) a context-conditioned spatio-temporal tokenization (CST) scheme that embeds neural signals together with metadata into a shared representation space; (2) a hierarchical Interval-Area Attention (IAA) mechanism that captures patterns of spike dynamics in slots via linear attention and locality dependencies via sliding-window attention; and (3) a scalable self-supervised masked signal reconstruction objective for learning generalizable neural representations from large-scale unlabeled data. We construct a pretraining corpus spanning multiple species, subjects, brain regions, and behavioral experiment paradigms. These heterogeneous recordings are standardized via our proposed unified normalization and tokenization. Comprehensive experiments demonstrate that UniBCI achieves SOTA performance across diverse downstream tasks while improving generalization. Moreover, the model achieves a strong balance between accuracy and efficiency, with fewer trainable parameters and lower inference latency. These results suggest that UniBCI provides a practical step toward general-purpose neural foundation models, enabling robust, scalable, and transferable representation learning for invasive neural data. The code for this paper is available at: https://anonymous.4open.science/r/UniBCI-C805.
Expensive optimization problems (EOPs) are black-box tasks with costly objective evaluations and no gradient access, making the evaluation budget the key bottleneck. Surrogate-assisted evolutionary algorithms (SAEAs) reduce evaluations via surrogate predictions, but conventional surrogates often require frequent retraining as populations evolve, incurring overhead. This paper proposes R2SAEA, a reinforcement-trained, relation-based large language model (LLM) surrogate-assisted evolutionary algorithm. We cast relation-based surrogate modeling as an in-context pairwise reasoning task. To enable efficient inference in evolutionary loops, we develop an anchor-based iterative context construction strategy that reduces prompt complexity from quadratic to linear in population size, and a voting-based aggregation scheme that converts predicted relations into scores for offspring selection. We further build an RL pipeline from evolutionary trajectories and fine-tune Qwen2.5 with GRPO. Experiments on single- and multi-objective benchmarks show improved relation prediction and state-of-the-art optimization performance over strong SAEA baselines and general LLMs. Quantization also enables efficient edge deployment, supporting a zero-shot surrogate paradigm without per-generation retraining. Code and models are available at https://github.com/Septend9/R2SAEA.
This paper proposes RCMAES, a novel variant of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) for CEC benchmark optimization. RCMAES integrates a dimension-dependent nonlinear population-size reduction strategy with an adaptive restart mechanism within a pure CMA-ES framework. RCMAES is evaluated on three benchmark suites (CEC2017, CEC2020, and CEC2022) and compared with state-of-the-art DE algorithms as well as its closely related counterpart, BIPOP-aCMAES. Experimental results show that RCMAES achieves competitive and robust performance across all benchmarks.
We present a Spatially Embedded Evolutionary Algorithm in which robot individuals exist in a physically simulated 2D environment, must navigate to encounter potential mates, and compete for survival under various spatially aware selection pressures. Using HyperNEAT-evolved neural controllers for ARIEL gecko-inspired quadrupeds in MuJoCo, we investigate how spatial structure fundamentally alters evolutionary dynamics. Our experiments show a modest 4.9% difference in peak fitness between proximity-based and random pairing, possibly within stochastic variation, while combining spatial parent selection with stochastic death selection produces unstable population dynamics. We discover a continuous phase transition in energy-based selection experiments, with a critical zone count separating extinction-dominated and explosion-dominated regimes. Our density-dependent death selection mechanism achieves 97% completion rates but causes fitness decline, revealing a fundamental dilemma: decoupled mechanisms produce bistable dynamics, positively coupled mechanisms create counter-selection pressures, and only deterministic fitness-based selection maintains stability. These findings provide important constraints for future spatial EA design.
This paper presents an application of the biologically realistic JASTAP neural network model to classification tasks. The JASTAP neural network model is presented as an alternative to the basic multi-layer perceptron model. An evolutionary procedure previously applied to the simultaneous solution of feature selection and neural network training on standard multi-layer perceptrons is extended to the JASTAP model. Preliminary results on the standard IRIS data set give evidence that this extension allows the use of smaller neural networks that can handle noisier data without any degradation in classification accuracy.
We propose EdgeSpike, a co-designed spiking neural network (SNN) framework for autonomous low-power sensing in edge Internet of Things (IoT) architectures. EdgeSpike unifies (i) a hybrid surrogate-gradient and direct-encoding training pipeline, (ii) a hardware-aware neural architecture search (NAS) bounded by per-inference energy and memory budgets, (iii) an event-driven runtime targeting Intel Loihi 2, SpiNNaker 2, and commodity ARM Cortex-M microcontrollers with custom spike-sparse SIMD kernels, and (iv) a lightweight local plasticity rule enabling continual on-device adaptation without backpropagation. The framework is evaluated across five sensing tasks (keyword spotting, vibration-based machine fault detection, surface electromyography gesture recognition, 77 GHz radar human-activity classification, and structural-health acoustic-emission monitoring) on three hardware targets. EdgeSpike achieves a mean classification accuracy of 91.4%, within 1.2 percentage points (pp) of strong INT8 convolutional neural network (CNN) baselines (mean 92.6%), while reducing energy per inference by 18x to 47x on neuromorphic hardware (mean 31x) and by 4.6x to 7.9x on Cortex-M (mean 6.1x). End-to-end latency remains at or below 9.4 ms across all 15 task-hardware configurations. A seven-month, 64-node wireless field deployment confirms a 6.3x extension in projected battery lifetime (from 312 to 1978 days at 2 Wh per node) and bounded accuracy degradation under seasonal drift (0.7 pp with on-device adaptation versus 2.1 pp without). Hardware-aware NAS evaluates 8400 candidates and yields a 12-point Pareto front. EdgeSpike will be released as open source with reproducible training pipelines, hardware-portable runtimes, and benchmark suites.
Stopping criteria automatically determine when to stop an evolutionary algorithm, so as not to waste function evaluations on a stagnant population. Although stopping criteria play an important role in real-world applications, they have attracted little attention in the evolutionary multi-objective optimization (EMO) community. In fact, new stopping criteria for EMO have been rarely developed in recent years. One reason for the stagnation in developing stopping criteria for EMO is a lack of effective benchmarking methodologies. To address this issue, this paper proposes (i) a performance measure of stopping criteria for EMO and (ii) a file-based benchmarking approach. This paper also proposes (iii) a data representation method that effectively stores population states in text files. (i) The proposed measure represents the performance of stopping criteria as a single scalar value, making comparison easy. (ii) The proposed file-based approach not only simplifies the benchmarking process but also facilitates reproducibility. (iii) The proposed data representation method addresses the issue of file size in (ii). We demonstrate the effectiveness of our three contributions (i)--(iii) by benchmarking five representative stopping criteria for EMO.
The Beagle framework, through GPU-based Genetic Programming, enables population dynamics previously unattainable (within practical time frames) by CPU-constrained Genetic Programming systems. This work explores how GPU-enabled population sizes impact the success of training for symbolic regression problems. Specifically, when using constant population sizes, we see benefits of using very narrow and deep searches (as narrow as 1000 individuals) for some problems, while other problems benefit from very broad and shallow searches (as broad as 10 million individuals). We also explore stepped population sizes that start with large populations and drop to small populations to balance the breadth and depth of search.
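The stepped-population idea lends itself to a one-line schedule; the sketch below is an illustrative guess at the shape of such a schedule (all constants are hypothetical placeholders, not Beagle's actual settings):

```python
def stepped_population(generation, start=10_000_000, floor=1_000,
                       step_every=10, factor=10):
    """Drop the population by a constant factor every `step_every`
    generations, trading early breadth for later depth."""
    return max(floor, start // factor ** (generation // step_every))

for g in range(0, 60, 10):
    print(g, stepped_population(g))
```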
Evolutionary methods have long been useful for analysis and explanation in genetics, biology, ecology, and related fields. In this work, we extend these methods to neural networks, specifically large language models (LLMs), to better analyze and explain relationships among models. We show how relating weights to genotypes and output text to phenotypes can improve our understanding of model lineage, important datasets, the roles of different model layers, and visualization of model relationships. We demonstrate this in a controlled experiment, where our estimated evolutionary trees reliably recover the topology of the ground-truth training tree. We further identify the most important weight layers according to weight differences and show through phenotypic experiments that one training dataset appears to contribute more useful information than the others. Finally, we generate an unsupervised evolutionary tree of black-box foundation models. Throughout, we provide visualizations that support a clearer understanding of evolutionary relationships among LLMs.
Multiobjective optimization remains challenging for many scientific and engineering problems due to the need to balance convergence, diversity, and computational efficiency across high-dimensional objective landscapes. This work presents the Multiobjective Animorphic Ensemble Optimization (MAEO) framework, a parallelizable ensemble strategy that unifies state-of-the-art evolutionary algorithms within an island-based architecture, overcoming the limitations of relying on a single optimizer, as implied by the No Free Lunch theorem. MAEO uses a parameter-free hypervolume indicator for island performance assessment and a strict Pareto-rank-based individual scoring formulation that incorporates crowding distance and nadir-point proximity to ensure consistent selection pressure within each front. The framework is initialized with four algorithms (NSGA-III, CTAEA, AGEMOEA2, SPEA2) and evaluated through extensive benchmarking on 12 DTLZ/ZDT functions under 36 dimensionality settings using Wilcoxon signed-rank tests with both hypervolume and inverse generational distance metrics. Results show that MAEO achieves balanced convergence-diversity performance, outperforming or matching some of the leading multiobjective optimization algorithms across different benchmark problems. To demonstrate practical applicability, MAEO is applied to the equilibrium-cycle optimization of a small modular nuclear reactor. Eight discrete design variables and three objectives (levelized cost of electricity, peak soluble boron concentration, fuel cycle length) are optimized under two safety constraints. The algorithm carried out roughly 40,000 evaluations using computer simulations. MAEO identifies core designs that lower both the levelized cost of electricity and the peak boron concentration, while preserving fuel cycle length and meeting all safety constraints.
Spiking Neural Networks (SNNs) have garnered increasing attention as a class of bio-inspired models due to their great potential in neuromorphic computing and sparse computation. Many practical algorithms and techniques have been developed; however, theoretical understanding of their generalization, that is, the extent to which SNNs perform well on unseen data, remains far from complete. Recent advances disclosed an excitation-dependent and architecture-related generalization bound, showing that the Rademacher complexity of SNNs with stochastic firing can be upper bounded by an exponential function of the excitation probability and the architecture depth. In this paper, we theoretically investigate the generalization bounds of SNNs with several integrate-and-fire schemes via Rademacher complexity. We find that the empirical Rademacher complexity of SNNs is closely tied to the SNN configuration: it is exponential in the network depth and the maximum time duration of received spike sequences, superlinear and subquadratic in the network width, polynomial in the parameter norm, inversely linear in the number of training samples, and independent of the computations within spiking neurons, achieving a more precise rate than previous studies. Our theoretical results may broaden the scope of SNN theory and shed some insight into the development of SNNs.
Symbolic regression discovers mathematical formulas from data. Some methods fix a tree of operators, assign learnable weights, and train by gradient descent. The tree's structure, which determines what operators and variables appear at each position, is chosen once and applied to every target. This paper tests whether that choice affects which targets are actually recovered. Three structures are compared, all sharing the same operator and target language but differing in how variables enter the tree; one is strictly more expressive. Across over 12,700 training runs, one structure recovers a target at 100% while another scores 0%, and the ranking reverses on a different target. Expressiveness guarantees that a solution exists in the search space, but not that gradient descent finds it: the most expressive structure fails on targets that a restricted alternative solves reliably. Switching the operator changes which targets succeed; reversing its gradient profile collapses recovery entirely. Balanced (non-chain) tree shapes are never recovered. These findings show that the optimization landscape, not expressiveness alone, determines what gradient-based symbolic regression recovers.
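To make the setup concrete, here is a minimal sketch of one such fixed chain-structured tree trained by gradient descent, with a hand-derived gradient for a sin operator; the target, shape, and hyperparameters are illustrative, not the paper's:

```python
import numpy as np

def chain_tree(x, w):
    """Fixed chain-shaped tree with learnable weights:
    y = w3 * sin(w1 * x + w0) + w2 (structure and operator are frozen)."""
    return w[3] * np.sin(w[1] * x + w[0]) + w[2]

def fit(x, y, steps=5000, lr=0.01):
    w = np.array([0.1, 1.0, 0.0, 1.0])
    for _ in range(steps):
        inner = w[1] * x + w[0]
        err = chain_tree(x, w) - y
        d = np.cos(inner)                                    # d/dz sin(z)
        grad = np.array([np.mean(2 * err * w[3] * d),        # dL/dw0
                         np.mean(2 * err * w[3] * d * x),    # dL/dw1
                         np.mean(2 * err),                   # dL/dw2
                         np.mean(2 * err * np.sin(inner))])  # dL/dw3
        w -= lr * grad
    return w

x = np.linspace(-3, 3, 200)
y = 2.0 * np.sin(1.5 * x - 0.5) + 0.3   # target expressible by the tree
w = fit(x, y)
# Whether gradient descent recovers (w0, w1, w2, w3) = (-0.5, 1.5, 0.3, 2.0)
# depends on the optimization landscape -- precisely the abstract's point.
print(w, np.mean((chain_tree(x, w) - y) ** 2))
```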
Spiking Neural Networks (SNNs) offer a biologically inspired foundation for low-power, event-driven intelligence, yet their direct on-chip supervised training remains a key hardware challenge. This paper presents a multiplication-free, spike-time-based learning algorithm specifically designed for efficient FPGA realization. The proposed approach eliminates floating-point arithmetic and explicit gradient storage, enabling a fully event-driven, digital training pipeline. Implemented on a Xilinx Artix-7 FPGA, the architecture achieves high operating speed and minimal resource usage while maintaining competitive accuracy. These results demonstrate that the learning algorithm effectively maps onto reconfigurable hardware, achieving both computational and energy efficiency. Software simulations further validate scalability, with 96.5\% and 84.8\% accuracy on MNIST and Fashion-MNIST. With its spike-driven and multiplier-free operation, the proposed framework delivers a practical and scalable hardware solution for real-time, on-chip SNN learning in edge environments.
Objective: Decoding visual information from electroencephalography (EEG) is an important problem in neuroscience and brain-computer interface (BCI) research. Existing methods are largely restricted to natural images and categorical representations, with limited capacity to capture structural features and to differentiate objective perception from subjective cognition. We propose a Structure-Guided Diffusion Model (SGDM) that incorporates explicit structural information for EEG-based visual reconstruction. Approach: SGDM is evaluated on the Kilogram abstract visual object dataset and the THINGS natural image dataset using a two-stage generative mechanism. The framework combines a structurally supervised variational autoencoder with a spatiotemporal EEG encoder aligned to a visual embedding space via contrastive learning. Structural information is integrated into a diffusion model through ControlNet to guide image generation from EEG features. Results: SGDM outperforms existing methods on both abstract and natural image datasets. Reconstructed images achieve higher fidelity in low-level visual features and semantic representations, indicating improved decoding accuracy and strong generalization across diverse visual domains. Spatiotemporal analysis of EEG signals further reveals hierarchical structural encoding patterns, consistent with the neural dynamics of visual cognition. Significance: These findings validate the effectiveness of SGDM in capturing explicit structural geometry and generating images with high fidelity to individual cognitive representations. By enabling decoding of complex visual content from EEG signals, the framework extends neural decoding beyond low-dimensional or categorical outputs. This supports BCIs with increased degrees of freedom for intention decoding and more flexible brain-to-machine communication.
An artificial world of barriers and plains scattered with food is used to test the feasibility of using genetic algorithms to optimize Hebbian neural networks on problems without a priori knowledge of the problem domain. A formal L-system-based genetic alphabet for neural networks, titled Lsys, and a neural network genetic modeling tool titled Wp1hgn are introduced. Lsys and Matrix neural network topology genetic encoding methods are compared across 24 experimental runs. Lsys encoding achieved a mean maximum food count of 3802 ± 197 at generation 1000 across 8 runs with varied parameters, compared to 1388 ± 610 for Matrix encoding, a 2.74x performance advantage with an 8.5-fold improvement in consistency as measured by coefficient of variation (5.2% vs 44.0%). All 8 Lsys populations successfully learned to navigate the environment, while 4 of 8 Matrix populations failed to achieve competitive performance at any point during 1000 generations. When transferred to a novel maze environment, Lsys populations demonstrated immediate robust generalization, achieving a mean maximum food count of 2455 ± 176 compared to 422 ± 212 for Matrix populations, a 5.82x advantage that exceeded the training-world performance gap. A MatrixLSG control condition, in which initial populations were generated using Lsys genotypes and then evolved using Matrix operators, demonstrated that the performance advantage of Lsys encoding derives primarily from the genetic algorithm operating on the compressed symbolic Lsys alphabet throughout evolution rather than from initial population structure. Lsys encoding is shown to provide faster convergence, higher peak performance, dramatically greater reliability, and superior generalization to novel environments compared to Matrix encoding across all experimental conditions tested.
Parametrically driven oscillators provide a natural platform for neuromorphic computation, where nonlinear mode coupling and intrinsic dynamics enable both memory and high-dimensional transformation. Here, we investigate a two-mode system exhibiting 2:1 parametric resonance and demonstrate its operation as a reservoir computer across distinct dynamical regimes, including sub-threshold, parametric resonance, and frequency-comb states. By encoding input signals into the drive amplitude and sampling the resulting temporal and spectral responses, we perform one-step-ahead prediction of benchmark chaotic systems, including Mackey-Glass, Rössler, and Lorenz dynamics. We find that optimal computational performance is achieved within the parametric resonance regime, where nonlinear interactions are activated while temporal coherence is preserved. In contrast, although frequency-comb states introduce increased spectral dimensionality, their performance is inconsistent across their existence band and degrades further in the chaotic comb regime due to loss of phase coherence. Mapping prediction error over parameter space reveals a direct correspondence between computational capability and the underlying bifurcation structure, with low-error regions aligned with the parametric resonance boundary. We further show that the input modulation, the detuning from the frequency matching condition, the damping ratio, and the input data rate systematically control the accessible dynamical regimes and thereby the computational performance. These results establish parametric resonance as a robust operating regime for oscillator-based reservoir computing and provide design principles for tuning physical systems toward optimal neuromorphic functionality.
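The readout side of such a reservoir computer is standard ridge regression over sampled responses; the sketch below illustrates that recipe with a delay embedding standing in for the oscillator's temporal and spectral samples (a generic sketch, not the authors' setup):

```python
import numpy as np

def ridge_readout(X, y, lam=1e-6):
    """Train a reservoir computer's linear readout by ridge regression:
    W = (X^T X + lam I)^{-1} X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

t = np.arange(600)
series = np.sin(0.2 * t) * np.cos(0.031 * t)   # toy quasi-periodic signal
k = 30                                          # number of tapped "states"
# Rows are windows of k samples, a delay embedding standing in for
# the physical reservoir's sampled responses.
X = np.stack([series[i:i + 500] for i in range(k)], axis=1)
y = series[k:k + 500]                           # one-step-ahead target
W = ridge_readout(X, y)
print(np.mean((X @ W - y) ** 2))                # small training MSE
```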
Reservoir computing (RC) is an emerging recurrent neural network architecture that has attracted growing attention for its low training cost and modest hardware requirements. Memristor-based circuits are particularly promising for RC, as their intrinsic dynamics can reduce network size and parameter overhead in tasks such as time-series prediction and image recognition. Although RC has been demonstrated with several memristive devices, a comprehensive evaluation of device-level requirements remains limited. In this paper, we analyze and explain the operation of a parallel delayed feedback network (PDFN) RC architecture with volatile memristors, focusing on how device characteristics -- such as decay rate, quantization, and variability -- affect reservoir performance. We further discuss strategies to improve data representation in the reservoir using preprocessing methods and suggest potential improvements. The proposed approach achieves 95.89% classification accuracy on MNIST, comparable with the best reported memristor-based RC implementations. Furthermore, the method maintains high robustness under 20% device variability, achieving an accuracy of up to 94.2%. These results demonstrate that volatile memristors can support reliable spatio-temporal information processing and reinforce their potential as key building blocks for compact, high-speed, and energy-efficient neuromorphic computing systems.
Local Optima Networks (LONs) represent the global structure of search spaces as graphs, but their construction requires iterative execution of a search algorithm to find local optima and approximate transitions between Basins of Attraction (BoAs). In continuous optimization, this high computational cost prevents systematic investigation of the relationship between LON features and evolutionary algorithm performance. To address this issue, we propose an alternative definition of BoAs for Max-Set of Gaussians (MSG) landscapes with explicitly tunable multimodality. This bypasses search-based BoA identification, enabling low-cost LON construction. Moreover, we leverage Novelty Search (NS) to explore the parameter space of the MSG landscape generator, producing instances with diverse graph topologies. Our experiments show that the proposed BoAs closely align with gradient-based BoAs, and that NS successfully generates instances with varied search difficulty and connectivity patterns among optima. Finally, over the instances generated by NS, we predict the success rate of two well-established evolutionary algorithms from LON features. While our LON construction is specific to MSG landscapes, the proposed framework provides a dataset that serves as a foundation for landscape-aware optimization.
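A minimal sketch of a Max-Set of Gaussians landscape under the natural reading of the abstract: the function value is the maximum over Gaussian components, so the maximizing component gives a search-free basin label (the component parameters below are hypothetical):

```python
import numpy as np

def msg_landscape(x, centers, heights, widths):
    """Max-Set of Gaussians: value is the max over Gaussian components,
    so each component induces one basin of attraction with a known
    optimum; argmax serves as an analytic basin label."""
    x = np.asarray(x)
    vals = heights * np.exp(-np.sum((x - centers) ** 2, axis=1)
                            / (2.0 * widths ** 2))
    return vals.max(), int(vals.argmax())

rng = np.random.default_rng(1)
centers = rng.uniform(-5, 5, size=(8, 2))   # 8 optima in 2-D
heights = rng.uniform(0.5, 1.0, size=8)
widths = rng.uniform(0.5, 1.5, size=8)
value, basin = msg_landscape([0.0, 0.0], centers, heights, widths)
print(value, basin)
```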
We present a biologically detailed extension of the classical Hopfield/Marr auto-associative memory model for CA3, implementing ten populations (two asymmetric pyramidal subtypes, eight GABAergic interneuron classes), forty-seven compartments, multi-rule plasticity (recurrent Hebb, BCM anti-saturation, mossy-fiber short-term, endocannabinoid iLTD, burst-gated Hebb), and a bimodal cholinergic encoding/consolidation cycle. Evaluated on pattern completion across auto-associative, associative, and temporal regimes, and on a controlled inhibitory-proportion manipulation at $N{=}256$, the full architecture exhibits \emph{three qualitative signatures absent from a minimal Hopfield baseline}: (i)~multi-attractor cross-seed behaviour at $K{=}5$ with biologically realistic inhibitory proportions, where two of five seeds converge to positive attractors with margin ${+}0.10{-}0.22$ (Cohen's $d{=}0.71$, one-sided $p{=}0.08$); (ii)~target-selective associative recall in paired $(A, B)$ memory at $K{\geq}5$, where the full model retrieves $B$ from a partial cue of $A$ while the minimal model echoes $A$ (Pearson margin $\Delta{=}{+}0.163$ at $K{=}5$); (iii)~reduced cross-seed variance of the full model below the minimal baseline under clean upstream, with ratios $1.0{-}3.0$. These three signatures are architecture-specific: they appear consistently across independent regimes and are absent from the minimal control.
We establish a mathematical correspondence between state space models, a state-of-the-art architecture for capturing long-range dependencies in data, and an exactly solvable nonlinear oscillator network. As a specific example of this general correspondence, we analyze the diagonal linear time-invariant implementation of the Structured State Space Sequence model (S4). The correspondence embeds S4D, a specific implementation of S4, into a ring network topology, in which recent inputs are encoded, as waves of activity traveling over the one-dimensional spatial layout of the network. We then derive an exact operator expression for the full forward pass of S4D, yielding an analytical characterization of its complete input-output map. This expression reveals that the nonlinear decoder in the system induces interactions between these information-carrying waves that enable classifying real-world sequences. These results generalize across modern SSM architectures, and show that they admit an exact mathematical description with a clear physical interpretation. These insights enable a new level of interpretability for these systems in terms of nonlinear oscillator networks.
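For readers unfamiliar with S4D, the diagonal linear time-invariant recurrence at the heart of the correspondence can be written in a few lines; each diagonal mode below is a damped complex oscillator, which is what the ring-network embedding exploits (a sketch with illustrative parameters, not the paper's exact operator expression):

```python
import numpy as np

def diagonal_ssm(u, lam, b, c, dt=0.1):
    """Diagonal LTI state space model (S4D-style):
    x[t+1] = exp(lam*dt) * x[t] + B_bar * u[t],  y[t] = Re(c . x[t]),
    with zero-order-hold discretization of dx/dt = lam*x + b*u."""
    a_bar = np.exp(lam * dt)
    b_bar = (a_bar - 1.0) / lam * b
    x = np.zeros_like(lam)
    ys = []
    for ut in u:
        x = a_bar * x + b_bar * ut
        ys.append(np.real(c @ x))
    return np.array(ys)

n = 16
lam = -0.1 + 1j * np.pi * np.arange(n)     # damped oscillatory modes
y = diagonal_ssm(np.sin(0.3 * np.arange(100)), lam,
                 b=np.ones(n, complex), c=np.ones(n, complex) / n)
print(y[:5])
```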
We extend our gauge-covariant stochastic neural-field framework by promoting architecture-level parameters to slow stochastic variables evolving in function space. Our effective theory is formulated in terms of classical commuting fields and provides symmetry-constrained diagnostics of marginality and finite-width effects through the maximal Lyapunov exponent, the amplification factor, and dressed spectral kernels. On top of this dynamics, we introduce a Markovian evolutionary scheme compatible with the local $U(1)$ structure of the effective model. By using a minimal implementation, the genotype is reduced to the weight-variance parameter $\sigma_w^2$, and the fitness functional combines spectral agreement, marginal stability, and a symmetry-constrained critical anchor. Comparing three evolutionary models, we find that only the fully symmetry-constrained Ginibre $U(1)$ version robustly approaches a narrow near-marginal regime and reproduces the predicted low-frequency finite-width spectral behavior. These results support the use of symmetry-guided effective stability diagnostics as practical principles for stochastic architecture search in controlled settings.
High-capacity associative memories based on Kernel Logistic Regression (KLR) achieve strong retrieval performance but typically require substantial computational resources. This paper investigates the compressibility of KLR Hopfield networks to clarify the geometric principles underlying their robust representations. We present a geometric interpretation based on spontaneous symmetry breaking and Walsh analysis, and examine it through compression experiments involving quantization and pruning. The experiments reveal a clear asymmetry: the network remains robust under low-precision quantization while exhibiting strong sensitivity to pruning. We interpret this behavior through a "sparse function, dense representation" principle, in which a sparse input mapping is implemented through a dense bimodal parameterization. These findings suggest a practical route toward hardware-efficient kernel associative memories and provide insight into the geometric principles underlying robust representation in neural systems.
As LLMs continue to shape real-world applications, automated jailbreak generation becomes essential to reveal safety weaknesses and guide model improvement. Existing automatic jailbreak generation methods have not yet fully considered two important aspects: adaptability to evolving safety-finetuned models, which affects their effectiveness on newer model versions, and diversity in generated prompts, which can cause narrow or repetitive attack patterns. To address these issues, we propose EvoJail, an instruction-fusion-driven evolutionary jailbreak generation framework that formalizes jailbreak prompt generation as a multi-objective black-box optimization problem and leverages the principles of evolutionary algorithms to search for jailbreak prompts that can adapt across different model versions and exhibit diverse attack patterns. Specifically, EvoJail integrates jailbreak prompt generation into an iterative evolutionary loop, where at each iteration candidate prompts are evaluated directly against the target model and then selected and varied based on the target model's responses, enabling the generation process to continuously adapt to model updates. To enhance diversity, EvoJail introduces field-aware instruction fusion to construct diverse starting points and incorporates diversity-aware objectives into the evolutionary fitness function, guiding the search toward prompts with richer semantic variation, while further designing multi-level LLM-based mutation operators that modify prompt structures at different granularities to promote structural diversity throughout the evolutionary process. Results demonstrate that EvoJail has stronger adaptability and can achieve over $93\%$ attack success rate and more than $5.6\%$ improvement in diversity metrics over state-of-the-art methods.
Standard transformer architectures learn fixed slow-weight representations during training and lack mechanisms for rapid adaptation within an episode. In contrast, biological neural systems address this through fast synaptic updates that form transient associative memories during inference, a property known as Hebbian plasticity. In this paper, we conduct an empirical study of Hebbian Fast-Weight (HFW) modules integrated into multiple transformer backbones, including ViT-Small, DeiT-Small, and Swin-Tiny. We evaluate six model variants: ViT, DeiT, Swin, ViT-Hebbian, DeiT-Hebbian, and Swin-Hebbian on 5-way 1-shot and 5-way 5-shot classification tasks using the Omniglot benchmark under a Prototypical Network meta-learning framework. We propose a single module placement strategy for Swin-Tiny in which one HFW module is applied to the final stage feature map after all hierarchical stages have completed. This design avoids the training instability caused by placing separate Hebbian modules at each stage and achieves the highest test accuracy across all six models (96.2\% at 1-shot; 99.2\% at 5-shot), outperforming its non-Hebbian baseline by $+0.3$ percentage points at 1-shot. We analyze the interaction between Swin's shifted window inductive bias and episode-level Hebbian binding, discuss why per-block placement fails for ViT and DeiT variants in a low-data regime, and situate the results within the wider literature on fast and slow-weight meta-learning.
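The core of an HFW module is an outer-product fast-weight memory; the following minimal sketch (our own illustration, with hypothetical decay and learning-rate values) shows the write/read cycle that forms transient associations within an episode:

```python
import numpy as np

class HebbianFastWeights:
    """Hebbian fast-weight memory: outer-product writes accumulate a
    transient key-value map F; reads are a matrix-vector product."""
    def __init__(self, dim, eta=0.5, decay=0.9):
        self.F = np.zeros((dim, dim))
        self.eta, self.decay = eta, decay

    def write(self, key, value):
        self.F = self.decay * self.F + self.eta * np.outer(value, key)

    def read(self, key):
        return self.F @ key

mem = HebbianFastWeights(dim=8)
k, v = np.eye(8)[1], np.eye(8)[5]   # bind key 1 to value 5 within an episode
mem.write(k, v)
print(np.argmax(mem.read(k)))       # -> 5: the bound value is retrieved
```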
Memristive devices present a promising foundation for next-generation information processing by combining memory and computation within a single physical substrate. This unique characteristic enables efficient, fast, and adaptive computing, particularly well suited for deep learning applications. Among recent developments, the memristive-friendly echo state network (MF-ESN) has emerged as a promising approach that combines memristive-inspired dynamics with the training simplicity of reservoir computing, where only the readout layer is learned. Building on this framework, we propose memristive-friendly parallelized reservoirs (MARS), a simplified yet more effective architecture that enables efficient scalable parallel computation and deeper model composition through novel subtractive skip connections. This design yields two key advantages: substantial training speedups of up to 21x over the inherently lightweight echo state network baseline and significantly improved predictive performance. Moreover, MARS demonstrates what is possible with parallel memristive-friendly reservoir computing: on several long sequence benchmarks our compact gradient-free models substantially outperform strong gradient-based sequence models such as LRU, S5, and Mamba, while reducing full training time from minutes or hours down to seconds or even a few hundred milliseconds. Our work positions parallel memristive-friendly computing as a promising route towards scalable neuromorphic learning systems that combine high predictive capability with radically improved computational efficiency, while providing a clear pathway to energy-efficient, low-latency implementations on emerging memristive and in-memory hardware.
Molecular biology features numerous complexes of proteins that coordinate in an interlocking fashion to fulfill different functions. Adaptive evolution explains some of this complexity, but needn't be the default when neutral explanations suffice. A new artificial life model "organism," the Quandary Den, is introduced to explore different neutral evolution scenarios where complexity increases in the absence of greater informational needs. Two interlocking complexity scenarios emerge. Subfunctionalization leads to functionality diffusing through the complex. Masking allows intracomplex interference to accumulate genetically, requiring that it be blocked at the level of expression.
In black-box optimization, a central question is which algorithm to use to solve a given, previously unseen, problem. Selecting a single algorithm, however, entails inherent risks: inaccuracies in the selector may lead to poor choices, and even well-performing algorithms with high variance can yield unsatisfactory results in a single run. A natural remedy is to split the evaluation budget across multiple runs of potentially different algorithms. Such sequential algorithm portfolios benefit from variance reduction and complementarities between algorithms, often outperforming approaches that allocate the entire budget to a single solver.
While effective portfolios can be constructed post-hoc, transferring this idea to the algorithm selection setting is non-trivial. We show that a naive portfolio constructed over the full training set already outperforms the strongest traditional baseline, the virtual best solver. We then propose a simple yet effective k-nearest-neighbor-based finetuning approach to construct portfolios tailored to unseen instances, yielding further improvements and highlighting the effectiveness of portfolio selection in fixed-budget black-box optimization.
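The naive portfolio that this abstract reports as a strong baseline amounts to an even split of the budget across solvers; a minimal sketch under that assumption (solver names and the random-search stand-in are hypothetical, and the paper's k-nearest-neighbor finetuning is not shown):

```python
import random

def run_portfolio(algorithms, objective, total_budget):
    """Sequential portfolio sketch: split the evaluation budget evenly
    across solvers, run each independently, keep the best value found."""
    share = total_budget // len(algorithms)
    results = {name: alg(objective, share) for name, alg in algorithms.items()}
    return min(results.values()), results

def random_search(objective, budget):
    """Stand-in solver: sample uniformly and keep the best evaluation."""
    return min(objective([random.uniform(-5, 5) for _ in range(10)])
               for _ in range(budget))

sphere = lambda x: sum(v * v for v in x)
best, per_alg = run_portfolio({"solver-a": random_search,
                               "solver-b": random_search},
                              sphere, total_budget=2_000)
print(best, per_alg)
```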
Scalability of evolutionary algorithms refers to assessing how their performance changes as problem size increases. In the area of multi-objective optimisation, research on the scalability of multi-objective evolutionary algorithms (MOEAs) has predominantly focussed on continuous problems. However, multi-objective combinatorial optimisation problems (MOCOPs) differ from continuous ones. Their discrete and rigid structure often brings rugged landscape, numerous local optimal solutions and disjoint global optimal regions. This leads to different behaviour of MOEAs. For example, SEMO, a simple MOEA without mating selection and diversity maintenance mechanisms, has been shown to be highly competitive, and in many cases to outperform more sophisticated MOEAs on MOCOPs. Yet, it remains unclear whether such findings hold for large-scale cases. In this paper, we conduct an empirical investigation into the scalability of MOEAs on combinatorial problems, with problem size from 50 to 5,000. Our results show that SEMO experiences a decline in convergence speed as dimensionality increases, compared to other MOEAs such as NSGA-II, SMS-EMOA and MOEA/D. We further demonstrate that the absence of crossover is a major contributor to SEMO's underperformance in large-scale problems, and that incorporating crossover into SEMO can substantially accelerate convergence in general, despite being detrimental in spreading solutions over the Pareto front.
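For reference, SEMO is simple enough to state in full; the sketch below implements the standard archive-based loop on the bi-objective LOTZ toy problem and exposes an optional uniform-crossover branch of the kind this abstract finds beneficial at scale (our own illustrative implementation):

```python
import random

def weakly_dominates(a, b):
    return all(x >= y for x, y in zip(a, b))

def semo(fitness, n, evals=20_000, crossover_rate=0.0):
    """SEMO on bitstrings (maximization): keep an archive of mutually
    non-dominated solutions; each step pick a parent uniformly, create a
    child by one-bit flip (or optionally by uniform crossover), and insert
    it unless it is weakly dominated by an archived solution."""
    s0 = tuple(random.randint(0, 1) for _ in range(n))
    archive = {s0: fitness(s0)}
    for _ in range(evals):
        p = random.choice(list(archive))
        if crossover_rate and len(archive) > 1 and random.random() < crossover_rate:
            q = random.choice(list(archive))
            child = tuple(random.choice(pair) for pair in zip(p, q))
        else:
            i = random.randrange(n)
            child = p[:i] + (1 - p[i],) + p[i + 1:]
        fc = fitness(child)
        if not any(weakly_dominates(f, fc) for f in archive.values()):
            archive = {s: f for s, f in archive.items()
                       if not weakly_dominates(fc, f)}
            archive[child] = fc
    return archive

def lotz(s):
    """Bi-objective LOTZ: (leading ones, trailing zeros)."""
    lo = next((i for i, b in enumerate(s) if b == 0), len(s))
    tz = next((i for i, b in enumerate(reversed(s)) if b == 1), len(s))
    return (lo, tz)

print(sorted(semo(lotz, n=12).values()))
```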
Monotone Boolean functions are a structurally important class of Boolean functions, but their restricted form imposes strong limitations on achievable nonlinearity. In this paper, we investigate whether evolutionary computation can evolve monotone Boolean functions with high nonlinearity, both in the balanced and imbalanced settings. We consider three solution encodings: the standard truth table representation, a balanced truth table encoding that preserves Hamming weight, and a symbolic tree-based genetic programming representation. To guide the search toward monotone increasing functions, we introduce a non-monotonicity penalty and combine it with fitness functions targeting balancedness and nonlinearity. Experimental results are reported for dimensions from $n=5$ to $n=14$. The results show that evolutionary search can discover monotone Boolean functions with nonlinearities clearly exceeding those of majority functions, and in several cases approaching the best currently known values for monotone functions. At the same time, the experiments reveal substantial differences between encodings: the balanced truth table encoding performs poorly for larger dimensions, while the standard truth table and genetic programming encodings remain competitive, with genetic programming becoming especially relevant in the largest tested dimensions.
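The non-monotonicity penalty can be computed by checking only the covering pairs of the Boolean lattice, since monotonicity over one-bit-up neighbours implies monotonicity globally; a minimal sketch (our own reading of such a penalty, not necessarily the authors' exact definition):

```python
def non_monotonicity_penalty(tt, n):
    """Count covering pairs x <= y (y = x with one extra bit set) where
    tt[x] > tt[y]; zero means the truth table `tt` (length 2^n, values
    0/1) describes a monotone increasing Boolean function."""
    viol = 0
    for x in range(2 ** n):
        for i in range(n):
            y = x | (1 << i)        # one-bit-up neighbour of x
            if y != x and tt[x] > tt[y]:
                viol += 1
    return viol

# The majority function on n = 3 variables is monotone -> penalty 0.
n = 3
maj = [1 if bin(x).count("1") > n // 2 else 0 for x in range(2 ** n)]
print(non_monotonicity_penalty(maj, n))  # 0
```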
Due to the increasing frequency and severity of storm events, driven by the escalation of anthropogenic climate change and urban expansion, there is a requirement for increasingly efficient flood risk management strategies. While Blue-Green Infrastructure (BGI) offers a sustainable solution for managing flood risk, optimal implementation is challenging. To help overcome this challenge, this study presents a novel multi-objective optimisation tool that couples a state-of-the-art hydrodynamic model with a bespoke evolutionary algorithm.
The use of a fully dynamic hydrodynamic model enables the tool to accurately evaluate the effectiveness of proposed BGI features with respect to property-scale flood vulnerability and hazard analysis. This contrasts with alternative approaches that use simplified models, which can only reliably predict inundation extents; the proposed optimisation tool therefore provides greater certainty regarding the optimality of the solutions. As a hydrodynamic simulation is required to evaluate each candidate solution, the bespoke evolutionary algorithm is specifically designed to minimise the number of simulations required, ensuring the tool is computationally practical. The effectiveness of the tool in this regard is validated via the derivation of exact convergence measures, for a tractable search space, and via comparisons with benchmark algorithms, for an intractable search space.
Compared with traditional design practices, the proposed tool offers an automated approach capable of efficiently exploring a wide range of solutions, providing decision-makers with a set of optimal solutions from which they can make informed investment decisions. The presented methods provide a robust framework for optimising a variety of BGI features in complex urban environments.
Spiking neural networks (SNNs) are rapidly gaining momentum as an alternative to conventional artificial neural networks in resource constrained edge systems. In this work, we continue a recent research line on recurrent SNNs where axonal delays are learned at runtime along with the other network parameters. The first proposed approach, dubbed DelRec, demonstrated the benefit of recurrent delay learning in SNNs. Here, we extend it by advocating the use of convolutional recurrent connections in conjunction with the DelRec delay learning mechanism. According to our tests on an audio classification task, this leads to a streamlined architecture with smaller memory footprint (around 99% savings in terms of number of recurrent parameters) and a much faster (52x) inference time, while retaining DelRec's accuracy. Our code is available at: https://github.com/luciozebendo/delrec_snn/tree/conv_delays
Designing optimizers that remain effective under tight evaluation budgets is critical in expensive black-box settings such as cardiac digital twinning. We propose Frenetic Cat-inspired Particle Optimization (FCPO), a hybrid swarm method that couples particle swarm optimization-like dynamics with an explicit-state Markov switching controller to schedule exploration and refinement operators online. FCPO integrates (i) state-conditioned bounded motion, (ii) an elite-difference global jump operator to escape stagnation, (iii) eigen-space guided local refinement from elite covariance, and (iv) linear population size reduction to control late-stage computational cost. We benchmark FCPO on five representative functions from the Congress on Evolutionary Computation (CEC) 2022 suite (F1, F2, F3, F6 and F10) at dimensions $D \in \{10, 20\}$ over 30 independent runs, comparing against PSO, CSO, CLPSO, SHADE, L-SHADE and CMA-ES. FCPO achieves the lowest mean runtime across the ten benchmark cases (average 0.183 s), about 2.3x faster than CMA-ES and 2.6x faster than L-SHADE in our Python implementation. On the multimodal composition function F10 at $D = 20$, FCPO attains the best mean objective ($9.625 \times 10^2 \pm 1.275 \times 10^3$) and remains faster than CMA-ES (0.602 s vs. 1.126 s mean runtime). On structured landscapes (F1--F3) and on the hybrid function (F6), CMA-ES remains the most accurate method, while FCPO substantially improves over classical swarms and maintains a favorable accuracy--runtime trade-off. Finally, in a ventricular activation digital twin calibration task, FCPO reaches the target electrocardiogram (ECG) fidelity (RMSE $< 0.1$ mV) within roughly 40 iterations and produces physiologically plausible activation maps with robust convergence across repeated initializations, supporting its use as a practical optimizer for expensive inverse problems.
Experimental evidence indicates that intrinsic temporal dynamics operating across multiple time scales are closely associated with the emergence of periodic spatial activity of increasing complexity. However, how information encoded in grid-like firing patterns for path integration is processed across these intrinsic time scales remains unclear. To address this question, we introduce adaptive time scales through a leak term in recurrent neural networks (RNNs), forming leaky RNNs discretized from the continuous attractors of firing rate models. Our results demonstrate that leaky RNNs substantially enhance the emergence of well-defined and highly regular hexagonal firing patterns. Compared with vanilla RNNs lacking a leak term, the trained leaky RNNs produce more accurate position estimates while generating reliable grid-cell-like representations. Furthermore, under identical noise conditions, leaky RNNs consistently exhibit more stable dynamics and better-defined grid structures. The learned dynamics also give rise to stable torus attractors with a clear central hole, supporting robust and regular grid-like activity. Overall, the dynamic leak acts as a low-pass filtering mechanism that protects recurrent neural circuitry from noise, stabilizes network dynamics, and improves path-integration accuracy in recurrent neural networks.
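The leak term this abstract adds is a one-line change to the vanilla RNN update; a minimal sketch, where alpha plays the role of dt/tau and all sizes are illustrative:

```python
import numpy as np

def leaky_rnn_step(h, x, W, U, alpha=0.1):
    """One leaky RNN step: blend the previous state with the new drive.
    alpha = dt/tau sets the intrinsic time scale; alpha = 1 recovers the
    vanilla RNN, smaller alpha low-pass filters the recurrent dynamics."""
    return (1.0 - alpha) * h + alpha * np.tanh(W @ h + U @ x)

rng = np.random.default_rng(0)
n, m = 64, 2
W = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))
U = rng.normal(size=(n, m))
h = np.zeros(n)
for t in range(100):                 # drive with a 2-D velocity-like input
    h = leaky_rnn_step(h, rng.normal(size=m), W, U)
print(np.linalg.norm(h))
```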
Always-on converter health monitoring demands sub-mW edge inference, a regime inaccessible to GPU-based physics-informed neural networks. This work separates spiking temporal processing from physics enforcement: a three-layer leaky integrate-and-fire SNN estimates passive component parameters while a differentiable ODE solver provides physics-consistent training by decoupling the ODE physics loss from the unrolled spiking loop. On an EMI-corrupted synchronous buck converter benchmark, the SNN reduces lumped resistance error from $25.8\%$ to $10.2\%$ versus a feedforward baseline, within the $\pm 10\%$ manufacturing tolerance of passive components, at a projected ${\sim}270\times$ energy reduction on neuromorphic hardware. Persistent membrane states further enable degradation tracking and event-driven fault detection via a $+5.5$ percentage-point spike-rate jump at abrupt faults. With $93\%$ spike sparsity, the architecture is suited for always-on deployment on Intel Loihi 2 or BrainChip Akida.
This work simulates the developmental process of cortical neurogenesis, initiating from a single stem cell and governed by gene regulatory rules derived from mouse single-cell transcriptomic data. The developmental process spontaneously generates a heterogeneous population of 5,000 cells, yet yields only 85 mature neurons - merely 1.7% of the total population. These 85 neurons form a densely interconnected core of 200,400 synapses, corresponding to an average degree of 4,715 per neuron. At iteration zero, this minimal circuit performs at chance level on MNIST. However, after a single epoch of standard training, accuracy surges to over 90% - a gain exceeding 80 percentage points - with typical runs falling in the 89-94% range depending on developmental stochasticity. The identical circuit, without any architectural modification or data augmentation, achieves 40.53% on CIFAR-10 after one epoch. These findings demonstrate that developmental rules sculpt a domain-general topological substrate exceptionally amenable to rapid learning, suggesting that biological developmental processes inherently encode powerful structural priors for efficient computation.
Pareto optimization via evolutionary multi-objective algorithms has been shown to efficiently solve constrained monotone submodular functions. Traditionally, when solving multiple problems, the algorithm is run for each problem separately. We introduce multitasking formulations of these problems as an effective way to solve multiple related problems with a single run. In our setting the given problems share a monotone submodular function $f$ but have different knapsack constraints. We examine the case where elements within a constraint have the same cost and show that our multitasking formulations result in small Pareto fronts. This allows the population to share solutions between all problems, leading to significant improvements compared to running several classical approaches independently. Using rigorous runtime analysis, we analyze the expected time until the introduced multitasking approaches obtain a $(1-1/e)$-approximation for each of the given problems. Our experimental investigations for the maximum coverage problem give further insight into the dynamics behind when the approach does and does not work in practice, for problems where elements within a constraint also have varied costs.