Neural Architecture Search with Reinforcement Learning
Neural networks are powerful and flexible models that work well for many difficult learning tasks in image, speech and natural language understanding. Despite their success, neural networks are still hard to design. In this paper, we use a recurrent network to generate the model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set. On the CIFAR-10 dataset, our method, starting from scratch, can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. Our CIFAR-10 model achieves a test error rate of 3.65, which is 0.09 percent better and 1.05x faster than the previous state-of-the-art model that used a similar architectural scheme. On the Penn Treebank dataset, our model can compose a novel recurrent cell that outperforms the widely-used LSTM cell, and other state-of-the-art baselines. Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model. The cell can also be transferred to the character language modeling task on PTB and achieves a state-of-the-art perplexity of 1.214.
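The training loop described in the abstract (a controller samples an architecture description, the architecture is scored, and the score is used as a reinforcement learning reward) can be illustrated with a small sketch. This is not the paper's implementation. It makes several simplifying assumptions: the recurrent controller is replaced by independent softmax logits per decision, the search space `SEARCH_SPACE` is a hypothetical three-decision toy, and `toy_reward` is an invented stand-in for validation accuracy, since no child network is actually trained. The update rule is plain REINFORCE with a moving-average baseline.

```python
import math
import random

# Hypothetical toy search space (an assumption for illustration only).
SEARCH_SPACE = {
    "filter_size": [1, 3, 5, 7],
    "num_filters": [24, 36, 48, 64],
    "stride": [1, 2, 3],
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_architecture(logits, rng):
    """Sample one option index per decision from the current policy."""
    arch = {}
    for name, options in SEARCH_SPACE.items():
        probs = softmax(logits[name])
        arch[name] = rng.choices(range(len(options)), weights=probs)[0]
    return arch


def toy_reward(arch):
    """Stand-in for validation accuracy (assumption: no real training run).

    Returns the fraction of decisions matching an arbitrary target
    (filter_size=3, num_filters=64, stride=1).
    """
    target = {"filter_size": 1, "num_filters": 3, "stride": 0}  # option indices
    return sum(arch[k] == target[k] for k in target) / len(target)


def train_controller(steps=3000, lr=0.1, seed=0):
    """REINFORCE with a moving-average baseline over the toy policy."""
    rng = random.Random(seed)
    logits = {name: [0.0] * len(opts) for name, opts in SEARCH_SPACE.items()}
    baseline = 0.0
    for _ in range(steps):
        arch = sample_architecture(logits, rng)
        reward = toy_reward(arch)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        for name, choice in arch.items():
            probs = softmax(logits[name])
            for i, p in enumerate(probs):
                # Gradient of log softmax probability of the sampled choice.
                grad_log_prob = (1.0 if i == choice else 0.0) - p
                logits[name][i] += lr * advantage * grad_log_prob
    return logits


def best_architecture(logits):
    """Return the most probable option for each decision."""
    best = {}
    for name, values in logits.items():
        idx = max(range(len(values)), key=values.__getitem__)
        best[name] = SEARCH_SPACE[name][idx]
    return best


if __name__ == "__main__":
    print(best_architecture(train_controller()))
```

In the setting the abstract describes, the reward would come from training each sampled network and measuring its accuracy on a validation set, which is far more expensive than the toy function used here.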
Forward citations
Cited by 45 Pith papers
-
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling
AutoTTS discovers width-depth test-time scaling controllers through agentic search in a pre-collected trajectory environment, yielding better accuracy-cost tradeoffs than hand-designed baselines on math reasoning task...
-
AGAN: Towards Automated Design of Generative Adversarial Networks
AGAN is the first neural architecture search method for GANs that discovers architectures outperforming state-of-the-art on CIFAR-10 unsupervised image generation and competitive on supervised tasks.
-
1GC-7RC: One Graphic Card -- Seven Research Challenges! How Good Are AI Agents at Doing Your Job?
Introduces the 1GC-7RC benchmark to evaluate AI coding agents on seven diverse ML tasks under single-GPU time and access constraints.
-
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling
AutoTTS discovers superior test-time scaling strategies for LLMs via cheap controller synthesis in a pre-collected trajectory environment, outperforming manual baselines on math benchmarks with low discovery cost.
-
AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery
AutoSOTA uses eight specialized agents to replicate and optimize models from recent AI papers, producing 105 new SOTA results in about five hours per paper on average.
-
Neural Architecture Search of Time-to-First-Spike-Coded Spiking Neural Networks for Efficient Eye-based Emotion Recognition
TNAS-ER uses an ANN-assisted evolutionary search to optimize TTFS SNN architectures, achieving high emotion recognition performance with improved energy efficiency on neuromorphic hardware.
-
Soft Head Selection for Injecting ICL-Derived Task Embeddings
SITE applies soft gradient-based head selection to inject ICL-derived task embeddings, outperforming prior embedding adaptation and few-shot ICL across generation, reasoning, and NLU tasks on 12 LLMs from 4B to 70B pa...
-
COCO-Inpaint: A Benchmark for Detecting and Localizing Inpainting-Based Image Manipulations
COCO-Inpaint supplies a large-scale dataset and evaluation protocol focused on inpainting-based image forgeries to benchmark existing detection methods.
-
Auto-FP: An Experimental Study of Automated Feature Preprocessing for Tabular Data
Experimental comparison of 15 HPO and NAS algorithms for automated feature preprocessing on 45 tabular datasets finds evolution-based methods and random search as top performers.
-
Learning to learn with quantum neural networks via classical neural networks
Classical RNNs trained on small instances provide parameter initializations for QAOA and VQE that reduce total optimization iterations and generalize across problem sizes.
-
Blending-target Domain Adaptation by Adversarial Meta-Adaptation Networks
AMEAN applies adversarial meta-learning to discover implicit meta-sub-target clusters in blended target data, reducing intra-target category misalignment and outperforming standard DA methods on three BTDA benchmarks.
-
Neural Network Architecture Search with Differentiable Cartesian Genetic Programming for Regression
dCGPANN encodes neural nets so evolutionary operators can rewire, prune, adapt activations and add skips while gradient descent tunes parameters, yielding smaller networks with lower regression error in fixed time.
-
NetTailor: Tuning the Architecture, Not Just the Weights
NetTailor adapts CNN architecture for new tasks by assembling pre-trained universal blocks with task-specific layers, trained via activation mimicry and complexity penalties to match accuracy while reducing size for s...
-
Surrogate Neural Architecture Codesign Package (SNAC-Pack)
SNAC-Pack automates hardware-aware neural architecture codesign for FPGAs via surrogate-based multi-objective search, QAT/pruning compression, and hls4ml synthesis, yielding compact models with reduced resources on je...
-
RELO: Reinforcement Learning to Localize for Visual Object Tracking
RELO formulates visual object tracking localization as a Markov decision process solved by reinforcement learning with combined IoU and AUC rewards, augmented by layer-aligned temporal token propagation, and reports 5...
-
RELO: Reinforcement Learning to Localize for Visual Object Tracking
RELO replaces handcrafted spatial priors with a reinforcement learning policy for target localization in visual tracking and reports 57.5% AUC on LaSOText without template updates.
-
OMEGA: Optimizing Machine Learning by Evaluating Generated Algorithms
OMEGA framework generates novel ML classifiers via meta-prompts and executable code that outperform scikit-learn baselines on 20 benchmark datasets.
-
TRON: Trainable, architecture-reconfigurable random optical neural networks
TRON demonstrates a trainable and reconfigurable optical neural network that combines multi-scattering media with DMD-based matrix multiplication and performs in-situ optimization plus neural architecture search on th...
-
DeepFedNAS: Efficient Hardware-Aware Architecture Adaptation for Heterogeneous IoT Federations via Pareto-Guided Supernet Training
DeepFedNAS delivers up to 1.21% higher accuracy and 61x faster architecture search for federated learning on heterogeneous IoT by replacing random supernet sampling with Pareto-optimal elite architectures and using a ...
-
LLaVA-Video: Video Instruction Tuning With Synthetic Data
LLaVA-Video-178K is a new synthetic video instruction dataset that, when combined with existing data to train LLaVA-Video, produces strong results on video understanding benchmarks.
-
AutoPV: Automatically Design Your Photovoltaic Power Forecasting Model
AutoPV applies neural architecture search with a custom search space drawn from time series forecasting and photovoltaic models to automatically produce architectures that outperform predefined state-of-the-art models...
-
Learnable Parameter Similarity
LPS uses a second-order neural network to learn an end-to-end metric for second-order parameter similarity and introduces the ModelSet500 benchmark with 500 trained models.
-
Video Action Recognition Via Neural Architecture Searching
Uses differentiable NAS with temporal segments and pseudo-3D operators to discover a video action recognition network that outperforms hand-designed models on UCF101 with ~1% of the parameters when trained from scratch.
-
Sibyl-AutoResearch: Autonomous Research Needs Self-Evolving Trial-and-Error Harnesses, Not Paper Generators
Sibyl-AutoResearch introduces self-evolving trial-and-error harnesses with auditable conversion units that link trial signals to updated research behaviors and harness repairs in autonomous systems.
-
Heterogeneous Connectivity in Sparse Networks: Fan-in Profiles, Gradient Hierarchy, and Topological Equilibria
Arbitrary heterogeneous fan-in profiles in sparse networks match uniform random accuracy at high sparsity, but initializing RigL dynamic sparse training with equilibrium-matched lognormal profiles improves performance...
-
From LLM to Silicon: RL-Driven ASIC Architecture Exploration for On-Device AI Inference
An RL agent using Soft Actor-Critic with Mixture-of-Experts jointly optimizes ASIC architecture, memory hierarchy, and partitioning for AI inference, achieving 29809 tokens/s for Llama 3.1 at 3nm and under 13mW for Sm...
-
Efficient Accelerated Graph Edit Distance Computation on GPU
FAST-GED delivers orders-of-magnitude speedups over NetworkX for graph edit distance on GPUs while often reaching optimal solutions and outperforming approximate methods.
-
Optimized Architectures for Kolmogorov-Arnold Networks
Overprovisioned KANs with sparsification, deep supervision, and depth selection under differentiable MDL yield smaller models with competitive accuracy on benchmarks.
-
CoLLM-NAS: Collaborative Large Language Models for Efficient Knowledge-Guided Neural Architecture Search
CoLLM-NAS introduces a collaborative two-LLM framework with Navigator, Generator, and Coordinator modules to perform knowledge-guided neural architecture search, reporting state-of-the-art results on ImageNet and NAS-...
-
Bridging the phenotype-target gap for molecular generation via multi-objective reinforcement learning
SmilesGEN uses dual VAEs to jointly model drug structures and transcriptional responses, generating molecules with higher validity, novelty, and similarity to known ligands than prior methods.
-
Implantable Adaptive Cells: A Novel Enhancement for Pre-Trained U-Nets in Medical Image Segmentation
Introduces Implantable Adaptive Cells inserted into pre-trained U-Nets via Partially-Connected DARTS to achieve approximately 5 percentage point gains in segmentation accuracy on four medical MRI/CT datasets.
-
EPNAS: Efficient Progressive Neural Architecture Search
EPNAS uses a progressive search policy with REINFORCE performance prediction to search neural architectures in parallel, supporting multiple resource constraints and outperforming ENAS and PNAS on CIFAR-10 and ImageNe...
-
ARMIN: Towards a More Efficient and Light-weight Recurrent Memory Network
ARMIN introduces auto-addressing via hidden states and a novel RNN cell to produce a lighter recurrent memory network with lower overhead than existing MANNs or vanilla LSTMs.
-
Learning to Cope with Adversarial Attacks
MLAH agent in deep RL demonstrates hierarchical coping mechanisms and improved reward maintenance under spaced adversarial attacks, at the expense of stability.
-
Hyp-RL : Hyperparameter Optimization by Reinforcement Learning
Reinforcement learning selects hyperparameters sequentially by learning from actual future validation loss reductions and outperforms SMBO methods on 50 datasets.
-
BayMOTH: Bayesian optiMizatiOn with meTa-lookahead -- a simple approacH
BayMOTH unifies meta-Bayesian optimization with a usefulness-based fallback to lookahead, demonstrating competitive results on function optimization tasks even under low task relatedness.
-
Exploring Vision Neural Network Pruning via Screening Methodology
A unified F-statistic screening and weighted evaluation method prunes both unstructured and structured parameters in FNNs and CNNs, claiming order-of-magnitude size reduction with competitive accuracy on vision datasets.
-
Training speedups via batching for geometric learning: an analysis of static and dynamic algorithms
Experiments on QM9 and AFLOW datasets show that static and dynamic batching for GNNs can yield up to 2.7x training speedups depending on data, model, batch size, hardware, and training steps, with occasional differenc...
-
Self-Adaptive 2D-3D Ensemble of Fully Convolutional Networks for Medical Image Segmentation
Self-adaptive 2D-3D FCN ensemble optimized by multiobjective evolution for prostate segmentation on PROMISE12 achieves top-10 ranking with smaller size than prior auto-designed models.
-
MLFriend: Interactive Prediction Task Recommendation for Event-Driven Time-Series Data
MLFriend enumerates prediction tasks for event-driven time-series data and interactively recommends useful ones, with evaluation on three datasets yielding 2885 tasks of which 722 were deemed useful by experts.
-
Split and Aggregation Learning for Foundation Models Over Mobile Embodied AI Network (MEAN): A Comprehensive Survey
The paper surveys split and aggregation learning for foundation models in 6G networks to improve efficiency, resource use, and data privacy in distributed AI.
-
Genetic Deep Learning for Lung Cancer Screening
Genetic algorithm designs a CNN for lung cancer detection in CXRs achieving 97.15% accuracy, outperforming Inception-V3 and ResNet-152 with 4x and 14x fewer parameters.
-
A Unified Deep Framework for Joint 3D Pose Estimation and Action Recognition from a Single RGB Camera
A multitask framework lifts 2D keypoints to 3D poses via a two-stream network then applies ENAS to model spatio-temporal pose evolution for action recognition on Human3.6M, MSR Action3D and SBU datasets.
-
Genetic Network Architecture Search
Genetic algorithm searches convolution cell architectures with weight sharing via SGD, reporting 96% accuracy on CIFAR10 and 80.1% on CIFAR100.
-
Spiking Neural Network Architecture Search: A Survey
A survey of Spiking Neural Network architecture search techniques viewed through a hardware/software co-design lens.