DARTS: Differentiable Architecture Search
Abstract
This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. Our implementation has been made publicly available to facilitate further research on efficient architecture search algorithms.
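The core idea in the abstract — replacing a discrete choice among candidate operations with a softmax-weighted mixture so the architecture itself becomes differentiable — can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the operation set, the `mixed_op` and `discretize` names, and the use of NumPy instead of a deep-learning framework are all simplifications for exposition.

```python
import numpy as np

def softmax(alpha):
    """Numerically stable softmax over architecture parameters."""
    e = np.exp(alpha - alpha.max())
    return e / e.sum()

# Toy candidate operations (stand-ins for conv, pooling, identity, zero, etc.)
OPS = [
    lambda x: x,                  # identity / skip connection
    lambda x: np.maximum(x, 0),   # a nonlinear op (stand-in for a conv branch)
    lambda x: np.zeros_like(x),   # "zero" op, i.e. no connection
]

def mixed_op(x, alpha):
    """Continuous relaxation: instead of picking ONE op, output the
    softmax(alpha)-weighted sum of ALL candidate ops. Because this is
    differentiable in alpha, the architecture parameters can be trained
    by gradient descent alongside the network weights."""
    w = softmax(alpha)
    return sum(wi * op(x) for wi, op in zip(w, OPS))

def discretize(alpha):
    """After search, recover a discrete architecture by keeping the op
    with the largest architecture weight on each edge."""
    return int(np.argmax(alpha))
```

With equal architecture weights the mixed op averages all candidates; as gradient descent pushes one weight up, the mixture concentrates on that op, and `discretize` reads off the final choice.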
This paper has not been read by Pith yet.
Forward citations
Cited by 10 Pith papers
- Lattice fermion formulation via Physics-Informed Neural Networks: Ginsparg-Wilson relation and Overlap fermions
  Physics-informed neural networks construct lattice Dirac operators satisfying the Ginsparg-Wilson relation, reproducing overlap fermions to high accuracy and autonomously discovering both the standard and generalized Fujikawa-type versions of the relation.
- AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery
  AutoSOTA uses eight specialized agents to replicate and optimize models from recent AI papers, producing 105 new SOTA results in about five hours per paper on average.
- PEML: Parameter-efficient Multi-Task Learning with Optimized Continuous Prompts
  PEML co-optimizes continuous prompts and low-rank adaptations to deliver up to 6.67% average accuracy gains over existing multi-task PEFT methods on GLUE, SuperGLUE, and other benchmarks.
- CHAL: Council of Hierarchical Agentic Language
  CHAL is a multi-agent dialectic system that performs structured belief optimization over defeasible domains using Bayesian-inspired graph representations and configurable meta-cognitive value system hyperparameters.
- SURGE: Surrogate Gradient Adaptation in Binary Neural Networks
  SURGE proposes a dual-path gradient compensator and adaptive scaler to learn better surrogate gradients for binary neural network training, outperforming prior methods on classification, detection, and language tasks.
- Taming Asynchronous CPU-GPU Coupling for Frequency-aware Latency Estimation on Mobile Edge
  FLAME models layer-wise overlapping parallelism and asynchronous CPU-GPU pipeline bubbles to estimate inference latency across frequencies with sparse profiling and low error for DNNs and SLMs.
- LLaVA-Video: Video Instruction Tuning With Synthetic Data
  LLaVA-Video-178K is a new synthetic video instruction dataset that, when combined with existing data to train LLaVA-Video, produces strong results on video understanding benchmarks.
- End-to-end Automated Deep Neural Network Optimization for PPG-based Blood Pressure Estimation on Wearables
  An end-to-end hardware-aware optimization pipeline produces DNNs for PPG-based blood pressure estimation with up to 7.99% lower error and 83x fewer parameters that fit on ultra-low-power SoCs like GAP8.
- Deployment-Aligned Low-Precision Neural Architecture Search for Spaceborne Edge AI
  Deployment-aligned low-precision NAS recovers about two-thirds of the accuracy drop from post-training quantization, achieving 0.826 mIoU on-device for a 95k-parameter model on Intel Movidius Myriad X without added co...