pith. machine review for the scientific record.

arxiv: 1708.05344 · v1 · submitted 2017-08-17 · 💻 cs.LG

Recognition: unknown

SMASH: One-Shot Model Architecture Search through HyperNetworks

Authors on Pith: no claims yet
classification 💻 cs.LG
keywords: architecture, model, networks, search, smash, architectures, performance, range
original abstract

Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model's architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectively search over a wide range of architectures at the cost of a single training run. To facilitate this search, we develop a flexible mechanism based on memory read-writes that allows us to define a wide range of network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases. We validate our method (SMASH) on CIFAR-10 and CIFAR-100, STL-10, ModelNet10, and Imagenet32x32, achieving competitive performance with similarly-sized hand-designed networks. Our code is available at https://github.com/ajbrock/SMASH
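The one-shot idea in the abstract — generate a main model's weights from an architecture encoding, then rank architectures by validation loss without per-architecture training — can be illustrated with a minimal NumPy sketch. Everything here is hypothetical: the data, the one-hot "architecture" (just a hidden width), and the HyperNet, which is left as an untrained random linear map for brevity, whereas SMASH trains it end-to-end against the main model's loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a validation set: linearly separable 2-class data.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

WIDTHS = [2, 4, 8]   # the tiny "architecture space": one hidden-width choice
MAX_W = max(WIDTHS)  # HyperNet always emits weights for the largest width

def encode_arch(width):
    """One-hot encoding of the architecture choice."""
    e = np.zeros(len(WIDTHS))
    e[WIDTHS.index(width)] = 1.0
    return e

# "HyperNet": a linear map from the architecture encoding to a flat
# weight vector for the main model. Untrained here (illustration only);
# SMASH learns this mapping jointly with sampled architectures.
H = rng.normal(scale=0.1, size=(len(WIDTHS), 4 * MAX_W + MAX_W))

def main_model_loss(width):
    """Validation loss of the main model under HyperNet-generated weights."""
    w_flat = encode_arch(width) @ H
    W1 = w_flat[: 4 * MAX_W].reshape(4, MAX_W)[:, :width]  # input -> hidden
    W2 = w_flat[4 * MAX_W:][:width]                        # hidden -> logit
    logits = np.tanh(X @ W1) @ W2
    p = 1.0 / (1.0 + np.exp(-logits))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

# One-shot search: compare architectures by relative validation loss,
# at the cost of (in SMASH) a single HyperNet training run.
scores = {w: main_model_loss(w) for w in WIDTHS}
best = min(scores, key=scores.get)
print("best width:", best)
```

With a trained HyperNet, the ranking produced by `scores` is what SMASH uses to pick an architecture, which is then re-trained from scratch with ordinary weights.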

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. RELO: Reinforcement Learning to Localize for Visual Object Tracking

    cs.CV 2026-05 unverdicted novelty 6.0

    RELO replaces handcrafted spatial priors with a reinforcement learning policy for target localization in visual tracking and reports 57.5% AUC on LaSOText without template updates.

  2. Representation-Aligned Multi-Scale Personalization for Federated Learning

    cs.LG 2026-04 unverdicted novelty 5.0

    FRAMP generates client-specific models from compact descriptors in federated learning, trains tailored submodels, and aligns representations to balance personalization with global consistency.