hub Mixed citations

Title resolution pending

Mixed citation behavior. Most common role is background (62%).

34 Pith papers citing it
Background 62% of classified citations
abstract

This work explores hypernetworks: an approach of using one network, also known as a hypernetwork, to generate the weights for another network. Hypernetworks provide an abstraction that is similar to what is found in nature: the relationship between a genotype - the hypernetwork - and a phenotype - the main network. Though they are also reminiscent of HyperNEAT in evolution, our hypernetworks are trained end-to-end with backpropagation and thus are usually faster. The focus of this work is to make hypernetworks useful for deep convolutional networks and long recurrent networks, where hypernetworks can be viewed as a relaxed form of weight-sharing across layers. Our main result is that hypernetworks can generate non-shared weights for LSTM and achieve near state-of-the-art results on a variety of sequence modelling tasks including character-level language modelling, handwriting generation and neural machine translation, challenging the weight-sharing paradigm for recurrent networks. Our results also show that hypernetworks applied to convolutional networks still achieve respectable results for image recognition tasks compared to state-of-the-art baseline models while requiring fewer learnable parameters.
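
The core idea in the abstract, one network emitting the weights of another, can be illustrated with a small forward-pass sketch. This is not the paper's architecture: the single linear projection, the per-layer embedding `z`, the `tanh` activation, and all names and sizes below are assumptions chosen for illustration, and training by backpropagation is omitted.

```python
import math
import random


def make_hypernetwork(z_dim, out_rows, out_cols, seed=0):
    """Create the hypernetwork's own parameters (hypothetical linear form).

    It maps a layer embedding z of length z_dim to out_rows * out_cols
    weights for the main network.
    """
    rng = random.Random(seed)
    n_weights = out_rows * out_cols
    # One row of projection coefficients per generated weight.
    proj = [[rng.gauss(0.0, 0.1) for _ in range(z_dim)] for _ in range(n_weights)]
    bias = [0.0] * n_weights
    return proj, bias


def generate_weights(hyper, z, out_rows, out_cols):
    """Run the hypernetwork: embedding z -> weight matrix for one main layer."""
    proj, bias = hyper
    flat = [sum(p * zi for p, zi in zip(row, z)) + b for row, b in zip(proj, bias)]
    return [flat[r * out_cols:(r + 1) * out_cols] for r in range(out_rows)]


def main_layer(weights, x):
    """Main-network layer that uses the generated weights."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


# Illustrative sizes (assumptions, not taken from the paper).
Z_DIM, N_OUT, N_IN = 4, 3, 5
hyper = make_hypernetwork(Z_DIM, N_OUT, N_IN)

# One embedding per main-network layer; the same hypernetwork serves both,
# so the layers are related but do not share identical weights.
emb_rng = random.Random(1)
layer_embeddings = [[emb_rng.gauss(0.0, 1.0) for _ in range(Z_DIM)] for _ in range(2)]

x = [0.5, -1.0, 0.25, 0.0, 2.0]
outputs = [
    main_layer(generate_weights(hyper, z, N_OUT, N_IN), x)
    for z in layer_embeddings
]
```

In this sketch the two layers apply different weight matrices to the same input even though both matrices come from one shared set of hypernetwork parameters, which is one way to read the "relaxed form of weight-sharing" described above.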

citation-role summary

background 8 · method 5

representative citing papers

Neural Ordinary Differential Equations

cs.LG · 2018-06-19 · accept · novelty 8.0

Neural networks are redefined as continuous dynamical systems by learning the derivative of the hidden state with a neural network and integrating it with an ODE solver.

UniReg: A Universal Model for Controllable CT Image Registration

cs.CV · 2025-03-17 · unverdicted · novelty 7.0

UniReg introduces a conditional unified neural model for multi-scenario CT registration that conditions on anatomical priors, inter/intra-subject type, and instance features to achieve higher accuracy and cross-scenario generalization than task-specific networks.

Searching for Activation Functions

cs.NE · 2017-10-16 · conditional · novelty 7.0

Automated search discovers Swish activation f(x) = x * sigmoid(βx) that improves top-1 ImageNet accuracy over ReLU by 0.9% on Mobile NASNet-A and 0.6% on Inception-ResNet-v2.
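
The Swish formula quoted in the summary above, f(x) = x * sigmoid(βx), can be written out directly. This is a minimal scalar sketch; the function names, the default `beta=1.0`, and the overflow-safe form of the sigmoid are choices made here for illustration.

```python
import math


def sigmoid(t):
    """Logistic sigmoid, written to avoid overflow in math.exp for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def swish(x, beta=1.0):
    """Swish activation: f(x) = x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)
```

With `beta=0.0` the sigmoid term is a constant 0.5, so the function reduces to the linear map x / 2; for large positive `beta * x` it approaches x, which matches ReLU on the positive side.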
