A learning algorithm for continually running fully recurrent neural networks
4 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: method (1)
Citation-polarity summary: use method (1)
Verdicts: unverdicted (4)
Citing papers
- Moonwalk: Inverse-Forward Differentiation
  Moonwalk enables memory-efficient training of deep networks via mixed-mode gradient computation with vector-inverse-Jacobian products for submersive layers and fragmental checkpointing otherwise, matching backprop runtime at over twice the depth.
- Resolving the bias-precision paradox with stochastic causal representation learning for personalized medicine
  Stochastic causal representation learning with sMMD resolves the bias-precision paradox, improving accuracy under distribution shift by up to 11.5% and clinician accuracy by 14.7% on large ICU cohorts.
- Control-Augmented Autoregressive Diffusion for Data Assimilation
  An offline-trained controller augments autoregressive diffusion models to perform fast, feed-forward data assimilation in chaotic spatiotemporal PDEs with order-of-magnitude speedups and improved accuracy over baselines.
- Hierarchical Sequence to Sequence Voice Conversion with Limited Data
  Hierarchical seq2seq model for parallel voice conversion, pretrained as an autoencoder on single-speaker data and then adapted to limited multispeaker data, using mel spectrograms converted via a WaveNet vocoder.