Stabilizing Equilibrium Models by Jacobian Regularization

J. Zico Kolter; Shaojie Bai; Vladlen Koltun

arXiv: 2106.14342 · v1 · pith:SJ7Q4MUMnew · submitted 2021-06-28 · cs.LG · stat.ML

Shaojie Bai, Vladlen Koltun, J. Zico Kolter

keywords models · deep · equilibrium · networks · regularization · architectural · deqs · fixed-point

Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer. These models have been shown to achieve performance competitive with the state-of-the-art deep networks while using significantly less memory. Yet they are also slower, brittle to architectural choices, and introduce potential instability to the model. In this paper, we propose a regularization scheme for DEQ models that explicitly regularizes the Jacobian of the fixed-point update equations to stabilize the learning of equilibrium models. We show that this regularization adds only minimal computational cost, significantly stabilizes the fixed-point convergence in both forward and backward passes, and scales well to high-dimensional, realistic domains (e.g., WikiText-103 language modeling and ImageNet classification). Using this method, we demonstrate, for the first time, an implicit-depth model that runs with approximately the same speed and level of performance as popular conventional deep networks such as ResNet-101, while still maintaining the constant memory footprint and architectural simplicity of DEQs. Code is available at https://github.com/locuslab/deq.
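
To make the idea in the abstract concrete, here is a minimal sketch of a fixed-point layer and a Jacobian penalty computed at its equilibrium. The abstract does not state the form of the penalty, so everything specific here is an assumption for illustration: the toy layer f(z, x) = tanh(Wz + x), the choice of the squared Frobenius norm, the Hutchinson-style random-probe estimator, and all names (`layer`, `solve_fixed_point`, `jacobian`, `hutchinson_frobenius_sq`, `W`, `x`). It is not the code from the linked repository.

```python
import math
import random

def layer(W, z, x):
    """One application of the toy layer f(z, x) = tanh(W z + x)."""
    return [math.tanh(sum(w * zj for w, zj in zip(row, z)) + xi)
            for row, xi in zip(W, x)]

def solve_fixed_point(W, x, tol=1e-10, max_iter=500):
    """Naive forward iteration z <- f(z, x) until successive iterates agree."""
    z = [0.0] * len(x)
    for step in range(1, max_iter + 1):
        z_next = layer(W, z, x)
        residual = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
        if residual < tol:
            return z, step
    return z, max_iter

def jacobian(W, z, x):
    """Explicit Jacobian of f w.r.t. z: J[i][j] = (1 - f_i^2) * W[i][j]."""
    out = layer(W, z, x)
    return [[(1.0 - o * o) * w for w in row] for o, row in zip(out, W)]

def hutchinson_frobenius_sq(J, num_samples, rng):
    """Estimate ||J||_F^2 as the mean of ||eps^T J||^2 over eps ~ N(0, I)."""
    n = len(J)
    total = 0.0
    for _ in range(num_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Vector-Jacobian product eps^T J (formed explicitly in this toy case).
        vjp = [sum(eps[i] * J[i][j] for i in range(n)) for j in range(n)]
        total += sum(v * v for v in vjp)
    return total / num_samples

# Hypothetical toy weights; every row has absolute sum below 1, so the
# layer is a contraction in the max norm and forward iteration converges.
W = [[0.2, -0.1, 0.0],
     [0.1, 0.3, -0.2],
     [-0.3, 0.1, 0.1]]
x = [0.5, -0.4, 0.1]

z_star, steps = solve_fixed_point(W, x)
J = jacobian(W, z_star, x)
exact = sum(v * v for row in J for v in row)           # exact ||J||_F^2
estimate = hutchinson_frobenius_sq(J, 4000, random.Random(0))
print(steps, exact, estimate)
```

In a training loop, a term such as `gamma * estimate` (with `gamma` a hypothetical weight) would be added to the task loss. In a real model the vector-Jacobian product would come from automatic differentiation rather than an explicit Jacobian, and far fewer probes would be drawn per step.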

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

One Step Forward and K Steps Back: Better Reasoning with Denoising Recursion Models
cs.LG 2026-04 unverdicted novelty 6.0

Denoising Recursion Models train multi-step noise reversal in looped transformers and outperform the prior Tiny Recursion Model on ARC-AGI.

Stabilizing Recurrent Dynamics for Test-Time Scalable Latent Reasoning in Looped Language Models
cs.LG 2026-05 unverdicted novelty 5.0

STARS trains looped language models with Jacobian spectral radius regularization and random loop sampling to drive latent states toward asymptotically stable fixed points, yielding reliable test-time scaling on arithm...