Instance Normalization: The Missing Ingredient for Fast Stylization

Andrea Vedaldi, Dmitry Ulyanov, Victor Lempitsky

classification: cs.CV

keywords: normalization, stylization, change, fast, github, instance, method, apply

In this paper we revisit the fast stylization method introduced in Ulyanov et al. (2016). We show how a small change in the stylization architecture results in a significant qualitative improvement in the generated images. The change is limited to swapping batch normalization with instance normalization, and to applying the latter at both training and testing times. The resulting method can be used to train high-performance architectures for real-time image generation. The code is available on GitHub at https://github.com/DmitryUlyanov/texture_nets. The full paper can be found at arXiv:1701.02096.
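To make the swap described in the abstract concrete, here is a minimal pure-Python sketch of the two normalizations. It is an illustration, not the authors' implementation: the function names, the nested-list layout `[image][channel][pixel]`, the `EPS` constant, and the omission of learnable scale and shift parameters are all assumptions made for brevity.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # assumed small constant for numerical stability


def _normalize(values, mean, var):
    """Shift and scale a flat list of pixel values with the given statistics."""
    return [(v - mean) / (var + EPS) ** 0.5 for v in values]


def instance_norm(batch):
    """Normalize each (image, channel) plane with its own mean and variance.

    `batch` is a nested list indexed as [image][channel][pixel].
    """
    return [
        [_normalize(ch, fmean(ch), pvariance(ch)) for ch in image]
        for image in batch
    ]


def batch_norm_train(batch):
    """Training-mode batch norm: one mean and variance per channel,
    pooled over every image in the batch."""
    num_channels = len(batch[0])
    stats = []
    for c in range(num_channels):
        pooled = [v for image in batch for v in image[c]]
        stats.append((fmean(pooled), pvariance(pooled)))
    return [
        [_normalize(ch, *stats[c]) for c, ch in enumerate(image)]
        for image in batch
    ]


# Two single-channel "images" that differ only in contrast.
low = [[0.0, 1.0, 2.0, 3.0]]
high = [[0.0, 10.0, 20.0, 30.0]]

# Instance norm treats each image independently of the rest of the batch.
same = instance_norm([low, high])[0] == instance_norm([low])[0]
# Batch norm in training mode mixes statistics across images.
mixed = batch_norm_train([low, high])[0] != batch_norm_train([low])[0]
```

Because `instance_norm` uses only per-image statistics, it computes the same function at training and testing times, which matches the abstract's requirement that the layer be applied in both phases. In this toy example the low-contrast and high-contrast images also normalize to nearly identical values, differing only through `EPS`.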

Forward citations

Cited by 16 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

QuadNorm: Resolution-Robust Normalization for Neural Operators
cs.LG 2026-05 unverdicted novelty 7.0

QuadNorm uses quadrature-based moments instead of uniform averaging in normalization layers, achieving O(h²) consistency across resolutions and better cross-resolution transfer in neural operators.

Every Feedforward Neural Network Definable in an o-Minimal Structure Has Finite Sample Complexity
stat.ML 2026-05 unverdicted novelty 7.0

Every fixed finite feedforward neural network definable in an o-minimal structure has finite sample complexity in the agnostic PAC setting.

Normalization Equivariance for Arbitrary Backbones, with Application to Image Denoising
cs.CV 2026-05 unverdicted novelty 7.0

A normalize-process-denormalize wrapper enforces normalization equivariance on arbitrary backbones, improving robustness to distribution shift in image denoising with no overhead.

StyleID: A Perception-Aware Dataset and Metric for Stylization-Agnostic Facial Identity Recognition
cs.GR 2026-04 unverdicted novelty 7.0

StyleID supplies human-perception-aligned benchmarks and fine-tuned encoders that improve facial identity recognition robustness across stylization types and strengths.

High-Speed Full-Color HDR Imaging via Unwrapping Modulo-Encoded Spike Streams
cs.CV 2026-04 unverdicted novelty 7.0

An exposure-decoupled modulo formulation and iteration-free diffusion-prior unwrapping enable 1000 FPS full-color HDR imaging on spike cameras while cutting bandwidth from 20 Gbps to 6 Gbps.

A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
cs.LG 2022-11 conditional novelty 7.0

PatchTST uses subseries patching and channel-independent Transformers to deliver significantly better long-term multivariate time series forecasting and strong self-supervised transfer performance.

Rethinking Constraint Awareness for Efficient State Embedding of Neural Routing Solver
cs.AI 2026-05 unverdicted novelty 6.0

The CARM module boosts neural routing solvers by adaptively modulating embeddings with constraint variables, enabling better use of global observations and improved performance on constrained VRPs.

Linearizing Vision Transformer with Test-Time Training
cs.CV 2026-05 unverdicted novelty 6.0

Using Test-Time Training's structural match to Softmax attention plus key normalization and locality modules allows inheriting pretrained weights and fine-tuning Stable Diffusion 3.5 in one hour to match quality while...

Are Natural-Domain Foundation Models Effective for Accelerated Cardiac MRI Reconstruction?
eess.IV 2026-04 unverdicted novelty 6.0

Natural-domain foundation models provide competitive and more robust priors than task-specific models for accelerated cardiac MRI reconstruction in cross-domain settings.

A fast and Generic Energy-Shifting Transformer for Hybrid Monte Carlo Radiotherapy Calculation
physics.med-ph 2026-04 unverdicted novelty 6.0

A hybrid Transformer-UNet model with energy-shifting inputs generates 6 MV LINAC dose maps from monoenergetic data, achieving over 98% gamma passing rate (3%/3mm) versus full Monte Carlo for prostate radiotherapy.

Time-Domain Voice Identity Morphing (TD-VIM): A Signal-Level Approach to Morphing Attacks on Speaker Verification Systems
cs.SD 2026-04 unverdicted novelty 6.0

TD-VIM creates signal-level morphed voice samples that achieve G-MAP attack success rates up to 99.74% against deep-learning and commercial speaker verification systems.

Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
cs.LG 2021-04 accept novelty 6.0

Geometric deep learning provides a unified mathematical framework based on grids, groups, graphs, geodesics, and gauges to explain and extend neural network architectures by incorporating physical regularities.

USEMA: a Scalable Efficient Mamba Like Attention for Medical Image Segmentation
cs.CV 2026-05 unverdicted novelty 5.0

USEMA is a hybrid UNet architecture merging CNNs with scalable Mamba-like attention (SEMA) that achieves better efficiency than transformers and superior segmentation accuracy than pure CNN or Mamba models across medi...

Style-Based Neural Architectures for Real-Time Weather Classification
cs.CV 2026-04 unverdicted novelty 5.0

Three style-based neural architectures are proposed for real-time weather classification from images, with two truncated ResNet variants claimed to outperform prior methods and generalize across public datasets.

Reversible Residual Normalization Alleviates Spatio-Temporal Distribution Shift
cs.LG 2026-04 unverdicted novelty 5.0

Reversible Residual Normalization (RRN) introduces spatially-aware invertible residual blocks that combine center normalization with spectral-constrained graph convolutions to mitigate spatio-temporal distribution shi...

A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence
cs.LG 2026-04 unverdicted novelty 4.0

A conditional Wasserstein GAN generates plausible future SWI drought trajectories for French insurance risk management under climate change.