U-net: Convolutional networks for biomedical image segmentation
4 Pith papers cite this work.
fields
cs.CV 4
representative citing papers
PixArt-α matches commercial text-to-image quality with a diffusion transformer trained in 675 A100 GPU days through decomposed training stages, cross-attention text injection, and vision-language model dense captions.
Dilated affinity is jointly predicted with segmentation labels to strengthen features and support efficient label propagation refinement on benchmark datasets.
Mask embedding in cGANs enables realistic 512×512 face image synthesis guided by semantic masks on the CelebA-HQ dataset.
citing papers explorer
- RobuQ: Pushing DiTs to W1.58A2 via Robust Activation Quantization
RobuQ delivers the first stable DiT image generation at W1.58A2 average bits via Hadamard-based robust activation quantization and layer-wise mixed-precision activations.