2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 2
verdicts
UNVERDICTED 2
representative citing papers
citing papers explorer
- Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
  Aligning noisy hidden states in diffusion transformers to clean features from pretrained visual encoders speeds up training over 17x and reaches FID 1.42.
- Every Subtlety Counts: Fine-grained Person Independence Micro-Action Recognition via Distributionally Robust Optimization
  A Person Independence Universal Micro-action Recognition Framework combines Distributionally Robust Optimization with temporal-frequency alignment at the feature level and group-invariant regularization at the loss level to improve generalization across persons on the MA-52 dataset.