Title resolution pending
13 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: background 3
Citation-polarity summary: background 3
Representative citing papers
citing papers explorer
- AttentionBender: Manipulating Cross-Attention in Video Diffusion Transformers as a Creative Probe
  AttentionBender applies 2D transforms to cross-attention maps in video diffusion transformers, producing distributed distortions and glitch aesthetics that reveal entangled attention mechanisms while serving as both an XAI probe and creative tool.
- Measurement-Based Quantum Diffusion Models
  Measurement-based quantum diffusion models are introduced to recover pure and mixed quantum states via weak measurements, quantum score matching, and Petz recovery maps with error bounds, bridging to classical stochastic reversals.
- COCO-Inpaint: A Benchmark for Detecting and Localizing Inpainting-Based Image Manipulations
  COCO-Inpaint supplies a large-scale dataset and evaluation protocol focused on inpainting-based image forgeries to benchmark existing detection methods.
- GenSBI: Generative Methods for Simulation-Based Inference in JAX
  GenSBI delivers JAX-native implementations of generative SBI methods with transformer backbones and reports near-ideal calibration scores on standard benchmarks.
- Homodyne Photonic Tensor Processor exceeds 1,000-TOPS
  A homodyne photonic tensor processor using TFLN transmitters and Si/SiN circuits demonstrates 1,000-6,000 TOPS throughput with 6-7 bit accuracy at up to 120 Gbaud/s clock rates.
- Erasure or Erosion? Evaluating Compositional Degradation in Unlearned Text-To-Image Diffusion Models
  Unlearning methods that strongly erase concepts from text-to-image diffusion models consistently degrade performance on attribute binding, spatial reasoning, and counting tasks.
- Generating DDPM-based Samples from Tilted Distributions
  A plug-in estimator for tilted distributions is minimax-optimal, with Wasserstein closeness bounds to the true tilted distribution and TV-accuracy guarantees when running diffusion on the estimated samples.
- Gravitational-Wave Parameter Estimation in non-Gaussian noise using Score-Based Likelihood Characterization
  Score-based diffusion models learn the empirical distribution of real LIGO noise to enable unbiased gravitational-wave parameter estimation under only an additivity assumption.
- FedGMI: Generative Model-Driven Federated Learning for Probabilistic Mixture Inference
  FedGMI applies VAEs as density estimators in federated learning to infer mixture proportions of shared distributions for structured personalization under data heterogeneity.
- I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
  I2VGen-XL applies cascaded diffusion models with a base stage for semantic preservation via hierarchical encoders and a refinement stage for detail and resolution, trained on 35 million text-video and 6 billion text-image pairs.
- A Unified Measure-Theoretic View of Diffusion, Score-Based, and Flow Matching Generative Models
  Diffusion, score-based, and flow matching models are unified as instances of learning time-dependent vector fields inducing marginal distributions governed by continuity and Fokker-Planck equations.
- ModelScope Text-to-Video Technical Report
  ModelScopeT2V is a 1.7-billion-parameter text-to-video model built on Stable Diffusion that adds temporal modeling and outperforms prior methods on three evaluation metrics.
- LIVEditor-14B: Lightning Unified Video Editing via In-Context Sparse Attention