GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems, 30

Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, Sepp Hochreiter · 2017

15 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method: 2 · background: 1

citation-polarity summary

use method: 2 · background: 1

representative citing papers

CoReDiT: Spatial Coherence-Guided Token Pruning and Reconstruction for Efficient Diffusion Transformers

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

CoReDiT reduces self-attention FLOPs in DiTs by up to 55% via linear-time spatial coherence pruning and neighbor-based reconstruction, delivering 1.33x-1.72x speedups with maintained quality.

Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

C-MET transfers emotions from speech to facial video by learning cross-modal semantic vectors with pretrained audio and disentangled expression encoders, yielding 14% higher emotion accuracy on MEAD and CREMA-D even for unseen emotions.

Banana100: Breaking NR-IQA Metrics by 100 Iterative Image Replications with Nano Banana Pro

cs.CV · 2026-04-03 · unverdicted · novelty 7.0

Banana100 dataset shows that none of 21 popular NR-IQA metrics consistently rate images degraded by 100 iterative edits lower than clean originals.

Beyond Ground-Truth: Leveraging Image Quality Priors for Real-World Image Restoration

cs.CV · 2026-03-31 · unverdicted · novelty 7.0

IQPIR uses NR-IQA-derived quality scores to condition a Transformer and dual-branch codebook for perceptually superior real-world image restoration.

Face2Scene: Using Facial Degradation as an Oracle for Diffusion-Based Scene Restoration

cs.CV · 2026-03-17 · unverdicted · novelty 7.0

Face2Scene uses facial restoration as an oracle to derive degradation codes that condition a diffusion model for restoring the entire degraded scene.

From Navigation to Refinement: Revealing the Two-Stage Nature of Flow-based Diffusion Models through Oracle Velocity

cs.LG · 2025-12-02 · conditional · novelty 7.0

Flow matching models follow a two-stage process of navigation across data modes then refinement to nearest samples, revealed by exact computation of the oracle marginal velocity field.

LIFT and PLACE: A Simple, Stable, and Effective Knowledge Distillation Framework for Lightweight Diffusion Models

cs.CV · 2026-05-19 · unverdicted · novelty 6.0 · 2 refs

LIFT decomposes distillation into coarse linear alignment then fine refinement while PLACE adds error-based local adaptation, allowing stable training of 1.3M-parameter students (1.6% teacher size) to FID 15.73 across diffusion and flow models.

GOR-IS: 3D Gaussian Object Removal in the Intrinsic Space

cs.CV · 2026-05-01 · unverdicted · novelty 6.0

GOR-IS removes objects from 3D Gaussian Splatting reconstructions by performing inpainting in an intrinsic decomposition space that explicitly models light transport for consistent global lighting and non-Lambertian surfaces.

Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration

cs.CV · 2026-04-17 · unverdicted · novelty 6.0

TICoE achieves more precise and faithful concept erasure in text-to-image models by collaborating text and image data through a convex manifold and hierarchical learning, outperforming prior methods.

Beyond Voxel 3D Editing: Learning from 3D Masks and Self-Constructed Data

cs.CV · 2026-04-15 · unverdicted · novelty 6.0

BVE framework enables text-guided 3D editing beyond voxel limits by combining self-constructed data, lightweight semantic injection, and annotation-free masking to preserve local invariance.

Differentiable Vector Quantization for Rate-Distortion Optimization of Generative Image Compression

cs.CV · 2026-04-12 · unverdicted · novelty 6.0

RDVQ enables joint rate-distortion optimization for vector-quantized generative image compression via differentiable codebook distribution relaxation and an autoregressive entropy model.

One-to-More: High-Fidelity Training-Free Anomaly Generation with Attention Control

cs.CV · 2026-03-18 · unverdicted · novelty 6.0

O2MAG generates high-fidelity text-guided anomalies from a single image without training by manipulating self-attention in diffusion models with anomaly masks and dual enhancements.

DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

cs.CV · 2025-11-24 · conditional · novelty 6.0

DeCo decouples high- and low-frequency generation in pixel diffusion via a DiT plus lightweight decoder and a frequency-aware flow-matching loss, reaching FID 1.62 at 256x256 and 2.22 at 512x512 on ImageNet while closing the gap to latent diffusion methods.

Decomposing Subject-Driven Image Generation via Intermediate Structural Prediction

cs.CV · 2026-05-20 · unverdicted · novelty 5.0

A two-stage method predicts an intermediate Canny map for structure then renders the image conditioned on appearance and structure, paired with a 100k text-aware dataset, to improve detail preservation in subject-driven generation.

GaussianZoom: Progressive Zoom-in Generative 3D Gaussian Splatting with Geometric and Semantic Guidance

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

GaussianZoom enables high-fidelity extreme zoom-in 3D rendering from low-res inputs via an iterative framework combining geometry-consistent modeling, depth-based super-resolution, VLM detail synthesis, and an expandable continuous Level-of-Detail hierarchy.

citing papers explorer

Showing 15 of 15 citing papers.

CoReDiT: Spatial Coherence-Guided Token Pruning and Reconstruction for Efficient Diffusion Transformers

cs.CV · 2026-05-13 · unverdicted · none · ref 10

Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video

cs.CV · 2026-04-09 · unverdicted · none · ref 22

Banana100: Breaking NR-IQA Metrics by 100 Iterative Image Replications with Nano Banana Pro

cs.CV · 2026-04-03 · unverdicted · none · ref 26

Beyond Ground-Truth: Leveraging Image Quality Priors for Real-World Image Restoration

cs.CV · 2026-03-31 · unverdicted · none · ref 21

Face2Scene: Using Facial Degradation as an Oracle for Diffusion-Based Scene Restoration

cs.CV · 2026-03-17 · unverdicted · none · ref 18

From Navigation to Refinement: Revealing the Two-Stage Nature of Flow-based Diffusion Models through Oracle Velocity

cs.LG · 2025-12-02 · conditional · none · ref 15

LIFT and PLACE: A Simple, Stable, and Effective Knowledge Distillation Framework for Lightweight Diffusion Models

cs.CV · 2026-05-19 · unverdicted · none · ref 10 · 2 links

GOR-IS: 3D Gaussian Object Removal in the Intrinsic Space

cs.CV · 2026-05-01 · unverdicted · none · ref 11

Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration

cs.CV · 2026-04-17 · unverdicted · none · ref 15

Beyond Voxel 3D Editing: Learning from 3D Masks and Self-Constructed Data

cs.CV · 2026-04-15 · unverdicted · none · ref 24

Differentiable Vector Quantization for Rate-Distortion Optimization of Generative Image Compression

cs.CV · 2026-04-12 · unverdicted · none · ref 23

One-to-More: High-Fidelity Training-Free Anomaly Generation with Attention Control

cs.CV · 2026-03-18 · unverdicted · none · ref 15

DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

cs.CV · 2025-11-24 · conditional · none · ref 17

Decomposing Subject-Driven Image Generation via Intermediate Structural Prediction

cs.CV · 2026-05-20 · unverdicted · none · ref 12

GaussianZoom: Progressive Zoom-in Generative 3D Gaussian Splatting with Geometric and Semantic Guidance

cs.CV · 2026-05-18 · unverdicted · none · ref 7
