hub

arXiv preprint arXiv:2511.18822 (2025)

Zhennan Chen, Junwei Zhu, Xu Chen, Jiangning Zhang, Xiaobin Hu, Hanzhen Zhao, Chengjie Wang, Jian Yang, Ying Tai · 2025 · arXiv 2511.18822

12 Pith papers cite this work. Polarity classification is still indexing.

12 Pith papers citing it

read on arXiv browse 12 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

JAVEDIT: Joint Audio-Visual Instruction-Guided Video Editing with Agentic Data Curation

cs.CV · 2026-06-02 · unverdicted · novelty 7.0

JAVEdit-100k is the first large-scale dataset for instruction-guided joint audio-visual video editing, accompanied by JAVEditBench and the JAVEdit model that outperforms baselines on five of six metrics.

VINS-120K: Ultra High-Resolution Image Editing with A Large-Scale Dataset

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

VINS-120K supplies the first large-scale set of instruction-image-edited-image triplets at ultra-high resolution together with an adaptation strategy that improves detail synthesis.

From Zero to Detail: A Progressive Spectral Decoupling Paradigm for UHD Image Restoration with New Benchmark

cs.CV · 2026-04-17 · unverdicted · novelty 7.0

A new framework called ERR decomposes UHD image restoration into three frequency stages with specialized sub-networks and introduces the LSUHDIR benchmark dataset of over 82,000 images.

RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details

cs.CV · 2026-04-08 · unverdicted · novelty 7.0

RefineAnything is a multimodal diffusion model using Focus-and-Refine crop-and-resize with blended paste-back to achieve high-fidelity local image refinement and near-perfect background preservation.

PixelU: A U-Shaped Transformer for Efficient End-to-End Pixel Diffusion

cs.CV · 2026-06-26 · unverdicted · novelty 6.0

PixelU is a minimalist U-shaped Diffusion Transformer for pixel-space diffusion that decouples frequencies with zero-cost skip connections and constant-channel downsampling, outperforming baselines like JiT-G at 1/3 the compute cost with FID 1.63 on ImageNet 256x256.

RaPD: Resolution-Agnostic Pixel Diffusion via Semantics-Enriched Implicit Representations

cs.CV · 2026-05-15 · unverdicted · novelty 6.0

RaPD enables resolution-agnostic image generation by diffusing in a semantics-enriched continuous Neural Image Field latent space using semantic guidance and a coordinate-queried attention renderer.

PixIE: Prompted Pixel-Space Low-Light Image Enhancement

cs.CV · 2026-05-22 · unverdicted · novelty 5.0

PixIE proposes a feed-forward pixel-space low-light image enhancement network using DINO-prompted pixel blocks, spatial-channel compaction, and multi-receptive-field embeddings, claiming 1.9-15.0% PSNR gains and 8.5-44.4% LPIPS reductions on benchmarks.

PixVerve: Advancing Native UHR Image Generation to 100MP with a Large-Scale High-Quality Dataset

cs.CV · 2026-05-19 · unverdicted · novelty 5.0

PixVerve introduces a 95K ultra-high-resolution image-text dataset and training strategies that enable native 100-megapixel text-to-image generation together with a new evaluation benchmark.

FrequencyBooster: Full-Frequency Modeling for High-Fidelity Pixel Diffusion

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

FrequencyBooster reports state-of-the-art FID scores of 1.60 at 256x256 and 1.69 at 512x512 for pixel diffusion by using a specialized decoder for full-frequency modeling.

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

cs.CV · 2026-05-12 · unverdicted · novelty 5.0

SenseNova-U1 presents native unified multimodal models that match top understanding VLMs while delivering strong performance in image generation, infographics, and interleaved tasks via the NEO-unify architecture.

UniCSG: Unified High-Fidelity Content-Constrained Style-Driven Generation via Staged Semantic and Frequency Disentanglement

cs.CV · 2026-04-20 · unverdicted · novelty 5.0

UniCSG adds staged semantic disentanglement and frequency-aware reconstruction to DiT diffusion models to improve content preservation and style fidelity in both text- and reference-guided generation.

HyperDiT: Hyper-Connected Transformers for High-Fidelity Pixel-Space Diffusion

cs.CV · 2026-05-15 · unverdicted · novelty 4.0 · 2 refs

HyperDiT reports FID 1.56 on ImageNet 256x256 using hyper-connected cross-scale attention, SA-RoPE, and VFM registers in pixel space.

citing papers explorer

Showing 4 of 4 citing papers after filters.

From Zero to Detail: A Progressive Spectral Decoupling Paradigm for UHD Image Restoration with New Benchmark cs.CV · 2026-04-17 · unverdicted · none · ref 73
A new framework called ERR decomposes UHD image restoration into three frequency stages with specialized sub-networks and introduces the LSUHDIR benchmark dataset of over 82,000 images.
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details cs.CV · 2026-04-08 · unverdicted · none · ref 9
RefineAnything is a multimodal diffusion model using Focus-and-Refine crop-and-resize with blended paste-back to achieve high-fidelity local image refinement and near-perfect background preservation.
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture cs.CV · 2026-05-12 · unverdicted · none · ref 20
SenseNova-U1 presents native unified multimodal models that match top understanding VLMs while delivering strong performance in image generation, infographics, and interleaved tasks via the NEO-unify architecture.
UniCSG: Unified High-Fidelity Content-Constrained Style-Driven Generation via Staged Semantic and Frequency Disentanglement cs.CV · 2026-04-20 · unverdicted · none · ref 3
UniCSG adds staged semantic disentanglement and frequency-aware reconstruction to DiT diffusion models to improve content preservation and style fidelity in both text- and reference-guided generation.

arXiv preprint arXiv:2511.18822 (2025)

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer