EditThinker: Unlocking iterative reasoning for any image editor

Hongyu Li, Manyuan Zhang, Dian Zheng, Ziyu Guo, Yimeng Jia, Kaituo Feng, Hao Yu, Yexin Liu, Yan Feng, Peng Pei, et al. · 2025 · arXiv 2512.05965

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

PhyEditBench: A Real-World Multi-Stage Benchmark for Physics-Aware Image Editing

cs.CV · 2026-06-25 · unverdicted · novelty 7.0

PhyEditBench is a new benchmark for physics-aware image editing with real and synthetic instances plus a training-free PhyWorld baseline that uses test-time scaling to outperform SOTA models.

InterleaveThinker: Reinforcing Agentic Interleaved Generation

cs.CV · 2026-06-11 · unverdicted · novelty 7.0

InterleaveThinker is the first multi-agent pipeline enabling interleaved generation in any image generator through planner-critic agents, SFT on custom datasets, and GRPO RL with accuracy and step-wise rewards.

Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

Uni-Edit introduces a data synthesis pipeline turning VQA data into reasoning-intensive editing instructions, enabling single-task tuning that boosts all three capabilities in models like BAGEL and Janus-Pro.

Editor's Choice: Evaluating Abstract Intent in Image Editing through Atomic Entity Analysis

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

Presents Entity-Rubrics and AbstractEdit benchmark to measure image editing models on abstract intent, finding standard models struggle to balance edit intent with image preservation.

EditRefiner: A Human-Aligned Agentic Framework for Image Editing Refinement

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

EditRefiner uses a perception-reasoning-action-evaluation agent loop and the EditFHF-15K human feedback dataset to refine text-guided image edits more accurately than prior methods.

Gen-Searcher: Reinforcing Agentic Search for Image Generation

cs.CV · 2026-03-30 · unverdicted · novelty 7.0 · 2 refs

Gen-Searcher is the first trained search-augmented image generation agent using SFT followed by GRPO reinforcement learning with dual text-image rewards, delivering 15-16 point gains on knowledge-intensive benchmarks.

SPIRAL: Self-Evolving Action-Conditioned Video Generation via Reflective Planning Agents

cs.CV · 2026-03-09 · unverdicted · novelty 7.0

SPIRAL is a closed-loop think-act-reflect framework using PlanAgent, VideoGenerator, and CriticAgent plus GRPO self-evolution to improve long-horizon action-conditioned video generation, with new dataset and benchmark showing gains over open-loop baselines.

GMO-E²DIT: Grounded Multi-Operation Editing for E-Commerce Images

cs.CV · 2026-07-01 · unverdicted · novelty 6.0 · 2 refs

GMO-E²DIT is an agentic editing framework that decouples VLM-based planning from mask-conditioned rendering and uses reflection to execute multi-operation e-commerce image edits with error recovery.

AesFormer: Transform Everyday Photos into Beautiful Memories

cs.CV · 2026-05-21 · unverdicted · novelty 6.0

AesFormer decouples aesthetic planning from image editing via AesThinker and AesEditor to enable structural reconstruction in photos for better aesthetics.

Latent Action Control for Reasoning-Guided Unified Image Generation

cs.CV · 2026-05-16 · unverdicted · novelty 6.0

Latent Action Control learns unobserved action trajectories via variational alignment and GRPO to inject reasoning into flow-based image generation, yielding gains on compositional benchmarks.

AdaTooler-V: Adaptive Tool-Use for Images and Videos

cs.CV · 2025-12-18 · conditional · novelty 6.0

AdaTooler-V trains MLLMs to adaptively use vision tools via AT-GRPO reinforcement learning and new datasets, reaching 89.8% on V* and outperforming GPT-4o.

ProductConsistency: Improving Product Identity Preservation in Instruction-Based Image Editing via SFT and RL

cs.CV · 2026-06-17 · unverdicted · novelty 5.0

Introduces ProductConsistency dataset, benchmark, and Cyclic Consistency reward to fine-tune image editing models, achieving a 5x reduction in character error rate for product identity preservation.

Making Image Editing Easier via Adaptive Task Reformulation with Agentic Executions

cs.CV · 2026-04-17

citing papers explorer

PhyEditBench: A Real-World Multi-Stage Benchmark for Physics-Aware Image Editing

cs.CV · 2026-06-25 · unverdicted · none · ref 29

InterleaveThinker: Reinforcing Agentic Interleaved Generation

cs.CV · 2026-06-11 · unverdicted · none · ref 23

Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning

cs.CV · 2026-05-20 · unverdicted · none · ref 27

Editor's Choice: Evaluating Abstract Intent in Image Editing through Atomic Entity Analysis

cs.CV · 2026-05-14 · unverdicted · none · ref 37

EditRefiner: A Human-Aligned Agentic Framework for Image Editing Refinement

cs.CV · 2026-05-08 · unverdicted · none · ref 25

Gen-Searcher: Reinforcing Agentic Search for Image Generation

cs.CV · 2026-03-30 · unverdicted · none · ref 17 · 2 links

SPIRAL: Self-Evolving Action-Conditioned Video Generation via Reflective Planning Agents

cs.CV · 2026-03-09 · unverdicted · none · ref 7

GMO-E²DIT: Grounded Multi-Operation Editing for E-Commerce Images

cs.CV · 2026-07-01 · unverdicted · none · ref 22 · 2 links

AesFormer: Transform Everyday Photos into Beautiful Memories

cs.CV · 2026-05-21 · unverdicted · none · ref 9

Latent Action Control for Reasoning-Guided Unified Image Generation

cs.CV · 2026-05-16 · unverdicted · none · ref 18

AdaTooler-V: Adaptive Tool-Use for Images and Videos

cs.CV · 2025-12-18 · conditional · none · ref 30

ProductConsistency: Improving Product Identity Preservation in Instruction-Based Image Editing via SFT and RL

cs.CV · 2026-06-17 · unverdicted · none · ref 15

Making Image Editing Easier via Adaptive Task Reformulation with Agentic Executions

cs.CV · 2026-04-17 · unreviewed · ref 15
