GMO-E²DIT is an agentic framework that decouples VLM-based edit planning from mask-conditioned rendering using reflection loops for reliable multi-operation e-commerce image editing.
Mcie: Multimodal LLM-driven complex instruction image editing with spatial guidance. arXiv preprint arXiv:2602.07993, 2026.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
GMO-E²DIT: Grounded Multi-Operation Editing for E-Commerce Images