OmniMamba: Efficient and unified multimodal understanding and generation via state space models. arXiv preprint arXiv:2503.08686, 2025. https://arxiv.org/abs/2503.08686

Jialv Zou, Bencheng Liao, Qian Zhang, Wenyu Liu, Xinggang Wang · 2025 · arXiv 2503.08686

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Ask, Solve, Generate: Self-Evolving Unified Multimodal Understanding and Generation via Self-Consistency Rewards

cs.CV · 2026-06-25 · unverdicted · novelty 6.0

A self-evolving framework with proposer-solver-generator roles, Solver Token Entropy, and multi-scale internal evaluation improves unified LMMs on understanding and generation tasks using only self-derived consistency signals.

Safe Autoregressive Image Generation with Iterative Self-Improving Codebooks

cs.CV · 2026-06-25 · unverdicted · novelty 5.0

Iterative self-improving codebooks enhance safety in autoregressive multimodal models by self-identifying unsafe generations and updating the codebook to eliminate harmful visual token mappings without external feedback.

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

cs.CV · 2026-04-30 · unverdicted · novelty 5.0

Visual generation models are evolving from passive renderers to interactive agentic world modelers, but current systems lack spatial reasoning, temporal consistency, and causal understanding, with evaluations overemphasizing perceptual quality.

citing papers explorer

Showing 3 of 3 citing papers.

Ask, Solve, Generate: Self-Evolving Unified Multimodal Understanding and Generation via Self-Consistency Rewards · cs.CV · 2026-06-25 · unverdicted · none · ref 50
Safe Autoregressive Image Generation with Iterative Self-Improving Codebooks · cs.CV · 2026-06-25 · unverdicted · none · ref 32
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling · cs.CV · 2026-04-30 · unverdicted · none · ref 106
