pith. sign in

Anole: An open, autoregressive, native large multimodal models for interleaved image-text generation

11 Pith papers cite this work. Polarity classification is still indexing.

11 Pith papers citing it

citation-role summary

background 1 baseline 1

citation-polarity summary

years

2026 7 2025 4

representative citing papers

ETCHR: Editing To Clarify and Harness Reasoning

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

A decoupled question-conditioned image editor trained via supervised imitation then VLM-reward enhancement improves MLLM visual reasoning Pass@1 by 4.6-5.5 points across models and tasks.

CASCADE: Context-Aware Relaxation for Speculative Image Decoding

cs.CV · 2026-05-08 · unverdicted · novelty 6.0

CASCADE formalizes semantic interchangeability and convergence in target model representations to enable context-aware acceptance relaxation in tree-based speculative decoding, delivering up to 3.6x speedup on text-to-image models without quality loss.

Mull-Tokens: Modality-Agnostic Latent Thinking

cs.CV · 2025-12-11 · unverdicted · novelty 6.0

Mull-Tokens are modality-agnostic latent tokens that enable free-form multimodal thinking and deliver up to 16% gains on spatial reasoning benchmarks.

Illuminating Unified Multimodal Model for Free-form Interleaved Text-Image Generation

cs.CV · 2026-06-29 · unverdicted · novelty 4.0

ILLUME-X is a unified multimodal model that generates free-form interleaved text-image sequences via an expanded data pipeline, progressive self-adaptive training, and ILScore evaluation, claiming outperformance over prior unified models on style transfer, image decomposition, and storytelling.

citing papers explorer

Showing 11 of 11 citing papers.