In: ECCV (2022),https://arxiv.org/abs/2112.12750

Mu, N · 2022 · arXiv 2112.12750

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

representative citing papers

Bottleneck Tokens for Unified Multimodal Retrieval

cs.LG · 2026-04-13 · unverdicted · novelty 7.0

Bottleneck Tokens paired with a masked generative objective achieve state-of-the-art unified multimodal retrieval performance among 2B-scale models on the MMEB-V2 benchmark with 78 datasets.

Hierarchical Text-Conditional Image Generation with CLIP Latents

cs.CV · 2022-04-13 · accept · novelty 7.0

A hierarchical prior-decoder model using CLIP latents generates more diverse text-conditional images than direct methods while preserving photorealism and caption fidelity.

citing papers explorer

Showing 2 of 2 citing papers.

Bottleneck Tokens for Unified Multimodal Retrieval cs.LG · 2026-04-13 · unverdicted · none · ref 17
Bottleneck Tokens paired with a masked generative objective achieve state-of-the-art unified multimodal retrieval performance among 2B-scale models on the MMEB-V2 benchmark with 78 datasets.
Hierarchical Text-Conditional Image Generation with CLIP Latents cs.CV · 2022-04-13 · accept · none · ref 31
A hierarchical prior-decoder model using CLIP latents generates more diverse text-conditional images than direct methods while preserving photorealism and caption fidelity.

In: ECCV (2022),https://arxiv.org/abs/2112.12750

fields

years

verdicts

representative citing papers

citing papers explorer