pith. sign in

Gligen: Open-set grounded text-to-image genera- tion

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

fields

cs.CV 2

years

2025 1 2023 1

verdicts

UNVERDICTED 2

roles

background 1

polarities

background 1

representative citing papers

Visual Instruction Tuning

cs.CV · 2023-04-17 · unverdicted · novelty 7.0

LLaVA is trained on GPT-4 generated visual instruction data to achieve 85.1% relative performance to GPT-4 on synthetic multimodal tasks and 92.53% accuracy on Science QA.

citing papers explorer

Showing 2 of 2 citing papers.

  • Visual Instruction Tuning cs.CV · 2023-04-17 · unverdicted · none · ref 30

    LLaVA is trained on GPT-4 generated visual instruction data to achieve 85.1% relative performance to GPT-4 on synthetic multimodal tasks and 92.53% accuracy on Science QA.

  • MaskAttn-SDXL: Controllable Region-Level Text-To-Image Generation cs.CV · 2025-09-18 · unverdicted · none · ref 10

    MaskAttn-SDXL adds token-conditioned spatial gating to SDXL cross-attention to sparsify irrelevant token-to-location bindings and improve region-level controllability without retraining or inference edits.