Scalable diffusion models with transformers

Canonical reference. 75% of citing Pith papers cite this work as background.

15 Pith papers citing it
Background 75% of classified citations

citation-role summary

background: 7 · method: 1

representative citing papers

Large Language Diffusion Models

cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that is competitive with autoregressive LLMs such as LLaMA3 8B on in-context learning tasks and surpasses GPT-4o on reversal poem completion.

Beyond Text Prompts: Visual-to-Visual Generation as A Unified Paradigm

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

Proposes V2V-Zero, a training-free framework replacing text conditioning with VLM final-layer hidden states from visual pages, achieving 0.85 on GenEval and 32.7/100 on new Simple-V2V Bench across models including video extension.

Improved Baselines with Representation Autoencoders

cs.CV · 2026-05-18 · conditional · novelty 6.0

RAE v2 reaches gFID 1.06 on ImageNet-256 in 80 epochs by combining multi-layer encoder sums, complementary REPA targets, and free guidance via output reparameterization.

CoreFlow: Low-Rank Matrix Generative Models

cs.LG · 2026-04-27 · unverdicted · novelty 6.0

CoreFlow is a low-rank matrix generative model that trains normalizing flows on shared subspaces to improve efficiency and quality for high-dimensional limited-sample data, including incomplete matrices.

LTX-2: Efficient Joint Audio-Visual Foundation Model

cs.CV · 2026-01-06 · conditional · novelty 5.0

LTX-2 generates high-quality synchronized audiovisual content from text prompts via an asymmetric 14B-video / 5B-audio dual-stream transformer with cross-attention and modality-aware guidance.
