pith. sign in

hub Mixed citations

arXiv preprint arXiv:2410.18514 , year=

Mixed citation behavior. Most common role is background (60%).

16 Pith papers citing it
Background 60% of classified citations

hub tools

citation-role summary

background 3 method 1 other 1

citation-polarity summary

fields

cs.CL 9 cs.LG 7

years

2026 11 2025 5

representative citing papers

Large Language Diffusion Models

cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

Continuous Language Diffusion as a Decoder-Interface Problem

cs.CL · 2026-06-07 · unverdicted · novelty 7.0

Continuous language diffusion works by entering high-margin decoder basins where frozen T5 embeddings recover 93-96% of native decisions and linear readouts reach 97.9% agreement, implying models should be evaluated as representation-decoder systems.

Discrete Langevin-Inspired Posterior Sampling

cs.LG · 2026-05-10 · unverdicted · novelty 7.0

ΔLPS is a gradient-guided discrete posterior sampler for inverse problems that works with masked or uniform discrete diffusion priors and outperforms prior discrete methods on image restoration tasks.

Fixed-Point Masked Generative Modeling

cs.LG · 2026-05-29 · unverdicted · novelty 6.0

FP-MGMs with consistency loss and three-state reuse (CoFRe) reduce parameters by up to 38.8% and improve low-budget perplexity and FID versus standard masked generative models on text and images.

Self-Supervised On-Policy Distillation for Reasoning Language Models

cs.LG · 2026-05-17 · unverdicted · novelty 6.0

SSOPD converts intra-group correct-wrong contrast into process supervision by distilling a teacher distribution from the shortest correct completion into prefixes of the longest wrong completion, improving GRPO on AIME and HMMT benchmarks.

Coupling Models for One-Step Discrete Generation

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

Coupling Models enable single-step discrete sequence generation via learned couplings to Gaussian latents and outperform prior one-step baselines on text perplexity, biological FBD, and image FID metrics.

Continuous Latent Diffusion Language Model

cs.CL · 2026-05-07 · unverdicted · novelty 6.0

Cola DLM proposes a hierarchical latent diffusion model that learns a text-to-latent mapping, fits a global semantic prior in continuous space with a block-causal DiT, and performs conditional decoding, establishing latent prior modeling as an alternative to token-level autoregressive language model

Dream 7B: Diffusion Large Language Models

cs.CL · 2025-08-21 · unverdicted · novelty 6.0

Dream 7B is a 7B diffusion LLM that refines sequences in parallel via denoising and outperforms prior diffusion models on general, mathematical, and coding benchmarks with added flexibility in generation order and quality-speed tradeoffs.

citing papers explorer

Showing 16 of 16 citing papers.