ALMs unify a pretrained atomistic encoder, an LLM, and denoising diffusion via continuous projectors and staged training, reaching state-of-the-art results on text-conditioned crystal prediction and de novo generation.
Johnson, Jonathan Ho, Daniel Tarlow, and Rianne van den Berg
Mixed citation behavior. Most common role is background (67%).
representative citing papers
TimeROME-DLM enables training-free knowledge editing in masked diffusion language models via temporal causal tracing and low-rank residual edit memory applied at inference time.
Constrained Diffusion for Code (CDC) integrates constraint satisfaction into the reverse denoising process of discrete diffusion models via constraint-aware operators that use optimization and program analysis to steer generation toward feasible programs.
A framework pretrained on authentic binary occlusion masks uses guided sampling and intersection-based partitioning to train diffusion models on incomplete physical observations without zero-query regions.
DCDM replaces positional blocks with learnable semantic chunks via differentiable Chunking Attention, yielding consistent gains over block and unstructured diffusion baselines up to 1.5B parameters.
Introduces the Block-R1 benchmark, the Block-R1-41K dataset, and a conflict score to handle domain-specific optimal block sizes in RL post-training of diffusion LLMs.
SCMDM is a post-training self-conditioning adaptation for masked diffusion models that reduces generative perplexity by nearly 50% on OWT and improves performance on images, molecules, and genomics.
Full-sequence masking in SFT unlocks prompt infilling for masked diffusion language models, producing templates that match or surpass hand-designed ones and transfer across models.
A CNN-based discrete diffusion method refines sparse contours from segmentation masks using simplified denoising steps and minimal post-processing, outperforming baselines on small medical and environmental datasets while running 3.5 times faster.
A prompt-controlled diffusion framework generates class-ratio-targeted synthetic layouts and domain-consistent images that, when mixed with real data, improve segmentation accuracy on long-tailed remote-sensing datasets, especially under domain shift.
Early and late denoising steps in masked diffusion LMs are robust to smaller-model replacement, enabling a 17% FLOPs reduction with a modest loss in generative quality.
Progressive distillation halves sampling steps repeatedly in diffusion models, reaching 4 steps with FID 3.0 on CIFAR-10 from 8192-step samplers.
Diffusion LM matches AR performance on medical VQA, runs 3.5-4.4x faster, and enables bidirectional infilling for interactive radiology report drafting.
R2LM combines causal attention with a reverse Mamba SSM sidecar to supply right-side context in dLLMs, claiming 2.4x-12.9x throughput gains over bidirectional dLLMs and 1.9x-2.9x over AR baselines while matching or exceeding quality.
A narrative survey that catalogs fifty papers on diffusion-based adversarial techniques across text, vision, and vision-language models, proposes a six-class taxonomy of diffusion roles plus a unified five-dimension evaluation framework, and releases a companion catalog.
dVLA-RL models denoising as an MDP to enable RL on dVLAs via trajectory probabilities, reporting 99.7% success on LIBERO and 30.6% gains over SFT on RoboTwin 2.0.
VRCD prioritizes visually complementary positions during parallel decoding in dMLLMs by measuring attention overlap with the new Visual Redundancy Index, yielding accuracy gains over confidence-based baselines on M^3CoT and MMBench.
Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.
Coupling Models enable single-step discrete sequence generation via learned couplings to Gaussian latents and outperform prior one-step baselines on text perplexity, biological FBD, and image FID metrics.
GCCM prevents shortcut collapse in consistency models for graph prediction by using contrastive negative pairs and input feature perturbation, leading to better performance than deterministic baselines.
FRIGID scales a diffusion-based model for de novo molecular structure generation from mass spectra, reaching over 18% top-1 accuracy on MassSpecGym and tripling prior bests on NPLIB1 via large unlabeled training and inference-time fragmentation refinement with log-linear compute scaling.
GraphBSI uses Bayesian Sample Inference as noise-controlled SDEs to generate discrete graphs in one shot, achieving state-of-the-art results on molecular benchmarks Moses and GuacaMol.
Saber improves both speed and accuracy of diffusion language models on code generation by dynamically adjusting unmasking steps and reverting low-confidence tokens via backtracking.
Logit-KL Flow Matching recovers the flow-matching velocity field from conditional likelihood maximization and uses iterative denoise-re-noise sampling to improve perplexity and downstream metrics over prior NAR baselines on text and code tasks.
citing papers explorer
- Discrete Diffusion Language Models for Interactive Radiology Report Drafting
  Diffusion LM matches AR performance on medical VQA, runs 3.5-4.4x faster, and enables bidirectional infilling for interactive radiology report drafting.
- GCCM: Enhancing Generative Graph Prediction via Contrastive Consistency Model
  GCCM prevents shortcut collapse in consistency models for graph prediction by using contrastive negative pairs and input feature perturbation, leading to better performance than deterministic baselines.
- Saber: An Efficient Sampling with Adaptive Acceleration and Backtracking Enhanced Remasking for Diffusion Language Model
  Saber improves both speed and accuracy of diffusion language models on code generation by dynamically adjusting unmasking steps and reverting low-confidence tokens via backtracking.