hub Canonical reference

Imitating human behaviour with dif- fusion models

URL https://arxiv · 2023 · arXiv 2301.10677

Canonical reference. 100% of citing Pith papers cite this work as background.

19 Pith papers citing it

Background 100% of classified citations

read on arXiv browse 19 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 7

citation-polarity summary

background 7

representative citing papers

Generative Modeling with Flux Matching

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

Flux Matching generalizes score-based generative modeling by using a weaker objective that admits infinitely many non-conservative vector fields with the data as stationary distribution, enabling new design choices beyond traditional score matching.

Test-time Sparsity for Extreme Fast Action Diffusion

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

Test-time sparsity with a parallel pipeline and omnidirectional feature reuse accelerates action diffusion by 5x to 47.5 Hz while cutting FLOPs 92% with no performance loss.

Aligning Flow Map Policies with Optimal Q-Guidance

cs.LG · 2026-05-12 · unverdicted · novelty 7.0

Flow map policies enable fast one-step inference for flow-based RL policies, and FMQ provides an optimal closed-form Q-guided target for offline-to-online adaptation under trust-region constraints, achieving SOTA performance.

Action Images: End-to-End Policy Learning via Multiview Video Generation

cs.CV · 2026-04-07 · unverdicted · novelty 7.0

Action Images turn robot arm motions into interpretable multiview pixel videos, letting video backbones serve as zero-shot policies for end-to-end robot learning.

Diffusion Policy with Bayesian Expert Selection for Active Multi-Target Tracking

cs.RO · 2026-04-03 · unverdicted · novelty 7.0

A Bayesian expert selection framework with variational Bayesian last layers and lower confidence bounds improves diffusion policies for active multi-target tracking.

RoboDreamer: Learning Compositional World Models for Robot Imagination

cs.RO · 2024-04-18 · unverdicted · novelty 7.0

RoboDreamer factorizes video generation using language primitives to achieve compositional generalization in robot world models, outperforming monolithic baselines on unseen goals in RT-X.

Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning

cs.LG · 2022-08-12 · unverdicted · novelty 7.0

Diffusion-QL uses conditional diffusion models as expressive policies in offline RL by coupling behavior cloning with Q-value maximization, achieving SOTA on most D4RL tasks.

Frequency-Guided Action Diffusion via Sub-Frequency Manifold Traversal

cs.RO · 2026-05-27 · unverdicted · novelty 6.0

FGO guides diffusion policy generation via expanding spectral bands on sub-frequency manifolds to improve action smoothness on 15 robotic manipulation tasks.

Adversarial Dual On-Policy Distillation from Expressive Teacher

cs.LG · 2026-05-26 · unverdicted · novelty 6.0

FA-OPD co-trains a flow-matching teacher and MLP student via adversarial dual on-policy distillation, improving robustness over baselines on six robot benchmarks with noisy or limited demonstrations.

Mechanisms of Misgeneralization in Physical Sequence Modeling

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

Generative sequence models for physical tasks exhibit physical misgeneralization where local prediction errors propagate through physical measurements to distort aggregate distributions over quantities like distance or energy; a data deviation kernel explains and predicts the shifts and supports a内核

OGPO: Sample Efficient Full-Finetuning of Generative Control Policies

cs.LG · 2026-05-04 · unverdicted · novelty 6.0

OGPO enables sample-efficient full-finetuning of generative control policies via off-policy critics and modified PPO, achieving SOTA on robot manipulation tasks while rescuing poorly initialized behavior cloning policies without expert data.

FingerViP: Learning Real-World Dexterous Manipulation with Fingertip Visual Perception

cs.RO · 2026-04-23 · conditional · novelty 6.0

FingerViP equips each finger with a miniature camera and trains a multi-view diffusion policy that achieves 80.8% success on real-world dexterous tasks previously limited by wrist-camera occlusion.

Learning Native Continuation for Action Chunking Flow Policies

cs.RO · 2026-02-13 · unverdicted · novelty 6.0

Legato trains flow-based VLA policies with schedule-shaped action-noise mixtures and randomized conditions to achieve smoother trajectories and ~10% faster task completion than real-time chunking across five real-world manipulation tasks.

Real-Time Execution of Action Chunking Flow Policies

cs.RO · 2025-06-09 · unverdicted · novelty 6.0

Real-time chunking (RTC) allows diffusion- and flow-based action chunking policies to execute smoothly and asynchronously, maintaining high success rates on dynamic tasks even with significant inference latency.

HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

cs.CV · 2025-03-13 · unverdicted · novelty 6.0

HybridVLA unifies diffusion and autoregression in a single VLA model via collaborative training and ensemble to raise robot manipulation success rates by 14% in simulation and 19% in real-world tasks.

CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation

cs.RO · 2024-11-29 · unverdicted · novelty 6.0

CogACT is a new VLA model that uses a conditioned diffusion action transformer to achieve over 35% higher average success rates than OpenVLA in simulation and 55% in real-robot experiments while generalizing to new robots and objects.

Diffusion Policy Policy Optimization

cs.RO · 2024-09-01 · unverdicted · novelty 6.0

DPPO fine-tunes diffusion policies via policy gradients and outperforms prior RL approaches for diffusion policies and PG-tuned alternatives on robot benchmarks while enabling stable training and hardware deployment.

Lagrangian Perturbation Diffusion Steering: Latent Reinforcement Learning for Generative Policies

cs.LG · 2026-05-31 · unverdicted · novelty 5.0

LP-DS improves generative policies for imitation and RL by optimizing latent noise perturbations with a constrained Lagrangian objective, showing up to 25% better returns on manipulation and locomotion tasks.

General Covariant Action Modeling: Constructing Generalized Manifolds via Spatio-Temporal Decoupling

cs.CV · 2026-05-27 · unverdicted · novelty 4.0

GAM framework uses arc-length parameterization for temporal invariance and schema-affine factorization for geometric invariance to build a covariant action manifold integrated into VLA models for improved generalization from sparse data.

citing papers explorer

Showing 19 of 19 citing papers.

Generative Modeling with Flux Matching cs.LG · 2026-05-08 · unverdicted · none · ref 50
Flux Matching generalizes score-based generative modeling by using a weaker objective that admits infinitely many non-conservative vector fields with the data as stationary distribution, enabling new design choices beyond traditional score matching.
Test-time Sparsity for Extreme Fast Action Diffusion cs.CV · 2026-05-13 · unverdicted · none · ref 21
Test-time sparsity with a parallel pipeline and omnidirectional feature reuse accelerates action diffusion by 5x to 47.5 Hz while cutting FLOPs 92% with no performance loss.
Aligning Flow Map Policies with Optimal Q-Guidance cs.LG · 2026-05-12 · unverdicted · none · ref 34
Flow map policies enable fast one-step inference for flow-based RL policies, and FMQ provides an optimal closed-form Q-guided target for offline-to-online adaptation under trust-region constraints, achieving SOTA performance.
Action Images: End-to-End Policy Learning via Multiview Video Generation cs.CV · 2026-04-07 · unverdicted · none · ref 44
Action Images turn robot arm motions into interpretable multiview pixel videos, letting video backbones serve as zero-shot policies for end-to-end robot learning.
Diffusion Policy with Bayesian Expert Selection for Active Multi-Target Tracking cs.RO · 2026-04-03 · unverdicted · none · ref 12
A Bayesian expert selection framework with variational Bayesian last layers and lower confidence bounds improves diffusion policies for active multi-target tracking.
RoboDreamer: Learning Compositional World Models for Robot Imagination cs.RO · 2024-04-18 · unverdicted · none · ref 44
RoboDreamer factorizes video generation using language primitives to achieve compositional generalization in robot world models, outperforming monolithic baselines on unseen goals in RT-X.
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning cs.LG · 2022-08-12 · unverdicted · none · ref 12
Diffusion-QL uses conditional diffusion models as expressive policies in offline RL by coupling behavior cloning with Q-value maximization, achieving SOTA on most D4RL tasks.
Frequency-Guided Action Diffusion via Sub-Frequency Manifold Traversal cs.RO · 2026-05-27 · unverdicted · none · ref 31
FGO guides diffusion policy generation via expanding spectral bands on sub-frequency manifolds to improve action smoothness on 15 robotic manipulation tasks.
Adversarial Dual On-Policy Distillation from Expressive Teacher cs.LG · 2026-05-26 · unverdicted · none · ref 12
FA-OPD co-trains a flow-matching teacher and MLP student via adversarial dual on-policy distillation, improving robustness over baselines on six robot benchmarks with noisy or limited demonstrations.
Mechanisms of Misgeneralization in Physical Sequence Modeling cs.LG · 2026-05-19 · unverdicted · none · ref 103
Generative sequence models for physical tasks exhibit physical misgeneralization where local prediction errors propagate through physical measurements to distort aggregate distributions over quantities like distance or energy; a data deviation kernel explains and predicts the shifts and supports a内核
OGPO: Sample Efficient Full-Finetuning of Generative Control Policies cs.LG · 2026-05-04 · unverdicted · none · ref 16
OGPO enables sample-efficient full-finetuning of generative control policies via off-policy critics and modified PPO, achieving SOTA on robot manipulation tasks while rescuing poorly initialized behavior cloning policies without expert data.
FingerViP: Learning Real-World Dexterous Manipulation with Fingertip Visual Perception cs.RO · 2026-04-23 · conditional · none · ref 51
FingerViP equips each finger with a miniature camera and trains a multi-view diffusion policy that achieves 80.8% success on real-world dexterous tasks previously limited by wrist-camera occlusion.
Learning Native Continuation for Action Chunking Flow Policies cs.RO · 2026-02-13 · unverdicted · none · ref 27
Legato trains flow-based VLA policies with schedule-shaped action-noise mixtures and randomized conditions to achieve smoother trajectories and ~10% faster task completion than real-time chunking across five real-world manipulation tasks.
Real-Time Execution of Action Chunking Flow Policies cs.RO · 2025-06-09 · unverdicted · none · ref 46
Real-time chunking (RTC) allows diffusion- and flow-based action chunking policies to execute smoothly and asynchronously, maintaining high success rates on dynamic tasks even with significant inference latency.
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model cs.CV · 2025-03-13 · unverdicted · none · ref 52
HybridVLA unifies diffusion and autoregression in a single VLA model via collaborative training and ensemble to raise robot manipulation success rates by 14% in simulation and 19% in real-world tasks.
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation cs.RO · 2024-11-29 · unverdicted · none · ref 50
CogACT is a new VLA model that uses a conditioned diffusion action transformer to achieve over 35% higher average success rates than OpenVLA in simulation and 55% in real-robot experiments while generalizing to new robots and objects.
Diffusion Policy Policy Optimization cs.RO · 2024-09-01 · unverdicted · none · ref 66
DPPO fine-tunes diffusion policies via policy gradients and outperforms prior RL approaches for diffusion policies and PG-tuned alternatives on robot benchmarks while enabling stable training and hardware deployment.
Lagrangian Perturbation Diffusion Steering: Latent Reinforcement Learning for Generative Policies cs.LG · 2026-05-31 · unverdicted · none · ref 20
LP-DS improves generative policies for imitation and RL by optimizing latent noise perturbations with a constrained Lagrangian objective, showing up to 25% better returns on manipulation and locomotion tasks.
General Covariant Action Modeling: Constructing Generalized Manifolds via Spatio-Temporal Decoupling cs.CV · 2026-05-27 · unverdicted · none · ref 24
GAM framework uses arc-length parameterization for temporal invariance and schema-affine factorization for geometric invariance to build a covariant action manifold integrated into VLA models for improved generalization from sparse data.

Imitating human behaviour with dif- fusion models

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer