{"total":17,"items":[{"citing_arxiv_id":"2605.20640","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Pareto-Enhanced Portrait Generation: Vision-Aligned Text Supervision for Alignment, Realism, and Aesthetics","primary_cat":"cs.CV","submitted_at":"2026-05-20T02:55:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A feature supervision approach using SigLIP 2 extracts multi-granularity vision-aligned text representations to supervise MM-DiT image branches, pushing the Pareto frontier for portrait generation across alignment, realism, and aesthetics.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19771","ref_index":8,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Beyond Imitation: Learning Safe End-to-End Autonomous Driving from Hard Negatives","primary_cat":"cs.RO","submitted_at":"2026-05-19T12:41:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"BeyondDrive augments imitation learning with synthesized safety-critical negative trajectories and a repulsive loss to improve safety in autonomous driving, reporting 89.7 PDMS on NAVSIMv1 and generalization to other models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16573","ref_index":29,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Wavelet Flow Matching for Multi-Scale Physics Emulation","primary_cat":"cs.LG","submitted_at":"2026-05-15T19:24:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Wavelet Flow Matching emulates multi-scale PDE-governed systems by transporting velocities directly in a hierarchical wavelet representation via U-Net, yielding 
improved long-horizon stability and spectral accuracy on fluid benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12754","ref_index":35,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Constraint-Aware Flow Matching: Decision Aligned End-to-End Training for Constrained Sampling","primary_cat":"cs.LG","submitted_at":"2026-05-12T21:07:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Constraint-Aware Flow Matching integrates constraint projections into the flow matching training objective to align model dynamics with constrained sampling and reduce distributional shift.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.07456","ref_index":76,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Inference-Time Attribute Distribution Alignment for Unconditional Diffusion","primary_cat":"cs.LG","submitted_at":"2026-05-08T09:02:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"An optimal control formulation adds time-dependent perturbations to the reverse diffusion process to match target attribute distributions while preserving sample fidelity.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"genders, and races, mitigating potential biases introduced by pretrained diffusion models without retraining or fine-tuning them. Beyond the standard diffusion paradigm, we also empirically demonstrate the applicability of our method under flow matching in the latent space. Setup: We use a DDIM [39] pretrained by Choi et al. [74] and a pretrained Latent Flow Matching (LFM) [75] model; both models were trained on the FFHQ-256 dataset [76]. 
We consider three attributes in this task: gender, race, and age. For gender, we consider {Female, Male}; for race, we adopt the 4-way classification by Karkkainen and Joo [77]: {Asian, Black, Indian, WMELH}; for age, we partition it into three groups: {Junior, Middle, Senior}. We test with both fairness targets and customized skewed targets for each single attribute and joint attributes."},{"citing_arxiv_id":"2605.06364","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Flow Matching with Arbitrary Auxiliary Paths","primary_cat":"cs.LG","submitted_at":"2026-05-07T14:39:25+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"AuxPath-FM extends flow matching to arbitrary auxiliary distributions while preserving the continuity equation and marginal training objective.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.06124","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"P-Guide: Parameter-Efficient Prior Steering for Single-Pass CFG Inference","primary_cat":"cs.AI","submitted_at":"2026-05-07T12:31:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"P-Guide achieves single-pass classifier-free guidance in flow matching by modulating the initial latent state and is equivalent to standard CFG under a first-order approximation while cutting latency by half.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.03623","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Few-Step Generative Model on Cumulative Flow 
Maps","primary_cat":"cs.LG","submitted_at":"2026-05-05T10:51:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Cumulative flow maps unify few-step generative modeling for diffusion and flow models via cumulative transport and parameterization with minimal changes to time embeddings and objectives.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.06701","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Bi-Lipschitz Autoencoder With Injectivity Guarantee","primary_cat":"cs.LG","submitted_at":"2026-04-08T05:40:00+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"BLAE adds injective regularization via a separation criterion and bi-Lipschitz constraints to guarantee injectivity and geometric preservation in autoencoders, outperforming prior methods on manifold fidelity under sparsity and distribution shifts.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.04487","ref_index":4,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Training-Free Image Editing with Visual Context Integration and Concept Alignment","primary_cat":"cs.CV","submitted_at":"2026-04-06T07:26:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"VicoEdit performs training-free image editing by transforming source images directly with visual context and concept-alignment-guided posterior sampling, outperforming training-based methods.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2603.26357","ref_index":10,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MPDiT: Multi-Patch Global-to-Local 
Transformer Architecture For Efficient Flow Matching and Diffusion Model","primary_cat":"cs.CV","submitted_at":"2026-03-27T12:30:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MPDiT uses a hierarchical multi-patch design in transformers to lower computation in diffusion models by handling coarse global features first then fine local details, plus faster-converging embeddings.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"order numerical solvers [39, 81] and diffusion distillation [42, 43, 54, 77, 78]. Sampling efficiency has been extensively studied, and recent distillation methods [12, 14, 77, 78, 82] can achieve performance comparable to their teacher diffusion models. For training efficiency, recent works have shifted from pixel diffusion [27] to latent diffusion [10, 52], which is considerably faster. However, training diffusion models in latent space still demands substantial computational resources. 
To mitigate this, ongoing research focuses on developing more compact and semantically meaningful variational autoencoders (VAEs) [7, 8, 34, 75] that provide efficient representations and enable faster convergence."},{"citing_arxiv_id":"2601.21831","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Generative Modeling of Discrete Data Using Geometric Latent Subspaces","primary_cat":"stat.ML","submitted_at":"2026-01-29T15:14:15+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A geometric latent-subspace model on Riemannian manifolds of categorical distributions enables low-dimensional generative modeling of discrete data via isometries and geometric PCA for flow matching.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2601.15884","ref_index":32,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Contrast-X: A Multi-Modal Contrast Image Synthesis Benchmark and Universal Modality Flow Matching","primary_cat":"cs.CV","submitted_at":"2026-01-22T11:58:37+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Contrast-X benchmark and FlowMI model enable synthesis of contrast-enhanced images from arbitrary non-contrast modality inputs using multi-modal flow matching.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.24517","ref_index":11,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Physics Priors Offer Useful Accuracy-Carbon Trade-Offs in Spatio-Temporal Forecasting","primary_cat":"cs.LG","submitted_at":"2025-09-29T09:34:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Stronger physics priors in neural networks for 
spatio-temporal shear flow forecasting yield substantially lower training carbon footprints than weak or no priors, though inference savings are less consistent.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2506.02276","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Latent Stochastic Interpolants","primary_cat":"cs.LG","submitted_at":"2025-06-02T21:34:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Latent Stochastic Interpolants jointly optimize encoder-decoder and a latent-space stochastic interpolant using a continuous-time ELBO to transform arbitrary priors into aggregated posteriors.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2502.02514","ref_index":1,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Privacy Attacks on Image AutoRegressive Models","primary_cat":"cs.CV","submitted_at":"2025-02-04T17:33:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Image autoregressive models leak substantially more training data than diffusion models under membership inference, dataset inference with as few as 4 samples, and data extraction attacks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2410.07430","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"EventFlow: Forecasting Temporal Point Processes with Flow Matching","primary_cat":"cs.LG","submitted_at":"2024-10-09T20:57:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"EventFlow applies flow matching to learn joint distributions over event times for temporal point processes, reporting 
20-53% lower forecast error than autoregressive baselines on standard TPP benchmarks with fewer sampling calls.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}