A new large-scale triplet dataset and diffusion transformer model using coarse human masks deliver improved video virtual try-on quality and generalization in challenging real-world conditions.
Alvarez, and Ping Luo
12 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (12)
verdicts: UNVERDICTED (12)
roles: background (3)
polarities: background (3)
representative citing papers
RAIL-BENCH is the first standardized benchmark suite for railway perception with five challenges, real-world datasets, and a novel LineAP metric for rail track detection.
SEM-ROVER generates large multiview-consistent 3D urban driving scenes via semantic-conditioned diffusion on Σ-Voxfield voxel grids with progressive outpainting and deferred rendering.
A RANSAC-based geometric gate routes regions to homography or optical flow warping before SSP fusion, improving mIoU by 4.24-4.91% on synthetic UAVid with only 211K added parameters to frozen backbones.
A joint finite-sample certificate for adaptive selective conformal risk control that treats selected risk as a ratio and couples empirical-Bernstein, Clopper-Pearson, and closeness bounds.
FWAV-Sim is a high-fidelity Unity simulation framework for flapping-wing vehicles that integrates blade-element aerodynamics with bluff-body drag, spatiotemporally correlated fractal turbulence, and realistic IMU/LiDAR/RGB sensor models to support autonomy development.
SegRAG is a training-free retrieval-augmented framework that extracts class-specific point prompts from a filtered DINOv3 feature bank to boost SAM3 semantic segmentation performance on standard and agricultural benchmarks.
VISER is a new visually realistic simulation benchmark for robot manipulation tasks that uses PBR materials and MLLM-assisted asset generation, achieving 0.92 Pearson correlation with real-world policy performance.
Petro-SAM adapts SAM via a Merge Block for polarized views plus multi-scale fusion and color-entropy priors to jointly achieve grain-edge and lithology segmentation in petrographic images.
Presents Instant3D for rapid text/image-to-3D generation via multi-view diffusion plus feed-forward reconstruction, and FastMap for 10x faster structure-from-motion with comparable accuracy.
MMSD and SAMR achieve 99 percent and 99.1 percent average data reduction for traffic images by transmitting segmentation maps, edges, text or semantically masked JPEGs and reconstructing via diffusion or inpainting models.
Stabilized SegFormer-B5 reaches a state-of-the-art 0.4572 mIoU on the original Apple DMS split; an 80/10/10 split reaches 0.5276 mIoU but degrades real-world OOD performance, per qualitative review.
citing papers explorer
- A Joint Finite-Sample Certificate for Adaptive Selective Conformal Risk Control
A joint finite-sample certificate for adaptive selective conformal risk control that treats selected risk as a ratio and couples empirical-Bernstein, Clopper-Pearson, and closeness bounds.