AutoMedBench evaluates AI agents on long-horizon medical workflows across five stages and, based on thousands of runs, finds validation and submission to be the dominant failure points.
Bovik, Hamid R
Canonical reference. 75% of citing Pith papers cite this work as background.
representative citing papers
On the public ReMIND dataset, a systematic benchmark of six synthesis models across 48 experiments finds LPIPS correlates with downstream segmentation utility while SSIM does not, with SynDiff-2.5D performing best.
DirectorBench is a profile-aware diagnostic benchmark that localizes bottlenecks in long-form video generation workflows using structured checkpoints and multi-agent evaluation.
PanoPlane achieves up to 17.8% PSNR gains in sparse-view indoor novel view synthesis by using training-free plane-aware panoramic completion to supervise 3D Gaussian Splatting.
GuardMarkGS unifies watermarking and adversarial edit deterrence into a single optimization framework for protecting 3D Gaussian Splatting assets.
A new large-scale synthetic multi-task benchmark dataset supplying pixel-perfect depth, domain-shifted night imagery, and multi-scale low-resolution pairs for aerial remote sensing.
MESA restores ancient inscription textures via multi-exemplar style transfer from VGG19 features with per-layer exemplar selection and OCR-derived weights, without any model training.
GeRM learns a distribution transfer vector field via a multi-condition ControlNet to convert physically-based renders into photorealistic images using text prompts and a 50K expert-curated dataset.
LumaFlux is a physically and perceptually guided diffusion transformer for SDR-to-HDR conversion that introduces PGA, PCM, and HDR Residual Coupler modules plus a new training corpus and benchmark, outperforming prior ITM methods.
A sensor-specific calibration pipeline using dark frames produces synthesized noisy RAW images that close 54-64% of the PSNR gap to real noise versus manufacturer profiles, accompanied by the open SNIC dataset of over 6600 paired images.
DRFS is a new inversion-free editing technique for rectified flow models that models source-target velocity discrepancies and applies a time-dependent shift to improve fidelity and unify prior methods like DDS and FlowEdit.
Harder classification tasks produce neural representations whose accuracy collapses under binarization and shuffling while easier tasks remain robust, defining task complexity via the performance gap between full-precision and perturbed networks.
PhotIQA is a new public dataset of 1134 expert-rated photoacoustic images for benchmarking image quality assessment in medical imaging.
Presents SLAM&Render, a robot-recorded benchmark dataset with 40 multi-modal sequences for testing SLAM, novel view synthesis, and Gaussian Splatting under controlled variations in lighting, arrangements, and occlusions.
Proposes a cyclic 2.5D perceptual loss with manufacturer SUVR standardization for T1w MRI to tau PET synthesis, reporting improved regional agreement on ADNI and SCAN cohorts across U-Net, UNETR, SwinUNETR, CycleGAN, and Pix2Pix.
Q-Align trains LMMs on discrete text-defined levels for visual scoring, achieving SOTA on IQA, IAA, and VQA while unifying the tasks in OneAlign.
A PINN framework with separate networks for conductivity and potentials, multiscale wavelet excitations, and FFE recovers dominant conductivity structures from finite DtN data with 3-12% relative error on synthetic tests, with FFE aiding sharp features.
Differential Unfolding replaces uniform stacking in deep unfolding networks with a heterogeneous structure of anchoring and differential evolution stages to achieve better accuracy-efficiency trade-offs in video SCI reconstruction.
Introduces Visibility-Aware Densification with Temporally-Adaptive Thresholding and Temporal Offset Warping to improve dynamic region quality in 3D Gaussian Splatting on three benchmarks.
Scene-adaptive nonlinear tone curves (ASE and AP3) with percentile normalisation and offset outperform linear gain for pseudo-GT generation in low-light 3DGS, delivering PSNR gains up to 4.34 dB on LOM and 3.25 dB on RealX3D across 21 scenes.
A plug-and-play perceptual wrapper using common random noise and Wasserstein Distortion supervision improves texture quality and reduces model size in 3D Gaussian Splatting.
High magnetic fields directly enhance the amplitude and correlation length of stripe order in a cuprate superconductor far above the vortex melting transition, indicating a coupling mechanism independent of superconductivity suppression.
LiFT factorizes 3D medical volume synthesis into per-slice 2D generation and inter-slice trajectory learning, using a tri-planar drifting loss for unconditional coherence and a z-context mixer for paired translation tasks.
MSIQ is a scale-invariant, model-free quality metric for single image super-resolution using normalized central geometric moments for direct comparison of different-resolution images.
citing papers explorer
- StereoGenBench: A Synthetic Multi-Camera Benchmark for Stereo Generation under Controlled Baseline Regimes
StereoGenBench is a new synthetic benchmark dataset featuring calibrated multi-baseline stereo pairs with dense metric depth, intrinsics, and poses from Unreal Engine renders for controlled evaluation of stereo generation.
- UMEDA: Unified Multi-modal Efficient Data Fusion for Privacy-Preserving Graph Federated Learning via Spectral-Gated Attention and Diffusion-Based Operator Alignment
UMEDA is a new graph federated learning method that uses low-rank spectral filtering and diffusion over a shared integral operator to fuse multi-modal data privately, outperforming baselines on MM-Fi and RELI11D under high heterogeneity and tight privacy budgets.
- Physics-Guided Deep Learning For High Resolution X-ray Imaging
Physics-guided U-Net removes non-stationary artifacts from X-ray images, raising mean SSIM from 0.345 to 0.906 and from 0.0679 to 0.945 in synthetic tests while preserving filament profiles better than Fourier filtering or DFFN.
- Flow matching for Sentinel-2 super-resolution: implementation, application, and implications
Flow matching achieves single-step pixel accuracy and 20-step perceptual quality for Sentinel-2 super-resolution, outperforming diffusion and Real-ESRGAN while enabling large-scale 2.5 m land-cover products.
- Training-inference input alignment outweighs framework choice in longitudinal retinal image prediction
Training-inference input alignment outweighs framework choice for longitudinal retinal image prediction, with deterministic regression matching complex models when acquisition variability dominates disease progression.
- Multimodal Image Colorization: Quantifying the Impact of Text-Conditioned Guidance on Grayscale-to-Color Translation
Text conditioning improves PSNR by ~5.7%, SSIM by ~1.4%, colorfulness by up to 36.6%, and reduces LPIPS by ~9.5% across U-Net and Stable Diffusion colorization models.
- GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning
GeneralVLA-2 introduces GeoFuse-MV3D for improved multi-view 3D reconstruction and a governed memory system, demonstrating modest gains on 3D object and task benchmarks.
- Contrast-Informed Augmentation and Domain-Adversarial Training for Adult-to-Neonatal MR Reconstruction Generalization
Mixed training with contrast-informed augmentation and domain-adversarial training improves E2E-VarNet performance on neonatal T2-weighted brain MR reconstruction at R=4 and R=8 compared to adult-only training.
- Low-Magnification SEM May Suffice: Interpretable Deep Learning for Multi-Scale Fracture-Cause Classification in Zirconia-Toughened Alumina
A fine-tuned ViT on 8493 SEM images classifies fracture causes in zirconia-toughened alumina at 0.907 accuracy and 0.888 macro-F1, with comparable performance at 50x versus higher magnifications.
- MASQ: Accelerating Masked Diffusion via Stage-Wise Multi-Precision Quantization
MASQ claims up to 16.06x speedup and 4.18x energy gain over A100 for masked diffusion via stage-wise multi-precision quantization and specialized hardware units while preserving quality.
- 7DT Insight: Variability in Young Stellar Objects
Two-epoch medium-band photometry of 769 YSO candidates in Orion A identifies 110 variables (~14%), with best-fit templates dominated by cold and hot spot models over extinction or gray changes.
- Layer Selection in Feature-Based Losses Affects Image Quality and Microstructural Consistency in Deep Learning Super-Resolution of Brain Diffusion MRI
Deeper VGG16 layers in feature losses for diffusion MRI super-resolution introduce persistent grid artifacts in images and anisotropy maps, whereas the shallowest layer preserves consistency with ground truth at high upsampling factors.
- MSDS: Deep Structural Similarity with Multiscale Representation
MSDS computes DeepSSIM at multiple pyramid scales and fuses the scores with learned weights, producing consistent improvements over single-scale DeepSSIM on IQA benchmarks with negligible extra cost.
- A GPU-enhanced workflow for non-Fourier SENSE reconstruction
A public GPU workflow for non-Fourier SENSE MRI reconstruction with sensitivity and off-resonance mapping enables fast, accurate imaging from challenging spiral trajectories.
- Measuring the Transferability of Adversarial Examples
Empirical measurement of adversarial example transferability between VGG and Inception model classes with methodological refinements to attack strength selection, perturbation clipping, and evaluation via SSIM.
- Your Neighbors Know: Leveraging Local Neighborhoods for Backdoor Detection in Decentralized Learning