hub Mixed citations
IEEE Transactions on Image Processing 26(5), 2274–2285 (2017)
Mixed citation behavior. Most common role is background (50%).
citing papers explorer
- Rethinking Incompleteness: Formalizing Protocol Divergence and Train-Once Learning for Robust IMVC
  Formalizes incompleteness divergence across missing-data protocols in IMVC and proposes CRAFT, a mask-aware transformer enabling train-once robustness to diverse missing patterns.
- Raising the Ceiling: Better Empirical Fixation Densities for Saliency Benchmarking
  A mixture model with adaptive KDE and per-image cross-validation raises estimated human fixation consistency by 5-15% median log-likelihood and up to 2 AUC points over fixed-bandwidth Gaussian baselines.
- Personalizing Text-to-Image Generation to Individual Taste
  PAMELA provides a multi-user rating dataset and personalized reward model that predicts individual image preferences more accurately than prior population-level aesthetic models.
- LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer
  LucidFlux is a caption-free image restoration method that conditions a Flux.1 diffusion transformer with a dual-branch module from the degraded input and a proxy restoration plus SigLIP semantic features to outperform baselines on synthetic and real-world data.
- Beyond Aesthetics: Quantifying Information Loss in Turbid Scenes
  Introduces the TUB dataset of 1320 real turbid underwater images and PCD metric showing strong correlation with instance segmentation performance where standard metrics fail.
- SENTRY: SAM2-Enhanced Neighbor-Aware and Temporally Reasoned Memory for Visual Tracking
  SENTRY is a plug-and-play module that replaces confidence-based memory writes with neighbor-aware cycle-consistent validation in SAM2 trackers, yielding new zero-shot SOTA results on LaSOT, GOT-10k and other benchmarks.
- Beyond Scalar Rewards by Internalizing Reasoning into Score Distributions
  Z-Reward trains a 27B reasoning teacher VLM on score distributions via GDSO and distills it via RISD into a 9B student, reaching 89.6% and 88.6% human preference accuracy with 41.3% optimization gain over SFT baseline.
- CoVEBench: Can Video Editing Models Handle Complex Instructions?
  CoVEBench is a new benchmark showing that existing text-guided video editing models frequently fail on compositional instructions involving simultaneous subject, action, and camera changes.
- Ultra-High-Definition Image Quality Assessment via Graph Representation Learning
  UHD-GCN-BIQA models structural dependencies among sampled patches via a hybrid kNN graph and residual graph convolutions to achieve competitive PLCC and SRCC with the lowest RMSE on the UHD-IQA benchmark for blind ultra-high-definition image quality assessment.
- Automatic Discovery of Disease Subgroups by Contrasting with Healthy Controls
  Deep UCSL uses a contrastive EM loss on patient-control labels to isolate disease-driven subgroups in medical imaging by suppressing shared healthy variability.
- Multi-Granularity Reasoning for Image Quality Assessment via Attribute-Aware Reinforcement Learning to Rank
  MG-IQA trains vision-language models with attribute-aware RL2R and a multi-dimensional Thurstone reward model to jointly predict overall quality and fine-grained attributes, reporting 2.1% average SRCC gains on eight IQA benchmarks.
- RealLiFe: Real-Time Light Field Reconstruction via Hierarchical Sparse Gradient Descent
  RealLiFe optimizes multi-plane images with HSGD to deliver real-time light field reconstruction from sparse views, claiming 100x speedup over offline methods and 2 dB PSNR gain over online ones.
- Error Correction for Discrete Tomography
  Fewer than d/2 errors in line sums can be corrected in discrete tomography, with the bound shown to be optimal.
- What neurosurgeons need to see: synthetic intra-operative MRI from ultrasound for brain-shift compensation in brain tumour surgery
  End-to-end pipeline uses ResViT-2.5D to synthesize post-resection MRI from ioUS then anchors deformable registration, yielding 5.86 mm TRE on 14 ReMIND subjects while producing an integrated whole-brain volume reflecting intraoperative state.
- Mathematical framework for perception-driven parameter choice in image denoising
  Authors create psychometrically scaled image sets from human tests on denoised photos and provide a HaarPSI threshold for choosing denoising parameters based on perceived similarity.
- Hyperspectral Image Classification via Efficient Global Spectral Supertoken Clustering
  DSCC groups spectrally similar and spatially close pixels into supertokens using multi-criteria distance and soft labels, then classifies at the token level to achieve 0.728 CF1 at 197.75 FPS on WHU-OHS.
- RoomRecon: High-Quality Textured Room Layout Reconstruction on Mobile Devices
  RoomRecon delivers a real-time mobile system for high-quality textured 3D room reconstructions that combines AR-guided imaging with generative AI texturing focused on permanent structures and claims to outperform prior methods in quality and speed.
- DAT: Dual-Aware Adaptive Transmission for Efficient Multimodal LLM Inference in Edge-Cloud Systems
  DAT combines a small-large model cascade with fine-tuning and bandwidth-aware multi-stream transmission to deliver high-accuracy event recognition and low-latency alerts for video streams in edge-cloud systems.
- Multivariate Pointwise Information-Driven Data Sampling and Visualization
  A pointwise multivariate information-driven sampling method generates reduced datasets that preserve statistical associations among variables for effective feature queries and analysis.
- Aesthetic Attributes Assessment of Images
  The paper proposes the Aesthetic Multi-Attribute Network (AMAN) that jointly predicts captions and scores for five aesthetic attributes using a new weakly-labeled dataset created via knowledge transfer.
- Developing a Strong Pre-Trained Base Model for Plant Leaf Disease Classification
  A DenseNet201 base model trained on a constructed plant leaf disease dataset outperforms baselines and enables faster, more robust transfer learning with less data than general models.
- Looking Beyond the Obvious: A Survey on Abstract Concept Recognition for Video Understanding
  A literature survey on abstract concept recognition in videos that catalogs prior tasks and datasets while advocating for foundation models and reuse of decades of community experience.
- Monitoring road infrastructures from satellite images in Greater Maputo
  Object-oriented RGB pixel distribution analysis from satellite images classifies paved versus unpaved roads in Greater Maputo.
- RGB-D image-based Object Detection: from Traditional Methods to Deep Learning Techniques
  A survey of RGB-D object detection from traditional hand-crafted features with machine learning to deep learning techniques.