LatentHDR generates structurally consistent panoramic HDR images by producing one scene latent with a diffusion backbone and then deterministically mapping it to multiple exposure latents via a lightweight conditional head.
U-Net: Convolutional Networks for Biomedical Image Segmentation
Mixed citation behavior. Most common role is background (43%).
abstract
There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
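The contracting and expanding paths described in the abstract can be illustrated with a little size arithmetic. The sketch below is only an illustration and makes assumptions the abstract does not state: a 572-pixel square input, four resolution levels, two unpadded 3x3 convolutions per level, 2x2 pooling on the way down, and 2x up-sampling on the way up. The function name `unet_spatial_sizes` and its parameters are hypothetical.

```python
def unet_spatial_sizes(input_size=572, depth=4, convs_per_level=2, kernel=3):
    """Trace the side length of a square feature map through a U-Net-style network.

    All defaults are assumptions for illustration, not values from the abstract.
    """
    # Each unpadded convolution trims (kernel - 1) pixels from the side length.
    shrink = convs_per_level * (kernel - 1)

    size = input_size
    contracting = []  # side length of the skip feature map at each level
    for _ in range(depth):
        size -= shrink
        contracting.append(size)
        if size % 2:
            raise ValueError(f"odd side length {size} cannot be halved by 2x2 pooling")
        size //= 2  # 2x2 pooling halves the resolution

    size -= shrink  # convolutions at the bottom of the "U"
    bottleneck = size

    expanding = []  # (side length after convs, pixels cropped from the skip map)
    for skip in reversed(contracting):
        size *= 2            # up-sampling doubles the resolution
        crop = skip - size   # the skip map is larger and must be cropped to match
        size -= shrink
        expanding.append((size, crop))

    return contracting, bottleneck, expanding


if __name__ == "__main__":
    down, bottom, up = unet_spatial_sizes()
    print("contracting path:", down)
    print("bottleneck:", bottom)
    print("expanding path (size, crop):", up)
```

Under these assumptions the output map is smaller than the input, which is why large images are typically processed tile by tile with overlapping context.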
representative citing papers
EchoXFlow is a new dataset of 37,125 beamspace echocardiography recordings with separable modalities, Doppler data, ECG, and clinical annotations that enables acquisition-aware learning not possible with standard scan-converted videos.
Influpaint uses generative diffusion models on image-encoded influenza data to produce realistic and diverse epidemic trajectories that match leading ensemble methods in accuracy.
VitaminP uses paired H&E-mIF data to train a model that transfers molecular boundary information, enabling accurate whole-cell segmentation directly from routine H&E histology across 34 cancer types.
A modified DCGAN with an auxiliary discriminator using the membrane factor generates stable, previously unseen funicular shells optimized for pure compression in three dimensions.
A U-Net-based ML pipeline reconstructs the complete phase field and quantized vortex charges in 2D Bose-Einstein condensates from density snapshots alone, using synthetic training data from projected Gross-Pitaevskii simulations.
Dual Triangle Attention achieves effective bidirectional attention with built-in positional inductive bias via dual triangular masks, outperforming standard bidirectional attention on position-sensitive tasks and showing strong masked language modeling results with or without positional embeddings.
Defines diffusion processes on implicit data manifolds via proximity-graph approximations to the infinitesimal generator and carré-du-champ operator, proves convergence in law to the continuous manifold process, and provides an Euler-Maruyama integrator validated on synthetic and MNIST manifolds.
A CNN-based discrete diffusion method refines sparse contours from segmentation masks using simplified denoising steps and minimal post-processing, outperforming baselines on small medical and environmental datasets while running 3.5 times faster.
A diffusion model trained on real radio galaxy images reconstructs high-fidelity interferometric observations from VLA, EHT, and ALMA simulations and outperforms CLEAN on gridded visibilities.
SemanticBridge provides a new 3D dataset for bridge component segmentation and quantifies sensor-induced domain gaps that drop model performance by up to 11.4% mIoU.
Standard visual diffusion models operating in pixel space can approximate solutions to the inscribed square, Steiner tree, and simple polygon problems.
A U-Net GAN reconstructs CMB T and E maps from Planck-like simulations with foregrounds and systematics, achieving under 1% error outside the Galactic region and demonstrating first-time correction for non-circular beams and asymmetric scans.
SinkSAM-Net uses topographic priors and SAM with coordinate-wise bounding box jittering to create pseudo-labels for iterative self-supervised training of an EfficientNetV2-UNet, reaching about 95% of fully supervised performance on sinkhole datasets.
Normalizing flows enable all-order QED corrections in lattice scalar QED in 2-4 dimensions with reduced variance and transferability from small to large lattices.
REPA-P aligns intermediate representations in diffusion models with physical states using first-principles PDE residuals to accelerate convergence and boost out-of-distribution robustness on PDE tasks.
SegRAG is a training-free retrieval-augmented framework that extracts class-specific point prompts from a filtered DINOv3 feature bank to boost SAM3 semantic segmentation performance on standard and agricultural benchmarks.
BTECF encodes retinal vessels as Bézier trees to enable targeted, parameter-level counterfactual interventions on vessel geometry for causal analysis of vascular diseases.
A dual-branch system using frequency edge cues and CLIP-based synthetic patch detection for accurate, resolution-independent image forgery localization.
GeoProto enriches appearance prototypes with geometric offsets from an ordinal shape branch to improve cross-domain few-shot medical image segmentation.
ABLE learns a spatially adaptive Parseval frame from data via an ancillary density to replace fixed bases in spectral neural operators for PDEs.
Implicit score matching trains diffusion models that successfully sample SU(3) Wilson gauge configurations on lattices, with a Hamiltonian-dynamics corrector needed for strong coupling.
Mixing real UAV imagery with 2101 AI-generated image-mask pairs improves semantic segmentation F1 scores for fine-grained forest species by over 15 percentage points overall and up to 30 points for rare classes.
A hybrid CNN-Transformer denoiser trained on synthetic spectra substantially reduces noise and improves stellar population recovery for low-S/N galaxy observations in controlled tests.
citing papers explorer
- Deep Learning for MRI Slice Interpolation: The Critical Role of Problem Formulation
Reformulating the input to adjacent slices for deep learning MRI interpolation yields 58% SSIM gains and 10.1% improvement over linear baseline, with problem formulation outweighing architecture choice.
- Vision Transformer-Conditioned UNet for Domain-Adaptive Semantic Segmentation
ViTC-UNet adapts frozen ViT representations to biomedical semantic segmentation by conditioning a UNet via learnable tokens and two-way attention decoding.
- Scalable Active Metamaterials for Shape-Morphing
A hierarchical SAM framework decouples macroscale mesh optimization from microscale inverse design to enable fast scalable creation of aperiodic shape-morphing metamaterials.
- Full-chip CMP modelling based on Fully Convolutional Network leveraging White Light Interferometry
A fully convolutional network trained separately on WLI and AFM data predicts full-chip post-CMP nanotopography at nanometer accuracy.
- Flow matching for Sentinel-2 super-resolution: implementation, application, and implications
Flow matching achieves single-step pixel accuracy and 20-step perceptual quality for Sentinel-2 super-resolution, outperforming diffusion and Real-ESRGAN while enabling large-scale 2.5 m land-cover products.
- End-to-end Automated Deep Neural Network Optimization for PPG-based Blood Pressure Estimation on Wearables
An end-to-end hardware-aware optimization pipeline produces DNNs for PPG-based blood pressure estimation with up to 7.99% lower error and 83x fewer parameters that fit on ultra-low-power SoCs like GAP8.
- RASALoRE: Region Aware Spatial Attention with Location-based Random Embeddings for Weakly Supervised Anomaly Detection in Brain MRI Scans
A novel weakly supervised anomaly detection method for brain MRI that uses discriminative dual prompt tuning for pseudo masks and region-aware spatial attention with location-based random embeddings to achieve SOTA results with under 8 million parameters on BraTS and MSD datasets.
- Accuracy Improvement of Cell Image Segmentation Using Feedback Former
Feedback Former improves cell image segmentation accuracy by feeding detailed feature maps back from near the output to lower transformer layers, outperforming non-feedback baselines with lower computational cost on three datasets.
- Online Inference and Detection of Curbs in Partially Occluded Scenes with Sparse LIDAR
Real-time deep network approach on 2D LIDAR bird's-eye views for detecting visible and occluded curbs with post-processing tracking.
- The Ethical Dilemma when (not) Setting up Cost-based Decision Rules in Semantic Segmentation
Defining egoistic and altruistic cost functions for class confusions in semantic segmentation changes precision, recall, and segment-wise error rates relative to standard MAP decisions.
- Physics-guided Convolutional Neural Network for Domain Growth Prediction in Systems with Conserved Kinetics
An attention-based physics-guided CNN surrogate is trained to predict long-time microstructural evolution under the Cahn-Hilliard equation for both critical and off-critical mixtures while preserving composition and matching Lifshitz-Slyozov domain growth.
- World Action Models: The Next Frontier in Embodied AI
The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.
- Deep Learning-Based Segmentation of Peritoneal Cancer Index Regions from CT Imaging
nnU-Net segments rPCI regions on 62 CT scans with mean Dice 0.82, nearing inter-observer agreement of 0.88 and beating Swin UNETR at 0.76.
- KAYRA: A Microservice Architecture for AI-Assisted Karyotyping with Cloud and On-Premise Deployment
KAYRA packages a cascade of EfficientNet-B5 + U-Net, Mask R-CNN, and ResNet-18 models into a microservice architecture that supports both cloud and on-premise deployment and reaches 98.91% segmentation accuracy in a pilot test on 459 chromosomes.
- A Deep U-Net Framework for Flood Hazard Mapping Using Hydraulic Simulations of the Wupper Catchment
A U-Net surrogate model trained on hydraulic simulations predicts maximum water levels for flood hazard mapping in the Wupper catchment with results comparable to the original simulations.
- A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence
A conditional Wasserstein GAN generates plausible future SWI drought trajectories for French insurance risk management under climate change.
- Learning to count small and clustered objects with application to bacterial colonies
ACFamNet Pro reaches 9.64% mean normalized absolute error on bacterial colony images under 5-fold cross-validation, beating FamNet by 12.71%.
- AI Approach for MRI-only Full-Spine Vertebral Segmentation and 3D Reconstruction in Paediatric Scoliosis
An AI pipeline using GAN-generated MRI-like images and U-Net segmentation produces automated 3D thoracolumbar spine reconstructions from MRI with 88% Dice score and reduces processing time from 1 hour to under 1 minute while preserving scoliosis deformity features.
- DigiForest: Digital Analytics and Robotics for Sustainable Forestry
DigiForest integrates heterogeneous autonomous robots for data collection, automated tree trait extraction, a decision support system for growth forecasting, and autonomous harvesters for selective logging, with real-world tests in European forests.
- AMO-ENE: Attention-based Multi-Omics Fusion Model for Outcome Prediction in Extra Nodal Extension and HPV-associated Oropharyngeal Cancer
An attention-based fusion model combining semi-supervised CT segmentation, radiomics, and clinical features predicts metastatic recurrence, overall survival, and disease-free survival in HPV+ oropharyngeal cancer with AUCs of 88.2%, 79.2%, and 78.1% on an internal cohort of 397 patients.
- Uncertainty Estimation for Deep Reconstruction in Actuatic Disaster Scenarios with Autonomous Vehicles
Evidential Deep Learning outperforms other methods in accuracy, calibration, and speed for uncertainty-aware scalar field reconstruction in aquatic environments using autonomous vehicles.
- SAGE-GAN: Towards Realistic and Robust Segmentation of Spatially Ordered Nanoparticles via Attention-Guided GANs
SAGE-GAN integrates a self-attention U-Net into a CycleGAN framework to generate realistic synthetic electron microscopy image-mask pairs that augment training data for nanoparticle segmentation without human labeling.
- Optimizing Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation
A U-Net-style CNN trained on synthetic multi-modal grasp data from the Genesis simulator enables a real quadruped robot to navigate to and precisely grasp objects in a loco-manipulation task.
- Wildfire spread forecasting with Deep Learning
A deep learning framework forecasts final wildfire burned area extent from ignition-time data, with an ablation showing that a four-day pre- to five-day post-ignition temporal window improves F1 and IoU by nearly 5% over a single-day baseline on held-out Mediterranean test data.
- Clinical utility of foundation models in musculoskeletal MRI for biomarker fidelity and predictive outcomes
Fine-tuned foundation models produce reliable MSK MRI biomarkers that support workload-reducing triage and calibrated 48-month prediction of knee replacement and incident OA.
- Self-Adaptive 2D-3D Ensemble of Fully Convolutional Networks for Medical Image Segmentation
Self-adaptive 2D-3D FCN ensemble optimized by multiobjective evolution for prostate segmentation on PROMISE12 achieves top-10 ranking with smaller size than prior auto-designed models.
- SAN: Scale-Aware Network for Semantic Segmentation of High-Resolution Aerial Images
SANet adds a re-sampling-based scale-aware module to semantic segmentation networks to better handle inconsistent object scales in aerial images.
- Embedding Non-Distortive Cancelable Face Template Generation
Presents a non-distortive cancelable face template method via targeted image distortion that maintains identity signals for neural embedding models on MNIST and LFW data.
- Blind Deblurring Using GANs
Modifications to GANs using non-local attention blocks, residual connections, combined losses, and edge feedback are proposed and tested for supervised blind image deblurring.
- Machine Learning Techniques for Astrophysics and Cosmology: Lyman-$\alpha$ forest
Review of machine learning applications for analyzing Lyman-alpha forest observations to probe cosmology, reionization, and dark matter.
- Machine Learning as a Transformative Tool for (Exo-)Planetary Science
The paper reviews ML applications for sequence modeling, pattern recognition, and generative Bayesian analysis to tackle heterogeneous data challenges in (exo)planetary science.
- Enhanced Ionization Charge Identification in the Short-Baseline Neutrino Program Neutrino Detectors with Deep Neural Networks
- REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations
- StereoPolicy: Improving Robotic Manipulation Policies via Stereo Perception
- TRAS: An Interactive Software for Tracing Tree Ring Cross Sections
- A theory of learning data statistics in diffusion models, from easy to hard