U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation

Jun Ma, Feifei Li, Bo Wang · 2024 · eess.IV · arXiv 2401.04722

38 Pith papers cite this work. Polarity classification is still indexing.

abstract

Convolutional Neural Networks (CNNs) and Transformers have been the most popular architectures for biomedical image segmentation, but both of them have limited ability to handle long-range dependencies because of inherent locality or computational complexity. To address this challenge, we introduce U-Mamba, a general-purpose network for biomedical image segmentation. Inspired by the State Space Sequence Models (SSMs), a new family of deep sequence models known for their strong capability in handling long sequences, we design a hybrid CNN-SSM block that integrates the local feature extraction power of convolutional layers with the abilities of SSMs for capturing the long-range dependency. Moreover, U-Mamba enjoys a self-configuring mechanism, allowing it to automatically adapt to various datasets without manual intervention. We conduct extensive experiments on four diverse tasks, including the 3D abdominal organ segmentation in CT and MR images, instrument segmentation in endoscopy images, and cell segmentation in microscopy images. The results reveal that U-Mamba outperforms state-of-the-art CNN-based and Transformer-based segmentation networks across all tasks. This opens new avenues for efficient long-range dependency modeling in biomedical image analysis. The code, models, and data are publicly available at https://wanglab.ai/u-mamba.html.
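The hybrid CNN-SSM idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`conv1d_same`, `ssm_scan`, `hybrid_block`) and the scalar parameters `a`, `b`, `c` are hypothetical. It also swaps in a plain, non-selective linear state space recurrence and a 1D convolution over the flattened pixels, where the real network uses learned multi-channel layers. It only shows how a local filter followed by a sequential scan lets one pixel influence distant ones.

```python
from typing import List


def conv1d_same(x: List[float], kernel: List[float]) -> List[float]:
    """Local feature extraction: sliding-window filter (cross-correlation, as in
    deep learning 'convolutions') with zero padding; assumes an odd kernel length."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k)) for i in range(len(x))]


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Toy linear state space recurrence: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t."""
    h = 0.0
    ys = []
    for x_t in x:
        h = a * h + b * x_t  # the state carries information from all earlier positions
        ys.append(c * h)
    return ys


def hybrid_block(image: List[List[float]], kernel: List[float],
                 a: float, b: float, c: float) -> List[List[float]]:
    """Flatten a 2D map to a sequence, apply the local filter, then the scan,
    and reshape back to the original grid."""
    rows, cols = len(image), len(image[0])
    seq = [v for row in image for v in row]
    local = conv1d_same(seq, kernel)
    mixed = ssm_scan(local, a, b, c)
    return [mixed[r * cols:(r + 1) * cols] for r in range(rows)]


# An impulse at the first pixel, passed through an identity kernel, still
# reaches the last pixel through the recurrent state, decaying by a each step.
out = hybrid_block([[1, 0], [0, 0]], [0.0, 1.0, 0.0], a=0.5, b=1.0, c=1.0)
print(out)  # [[1.0, 0.5], [0.25, 0.125]]
```

The convolution only sees a fixed window, while the scan's state summarizes everything before the current position at a cost linear in sequence length. That is the trade-off the abstract contrasts with the locality of CNNs and the computational cost of Transformers.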

citation-role summary

background: 3 · baseline: 1

citation-polarity summary

background: 3 · baseline: 1

representative citing papers

DyABD: The Abdominal Muscle Segmentation in Dynamic MRI Benchmark

cs.CV · 2026-04-25 · conditional · novelty 9.0

DyABD is the first benchmark dataset for abdominal muscle segmentation in dynamic MRIs featuring exercise-induced anatomical changes and pre/post-surgery scans, where existing models achieve an average Dice score of 0.82.

RAM-W600: A Multi-Task Wrist Dataset and Benchmark for Rheumatoid Arthritis

eess.IV · 2025-07-07 · unverdicted · novelty 8.0

Introduces RAM-W600, the first public multi-task dataset of wrist conventional radiographs with instance segmentation annotations and Sharp/van der Heijde bone erosion scores for rheumatoid arthritis research.

RAM-H1200: A Unified Evaluation and Dataset on Hand Radiographs for Rheumatoid Arthritis

cs.CV · 2026-05-07 · unverdicted · novelty 7.0

RAM-H1200 introduces a public dataset of 1,200 hand X-rays with whole-hand bone segmentation, pixel-level bone erosion masks, and joint-level SvdH scores for both erosion and narrowing to enable unified RA analysis.

AG-TAL: Anatomically-Guided Topology-Aware Loss for Multiclass Segmentation of the Circle of Willis Using Large-Scale Multi-Center Datasets

cs.LG · 2026-04-30 · conditional · novelty 7.0

AG-TAL loss improves multiclass Circle of Willis segmentation to 80.85% average Dice with 1-3% gains on small arteries across multi-center datasets by embedding anatomical priors into topology-aware terms.

Camyla: Scaling Autonomous Research in Medical Image Segmentation

cs.AI · 2026-04-12 · unverdicted · novelty 7.0

Camyla autonomously generates research proposals, experiments, and manuscripts in medical image segmentation, outperforming baselines on 24 of 31 recent datasets while producing 40 human-reviewed papers.

Unsupervised Source-Free Ranking of Biomedical Segmentation Models Under Distribution Shift

cs.CV · 2025-03-01 · unverdicted · novelty 7.0

Presents the first unsupervised source-free framework for ranking semantic and instance segmentation models via prediction consistency under perturbations, with rankings correlating to target-domain performance across 2D/3D biomedical tasks.

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

cs.CV · 2024-01-17 · conditional · novelty 7.0

Vim is a bidirectional Mamba vision backbone that outperforms DeiT in accuracy on standard tasks while being substantially faster and more memory-efficient for high-resolution images.

BiSegMamba: Efficient Bidirectional Tri-Oriented Mamba for 3D Medical Image Segmentation

cs.CV · 2026-05-29 · unverdicted · novelty 6.0

BiSegMamba is a bidirectional tri-oriented Mamba architecture that improves performance and reduces FLOPs in 3D medical image segmentation across brain, cardiac, abdominal, and vascular tasks.

Speech-Guided Multimodal Learning for Vocal Tract Segmentation in Real-Time MRI

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

A multimodal training pipeline with phonological bounding-box priors and cross-modal contrastive alignment transfers speech supervision to single-modality rtMRI vocal tract segmentation and outperforms prior methods on two datasets.

MambaPanoptic: A Vision Mamba-based Structured State Space Framework for Panoptic Segmentation

cs.CV · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

MambaPanoptic is a fully Mamba-based panoptic segmentation model that uses MambaFPN for multi-scale features and a QuadMamba kernel generator to outperform PanopticDeepLab and PanopticFCN on Cityscapes and COCO while using fewer parameters than Mask2Former.

EmambaIR: Efficient Visual State Space Model for Event-guided Image Reconstruction

cs.CV · 2026-05-08 · unverdicted · novelty 6.0

EmambaIR is a visual state space model with cross-modal top-k sparse attention and gated SSM components that outperforms prior CNN and ViT methods on event-guided deblurring, deraining, and HDR reconstruction while reducing memory and compute costs.

SAMamba3D: adapting Segment Anything for generalizable 3D segmentation of multiphase pore-scale images

cs.CV · 2026-04-29 · unverdicted · novelty 6.0

SAMamba3D adapts a frozen SAM encoder with Mamba volumetric context and cross-scale features to match or exceed 3D baselines on diverse sandstone and carbonate datasets while reducing case-specific retraining.

CrossPan: A Comprehensive Benchmark for Cross-Sequence Pancreas MRI Segmentation and Generalization

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

CrossPan benchmark shows cross-sequence MRI domain shifts cause pancreas segmentation models to fail catastrophically, establishing sequence generalization as the primary barrier to clinical deployment over center variability or architecture choices.

CloudMamba: An Uncertainty-Guided Dual-Scale Mamba Network for Cloud Detection in Remote Sensing Imagery

cs.CV · 2026-04-08 · unverdicted · novelty 6.0

CloudMamba combines uncertainty-guided refinement with a dual-scale Mamba network to outperform prior methods on cloud segmentation accuracy while maintaining linear computational cost.

Geometrical Cross-Attention and Nonvoid Voxelization for Efficient 3D Medical Image Segmentation

cs.CV · 2026-04-07 · unverdicted · novelty 6.0

GCNV-Net achieves state-of-the-art accuracy on multiple 3D medical segmentation benchmarks while cutting FLOPs by 56% and inference latency by 68% through dynamic nonvoid voxelization and geometric attention.

CHEM: Estimating and Understanding Hallucinations in Deep Learning for Image Processing

cs.CV · 2025-12-10 · unverdicted · novelty 6.0

The paper defines the Conformal Hallucination Estimation Metric (CHEM) that localizes hallucination-prone regions in image reconstruction models via multiscale representations and distribution-free conformal regression.

Differential-UMamba: Rethinking Tumor Segmentation Under Limited Data Scenarios

cs.CV · 2025-07-24 · unverdicted · novelty 6.0

Diff-UMamba combines UNet with Mamba and adds signal differencing for noise reduction, yielding 1-3% segmentation gains on public medical datasets and 4-5% on a small internal lung cancer dataset under limited data conditions.

COMMA: Coordinate-aware Modulated Mamba Network for 3D Dispersed Vessel Segmentation

eess.IV · 2025-03-04 · unverdicted · novelty 6.0

Presents COMMA, a coordinate-aware Mamba network for 3D vessel segmentation that uses global and local branches, along with a new 570-case labeled dataset.

Gated Linear Attention Transformers with Hardware-Efficient Training

cs.LG · 2023-12-11 · unverdicted · novelty 6.0

Gated linear attention Transformers achieve competitive language modeling results with linear-time inference, superior length generalization, and higher training throughput than Mamba.

RadGenome-Anatomy: A Large-Scale Anatomy-Labeled Chest Radiograph Dataset via Physically Grounded Volumetric Projection

cs.CV · 2026-05-17 · unverdicted · novelty 5.0

RadGenome-Anatomy is a large-scale chest radiograph dataset with anatomy labels obtained by projecting 3D CT masks into 2D radiographic space for 210 structures in 25,692 studies.

MHMamba: Multi-Head Mamba for 3D Brain Tumor Segmentation

cs.CV · 2026-05-15 · unverdicted · novelty 5.0

MHMamba combines a U-Net with multi-head Mamba, channel calibration, and adaptive skip fusion to improve 3D brain tumor segmentation accuracy and small-lesion sensitivity on BraTS datasets while retaining linear complexity.

USEMA: a Scalable Efficient Mamba Like Attention for Medical Image Segmentation

cs.CV · 2026-05-11 · unverdicted · novelty 5.0

USEMA is a hybrid UNet architecture merging CNNs with scalable Mamba-like attention (SEMA) that achieves better efficiency than transformers and superior segmentation accuracy than pure CNN or Mamba models across medical imaging modalities.

TopoMamba: Topology-Aware Scanning and Fusion for Segmenting Heterogeneous Medical Visual Media

cs.CV · 2026-04-28 · unverdicted · novelty 5.0

TopoMamba improves medical image segmentation by combining topology-aware diagonal scans with standard cross-scans and a HSIC Gate for efficient fusion, yielding gains on thin and curved targets like the pancreas.

GroupKAN: Efficient Kolmogorov-Arnold Networks via Grouped Spline Modeling

cs.CV · 2025-11-07 · conditional · novelty 5.0

GroupKAN reduces KAN parameter scaling via intra-group spline mappings, delivering 79.80% average IoU (+1.11% over U-KAN) at 47.6% of the parameters on BUSI, GlaS, and CVC datasets.

citing papers explorer

Showing 38 of 38 citing papers.

DyABD: The Abdominal Muscle Segmentation in Dynamic MRI Benchmark

cs.CV · 2026-04-25 · conditional · none · ref 10 · internal anchor

DyABD is the first benchmark dataset for abdominal muscle segmentation in dynamic MRIs featuring exercise-induced anatomical changes and pre/post-surgery scans, where existing models achieve an average Dice score of 0.82.

RAM-W600: A Multi-Task Wrist Dataset and Benchmark for Rheumatoid Arthritis

eess.IV · 2025-07-07 · unverdicted · none · ref 44 · internal anchor

Introduces RAM-W600, the first public multi-task dataset of wrist conventional radiographs with instance segmentation annotations and Sharp/van der Heijde bone erosion scores for rheumatoid arthritis research.

RAM-H1200: A Unified Evaluation and Dataset on Hand Radiographs for Rheumatoid Arthritis

cs.CV · 2026-05-07 · unverdicted · none · ref 39 · internal anchor

RAM-H1200 introduces a public dataset of 1,200 hand X-rays with whole-hand bone segmentation, pixel-level bone erosion masks, and joint-level SvdH scores for both erosion and narrowing to enable unified RA analysis.

AG-TAL: Anatomically-Guided Topology-Aware Loss for Multiclass Segmentation of the Circle of Willis Using Large-Scale Multi-Center Datasets

cs.LG · 2026-04-30 · conditional · none · ref 31 · internal anchor

AG-TAL loss improves multiclass Circle of Willis segmentation to 80.85% average Dice with 1-3% gains on small arteries across multi-center datasets by embedding anatomical priors into topology-aware terms.

Camyla: Scaling Autonomous Research in Medical Image Segmentation

cs.AI · 2026-04-12 · unverdicted · none · ref 48 · internal anchor

Camyla autonomously generates research proposals, experiments, and manuscripts in medical image segmentation, outperforming baselines on 24 of 31 recent datasets while producing 40 human-reviewed papers.

Unsupervised Source-Free Ranking of Biomedical Segmentation Models Under Distribution Shift

cs.CV · 2025-03-01 · unverdicted · none · ref 45 · internal anchor

Presents the first unsupervised source-free framework for ranking semantic and instance segmentation models via prediction consistency under perturbations, with rankings correlating to target-domain performance across 2D/3D biomedical tasks.

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

cs.CV · 2024-01-17 · conditional · none · ref 46 · internal anchor

Vim is a bidirectional Mamba vision backbone that outperforms DeiT in accuracy on standard tasks while being substantially faster and more memory-efficient for high-resolution images.

BiSegMamba: Efficient Bidirectional Tri-Oriented Mamba for 3D Medical Image Segmentation

cs.CV · 2026-05-29 · unverdicted · none · ref 12 · internal anchor

BiSegMamba is a bidirectional tri-oriented Mamba architecture that improves performance and reduces FLOPs in 3D medical image segmentation across brain, cardiac, abdominal, and vascular tasks.

Speech-Guided Multimodal Learning for Vocal Tract Segmentation in Real-Time MRI

cs.CV · 2026-05-18 · unverdicted · none · ref 19 · internal anchor

A multimodal training pipeline with phonological bounding-box priors and cross-modal contrastive alignment transfers speech supervision to single-modality rtMRI vocal tract segmentation and outperforms prior methods on two datasets.

MambaPanoptic: A Vision Mamba-based Structured State Space Framework for Panoptic Segmentation

cs.CV · 2026-05-12 · unverdicted · none · ref 27 · 2 links · internal anchor

MambaPanoptic is a fully Mamba-based panoptic segmentation model that uses MambaFPN for multi-scale features and a QuadMamba kernel generator to outperform PanopticDeepLab and PanopticFCN on Cityscapes and COCO while using fewer parameters than Mask2Former.

EmambaIR: Efficient Visual State Space Model for Event-guided Image Reconstruction

cs.CV · 2026-05-08 · unverdicted · none · ref 29 · internal anchor

EmambaIR is a visual state space model with cross-modal top-k sparse attention and gated SSM components that outperforms prior CNN and ViT methods on event-guided deblurring, deraining, and HDR reconstruction while reducing memory and compute costs.

SAMamba3D: adapting Segment Anything for generalizable 3D segmentation of multiphase pore-scale images

cs.CV · 2026-04-29 · unverdicted · none · ref 44 · internal anchor

SAMamba3D adapts a frozen SAM encoder with Mamba volumetric context and cross-scale features to match or exceed 3D baselines on diverse sandstone and carbonate datasets while reducing case-specific retraining.

CrossPan: A Comprehensive Benchmark for Cross-Sequence Pancreas MRI Segmentation and Generalization

cs.CV · 2026-04-20 · unverdicted · none · ref 2 · internal anchor

CrossPan benchmark shows cross-sequence MRI domain shifts cause pancreas segmentation models to fail catastrophically, establishing sequence generalization as the primary barrier to clinical deployment over center variability or architecture choices.

CloudMamba: An Uncertainty-Guided Dual-Scale Mamba Network for Cloud Detection in Remote Sensing Imagery

cs.CV · 2026-04-08 · unverdicted · none · ref 72 · internal anchor

CloudMamba combines uncertainty-guided refinement with a dual-scale Mamba network to outperform prior methods on cloud segmentation accuracy while maintaining linear computational cost.

Geometrical Cross-Attention and Nonvoid Voxelization for Efficient 3D Medical Image Segmentation

cs.CV · 2026-04-07 · unverdicted · none · ref 36 · internal anchor

GCNV-Net achieves state-of-the-art accuracy on multiple 3D medical segmentation benchmarks while cutting FLOPs by 56% and inference latency by 68% through dynamic nonvoid voxelization and geometric attention.

CHEM: Estimating and Understanding Hallucinations in Deep Learning for Image Processing

cs.CV · 2025-12-10 · unverdicted · none · ref 41 · internal anchor

The paper defines the Conformal Hallucination Estimation Metric (CHEM) that localizes hallucination-prone regions in image reconstruction models via multiscale representations and distribution-free conformal regression.

Differential-UMamba: Rethinking Tumor Segmentation Under Limited Data Scenarios

cs.CV · 2025-07-24 · unverdicted · none · ref 13 · internal anchor

Diff-UMamba combines UNet with Mamba and adds signal differencing for noise reduction, yielding 1-3% segmentation gains on public medical datasets and 4-5% on a small internal lung cancer dataset under limited data conditions.

COMMA: Coordinate-aware Modulated Mamba Network for 3D Dispersed Vessel Segmentation

eess.IV · 2025-03-04 · unverdicted · none · ref 31 · internal anchor

Presents COMMA, a coordinate-aware Mamba network for 3D vessel segmentation that uses global and local branches, along with a new 570-case labeled dataset.

Gated Linear Attention Transformers with Hardware-Efficient Training

cs.LG · 2023-12-11 · unverdicted · none · ref 54 · internal anchor

Gated linear attention Transformers achieve competitive language modeling results with linear-time inference, superior length generalization, and higher training throughput than Mamba.

RadGenome-Anatomy: A Large-Scale Anatomy-Labeled Chest Radiograph Dataset via Physically Grounded Volumetric Projection

cs.CV · 2026-05-17 · unverdicted · none · ref 26 · internal anchor

RadGenome-Anatomy is a large-scale chest radiograph dataset with anatomy labels obtained by projecting 3D CT masks into 2D radiographic space for 210 structures in 25,692 studies.

MHMamba: Multi-Head Mamba for 3D Brain Tumor Segmentation

cs.CV · 2026-05-15 · unverdicted · none · ref 22 · internal anchor

MHMamba combines a U-Net with multi-head Mamba, channel calibration, and adaptive skip fusion to improve 3D brain tumor segmentation accuracy and small-lesion sensitivity on BraTS datasets while retaining linear complexity.

USEMA: a Scalable Efficient Mamba Like Attention for Medical Image Segmentation

cs.CV · 2026-05-11 · unverdicted · none · ref 17 · internal anchor

USEMA is a hybrid UNet architecture merging CNNs with scalable Mamba-like attention (SEMA) that achieves better efficiency than transformers and superior segmentation accuracy than pure CNN or Mamba models across medical imaging modalities.

TopoMamba: Topology-Aware Scanning and Fusion for Segmenting Heterogeneous Medical Visual Media

cs.CV · 2026-04-28 · unverdicted · none · ref 5 · internal anchor

TopoMamba improves medical image segmentation by combining topology-aware diagonal scans with standard cross-scans and a HSIC Gate for efficient fusion, yielding gains on thin and curved targets like the pancreas.

GroupKAN: Efficient Kolmogorov-Arnold Networks via Grouped Spline Modeling

cs.CV · 2025-11-07 · conditional · none · ref 22 · internal anchor

GroupKAN reduces KAN parameter scaling via intra-group spline mappings, delivering 79.80% average IoU (+1.11% over U-KAN) at 47.6% of the parameters on BUSI, GlaS, and CVC datasets.

Dino U-Net: Exploiting High-Fidelity Dense Features from Foundation Models for Medical Image Segmentation

cs.CV · 2025-08-28 · unverdicted · none · ref 22 · internal anchor

Dino U-Net combines a frozen DINOv3 backbone with an adapter and fidelity-aware projection module to achieve state-of-the-art medical image segmentation across seven public datasets.

FADPNet: Frequency-Aware Dual-Path Network for Face Super-Resolution

cs.CV · 2025-06-17 · unverdicted · none · ref 29 · internal anchor

FADPNet decomposes facial features into low- and high-frequency components processed by dedicated Mamba and CNN modules to balance quality and efficiency in face super-resolution.

EventCrab: Harnessing Frame and Point Synergy for Event-based Action Recognition and Beyond

cs.CV · 2024-11-27 · unverdicted · none · ref 29 · internal anchor

EventCrab integrates frame and point networks with a joint representation space, SCL, and Hilbert-scan EPE to improve event-based action recognition by 5-7% on two datasets.

SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge

cs.CV · 2024-07-16 · accept · none · ref 54 · internal anchor

SegSTRONG-C provides a new benchmark where top models reach 0.9394 DSC and 0.9301 NSD on corrupted surgical tool segmentation tests, showing conventional techniques help but calling for more innovative robustness methods.

3DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion

cs.CV · 2024-04-10 · unverdicted · none · ref 21 · internal anchor

3DMambaComplete applies the Mamba model to point cloud completion via hyperpoint generation, spatial spreading, and mesh deformation, claiming better results than prior methods on benchmarks.

EnergyMamba: An Uncertainty-Aware Graph-Enhanced Selective State Space Model for Energy Consumption Prediction

cs.AI · 2026-05-30 · unverdicted · none · ref 28 · internal anchor

EnergyMamba improves energy consumption prediction accuracy by about 5% and uncertainty quantification by about 6% over 15 baselines on four real-world US datasets by combining graph-enhanced Mamba with adaptive sequential conformalized quantile regression.

CoRE: Concept-Reasoning Expansion for Continual Brain Lesion Segmentation

cs.CV · 2026-04-28 · unverdicted · none · ref 34 · internal anchor

CoRE aligns image tokens to a hierarchical concept library to simulate clinical reasoning for expert routing and demand-based growth in continual brain lesion segmentation, achieving SOTA on 12 tasks.

Delving Aleatoric Uncertainty in Medical Image Segmentation via Vision Foundation Models

cs.AI · 2026-04-13 · unverdicted · none · ref 26 · internal anchor

Vision foundation models quantify aleatoric uncertainty via feature diversity and singular value energy to enable uncertainty-aware data filtering and dynamic training optimization for improved medical image segmentation.

Enhancing Medical Image Segmentation via Heat Conduction Equation

cs.CV · 2025-11-05 · unverdicted · none · ref 7 · internal anchor

Hybrid U-Mamba architecture with Heat Conduction Operators achieves DSC of 0.8719 on Abdomen CT dataset by simulating frequency-domain thermal diffusion.

Attention Is not Everything: Efficient Alternatives for Vision

cs.CV · 2026-04-19 · unverdicted · none · ref 44 · internal anchor

A survey that taxonomizes non-Transformer vision models and evaluates their practical trade-offs across efficiency, scalability, and robustness.

A Survey of Mamba

cs.LG · 2024-08-02 · unverdicted · none · ref 131 · internal anchor

The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.

Advancing Intelligent Sequence Modeling: Evolution, Trade-offs, and Applications of State-Space Architectures from S4 to Mamba

cs.LG · 2025-03-22 · unverdicted · none · ref 75 · internal anchor

A survey tracing the evolution of state-space models like S4 and Mamba, their efficiency trade-offs, and applications in NLP, vision, and other domains.

SurgicalMamba: Dual-Path SSD with State Regramming for Online Surgical Phase Recognition

cs.CV · 2026-05-14 · unreviewed · ref 12 · internal anchor

Adaptable Segmentation Pipeline for Diverse Brain Tumors with Radiomic-Guided Subtyping and Lesion-Wise Model Ensemble

cs.CV · 2025-12-16 · unreviewed · ref 18 · internal anchor
