Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
6 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (6)
years: 2026 (6)
verdicts: UNVERDICTED (6)
citing papers explorer
- Multi-Task Crack Foundation Model for Engineering-Reliable Crack Representation and Topology Preservation in Civil Infrastructure
CrackGeoFM is a multi-task framework that adapts a frozen visual foundation model with FCEM, CFAM, and SMTD modules for crack mask prediction, skeleton reconstruction, and uncertainty estimation, reporting SOTA results across 20 datasets including few-shot settings.
- EAPFusion: Intrinsic Evolving Auxiliary Prior Guidance for Infrared and Visible Image Fusion
EAPFusion uses self-evolving intrinsic priors to produce dynamic, scene-adaptive convolution kernels and channel-mixing fusion for infrared-visible images, reporting state-of-the-art results and downstream gains.
- From Local Training to Large-Scale Mapping: A Comparative Assessment of Machine Learning and Deep Learning for Transferable Satellite-Derived Bathymetry
Deep CNNs with spatial continuity preservation and a new weighted loss function outperform Random Forest in cross-regional transfer for satellite-derived bathymetry, achieving low RMSE on independent tests and a public benchmark.
- Hierarchical Awareness Adapters with Hybrid Pyramid Feature Fusion for Dense Depth Prediction
A multilevel perceptual CRF model using Swin Transformer, HPF fusion, HA adapters, and dynamic scaling attention achieves state-of-the-art monocular depth estimation on NYU Depth v2, KITTI, and MatterPort3D with reduced error and fast inference.
- Multi-task learning on partially labeled datasets via invariant/equivariant semi-supervised learning
Invariant and equivariant semi-supervised learning improves multi-task detection and segmentation performance on partially labeled vision datasets compared to supervised baselines.
- Ingredient-Level Food Image Segmentation for Nutrition Awareness
Fine-tunes SegFormer-B0 and SegFormer-B1 on FoodSeg103 for ingredient segmentation, reporting mIoU of 0.2521 and 0.3204 respectively, then derives ingredient area percentages for nutrition awareness.