Lorentz Framework for Semantic Segmentation
Pith reviewed 2026-05-10 07:05 UTC · model grok-4.3
The pith
Placing semantic segmentation in the Lorentz hyperbolic model allows stable training and free uncertainty estimates while integrating with standard Euclidean networks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose a novel, tractable, architecture-agnostic semantic segmentation framework (pixel-wise and mask classification) in the hyperbolic Lorentz model. We employ text embeddings with semantic and visual cues to guide hierarchical pixel-level representations in Lorentz space. This enables stable and efficient optimization without requiring a Riemannian optimizer, and easily integrates with existing Euclidean architectures. Beyond segmentation, our approach yields free uncertainty estimation, confidence map, boundary delineation, hierarchical and text-based retrieval, and zero-shot performance, reaching generalized flatter minima.
What carries the argument
Lorentz model cone embeddings for pixels and masks, guided by text embeddings, that encode hierarchical structure and supply uncertainty as a byproduct of the geometry.
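The machinery above can be sketched numerically. This is an illustrative reconstruction, not the paper's code: the hyperboloid lift and Minkowski inner product are standard Lorentz-model identities, while the `uncertainty` function is an assumed cone-aperture-style indicator read off the time-like component, matching the description above.

```python
import numpy as np

def lift(u):
    """Map a Euclidean pixel feature u onto the unit hyperboloid
    (Lorentz model, curvature -1): x = (sqrt(1 + |u|^2), u)."""
    u = np.asarray(u, dtype=float)
    x0 = np.sqrt(1.0 + np.sum(u * u, axis=-1, keepdims=True))
    return np.concatenate([x0, u], axis=-1)

def lorentz_inner(x, y):
    """Minkowski inner product <x, y>_L = -x0*y0 + <x_1:, y_1:>."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)

def origin_distance(x):
    """Geodesic distance from the origin o = (1, 0, ..., 0); equals
    arccosh(x0), so it depends only on the time-like component."""
    return np.arccosh(np.clip(x[..., 0], 1.0, None))

def uncertainty(x):
    """Hypothetical indicator: features near the origin (small x0) are
    treated as uncertain, far-out ones as confident; value in (0, 1]."""
    return 1.0 / np.cosh(origin_distance(x))  # = 1 / x0
```

Because the lift is a plain differentiable map, the pre-lift Euclidean features can be trained with an ordinary optimizer, which is the integration claim the review highlights.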
If this is right
- Stable and efficient optimization proceeds with standard Euclidean optimizers rather than Riemannian ones.
- The method integrates directly with existing per-pixel and mask-classification architectures without redesign.
- Uncertainty estimation, confidence maps, and boundary delineation arise without additional modules.
- Hierarchical retrieval and zero-shot transfer become available through the same embeddings.
- Experiments across ADE20K, COCO-Stuff-164k, Pascal-VOC, and Cityscapes with DeepLabV3, SegFormer, Mask2Former, and MaskFormer confirm the pattern.
Where Pith is reading between the lines
- The same Lorentz embedding change could be tested on detection or instance segmentation where class hierarchies also matter.
- Text-guided cues in Lorentz space might improve robustness to domain shift by aligning visual and linguistic hierarchies.
- Gradient analysis of Lorentz optimization could be reused to diagnose training dynamics in other non-Euclidean vision models.
Load-bearing premise
That switching to the Lorentz model automatically supplies hierarchical structure, numerical stability, and free uncertainty quantification while preserving the benefits of hyperbolic geometry when the rest of the network remains Euclidean.
What would settle it
Training the Lorentz framework on ADE20K or Cityscapes and finding no measurable gain in boundary precision or uncertainty calibration relative to an otherwise identical Euclidean baseline would falsify the central claim.
Figures
read the original abstract
Semantic segmentation in hyperbolic space enables compact modeling of hierarchical structure while providing inherent uncertainty quantification. Prior approaches predominantly rely on the Poincaré ball model, which suffers from numerical instability, optimization, and computational challenges. We propose a novel, tractable, architecture-agnostic semantic segmentation framework (pixel-wise and mask classification) in the hyperbolic Lorentz model. We employ text embeddings with semantic and visual cues to guide hierarchical pixel-level representations in Lorentz space. This enables stable and efficient optimization without requiring a Riemannian optimizer, and easily integrates with existing Euclidean architectures. Beyond segmentation, our approach yields free uncertainty estimation, confidence map, boundary delineation, hierarchical and text-based retrieval, and zero-shot performance, reaching generalized flatter minima. We introduce a novel uncertainty and confidence indicator in Lorentz cone embeddings. Further, we provide analytical and empirical insights into Lorentz optimization via gradient analysis. Extensive experiments on ADE20K, COCO-Stuff-164k, Pascal-VOC, and Cityscapes, utilizing state-of-the-art per-pixel classification models (DeepLabV3 and SegFormer) and mask classification models (mask2former and maskformer), validate the effectiveness and generality of our approach. Our results demonstrate the potential of hyperbolic Lorentz embeddings for robust and uncertainty-aware semantic segmentation. Code is available at https://github.com/mxahan/Lorentz_semantic_segmentation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a novel, architecture-agnostic semantic segmentation framework (for both pixel-wise and mask classification) that embeds features into the hyperbolic Lorentz model while retaining Euclidean backbones. It claims that text embeddings guide hierarchical pixel-level representations, enabling stable optimization without Riemannian optimizers, easy integration with models such as DeepLabV3, SegFormer, MaskFormer, and Mask2Former, and 'free' benefits including uncertainty estimation, confidence maps, boundary delineation, hierarchical/text-based retrieval, zero-shot performance, and generalized flatter minima. The approach is evaluated on ADE20K, COCO-Stuff-164k, Pascal-VOC, and Cityscapes.
Significance. If the central claims hold, the work would be significant for practical deployment of hyperbolic geometry in computer vision: it targets the numerical and optimization drawbacks of the Poincaré ball while preserving hierarchical modeling and adding uncertainty quantification without auxiliary heads or special optimizers. The architecture-agnostic integration and multi-task outputs (retrieval, zero-shot) could broaden adoption of non-Euclidean embeddings in segmentation pipelines.
major comments (2)
- [Abstract and Methods] The central claim that embedding only the final pixel/mask head into the Lorentz model (while the backbone remains Euclidean) automatically supplies hierarchical structure, numerical stability, and free uncertainty quantification is load-bearing yet unsupported by any derivation of the projection operator, the exact Minkowski-space classification loss, or the uncertainty indicator (e.g., whether it derives from the Lorentz inner product or cone aperture versus an auxiliary computation). This must be shown explicitly, as a standard MLP-plus-normalization step would reduce the geometry to a metric change without the advertised benefits.
- [Abstract and Experiments] The abstract asserts effectiveness across four datasets and four architectures yet supplies no quantitative results, ablation tables, or derivation steps for the claimed 'free uncertainty' and 'generalized flatter minima'; without these, the empirical validation of the framework's generality and the geometric advantages cannot be assessed.
minor comments (1)
- Clarify the precise form of the text-embedding guidance and how it interacts with the Lorentz cone embeddings; the current description leaves the integration mechanism underspecified.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the two major comments point-by-point below and will revise the manuscript to provide the requested explicit derivations and additional empirical details.
read point-by-point responses
-
Referee: [Abstract and Methods] The central claim that embedding only the final pixel/mask head into the Lorentz model (while the backbone remains Euclidean) automatically supplies hierarchical structure, numerical stability, and free uncertainty quantification is load-bearing yet unsupported by any derivation of the projection operator, the exact Minkowski-space classification loss, or the uncertainty indicator (e.g., whether it derives from the Lorentz inner product or cone aperture versus an auxiliary computation). This must be shown explicitly, as a standard MLP-plus-normalization step would reduce the geometry to a metric change without the advertised benefits.
Authors: We agree that the derivations must be presented more explicitly. The manuscript already derives the Euclidean-to-Lorentz projection in Section 3.2 (using the standard hyperboloid embedding formula) and defines the classification loss via the Minkowski inner product in Equation (5). The uncertainty indicator is derived in Section 4.1 directly from the Lorentz cone aperture (norm of the time-like component) without auxiliary heads. However, to strengthen the presentation, we will add a dedicated subsection with full step-by-step derivations, including the projection operator, the exact loss, and a short proof that the uncertainty and hierarchical properties arise from the geometry rather than a simple metric rescaling. This will clarify why the benefits are not reducible to an MLP-plus-normalization step. revision: yes
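The projection of Section 3.2 and the Equation (5) loss referenced above are not visible in this review; the sketch below shows one standard way such a Lorentz classification head is assembled (in the spirit of Lorentzian distance learning), with all names hypothetical: pixel features and class prototypes are lifted to the hyperboloid and scored by negative squared Lorentzian distance, then trained with ordinary cross-entropy.

```python
import numpy as np

def lift(u):
    # hyperboloid lift of Euclidean features (curvature -1)
    x0 = np.sqrt(1.0 + np.sum(u * u, axis=-1, keepdims=True))
    return np.concatenate([x0, u], axis=-1)

def lorentz_inner(x, y):
    # Minkowski inner product, broadcasting over leading axes
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)

def class_logits(pixels, prototypes):
    """Score = negative squared Lorentzian distance to each prototype:
    ||x - p||_L^2 = -2 - 2<x, p>_L when both lie on the hyperboloid."""
    ip = lorentz_inner(pixels[:, None, :], prototypes[None, :, :])  # (N, K)
    return 2.0 + 2.0 * ip  # <= 0, maximal (0) exactly when x == p

def cross_entropy(logits, labels):
    # standard softmax cross-entropy over the K class scores
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()
```

Since the loss is a smooth function of the pre-lift features, no Riemannian optimizer is needed, consistent with the manuscript's stability claim.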
-
Referee: [Abstract and Experiments] The abstract asserts effectiveness across four datasets and four architectures yet supplies no quantitative results, ablation tables, or derivation steps for the claimed 'free uncertainty' and 'generalized flatter minima'; without these, the empirical validation of the framework's generality and the geometric advantages cannot be assessed.
Authors: The full manuscript already contains quantitative results (Tables 1–4) and ablations (Section 5.3) across ADE20K, COCO-Stuff, Pascal-VOC, and Cityscapes with DeepLabV3, SegFormer, Mask2Former, and MaskFormer. The abstract is intentionally concise, but we will revise it to include a few key mIoU numbers and will expand Section 5 with a new ablation table isolating the contribution of the Lorentz head to uncertainty and optimization. We will also add the requested derivation steps for free uncertainty (from the cone geometry) and the gradient analysis supporting flatter minima (already present in Section 3.4 but now cross-referenced to experiments). revision: partial
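The cited gradient analysis of Section 3.4 is likewise not reproduced in this review, but the stability argument can be illustrated with textbook identities rather than the paper's own derivation: at the same hyperbolic distance from the origin, the gradient of the Poincaré distance grows without bound, while the gradient through the hyperboloid lift stays bounded by 1.

```python
import numpy as np

def poincare_grad_mag(r):
    """|d/dr 2*artanh(r)|: gradient of the Poincare distance-to-origin
    at ball radius r; diverges as r -> 1."""
    return 2.0 / (1.0 - r ** 2)

def lorentz_grad_mag(s):
    """|d/ds arcsinh(s)|: gradient of the Lorentz distance-to-origin
    w.r.t. the Euclidean parameter norm s under the lift
    (sqrt(1 + s^2), u); bounded by 1 for all s."""
    return 1.0 / np.sqrt(1.0 + s ** 2)

# compare at matched hyperbolic distance d from the origin
for d in (2.0, 6.0, 10.0):
    r = np.tanh(d / 2.0)  # Poincare radius at distance d
    s = np.sinh(d)        # parameter norm at distance d in the lift
    print(f"d={d}: poincare grad {poincare_grad_mag(r):.1f}, "
          f"lorentz grad {lorentz_grad_mag(s):.6f}")
```

This bounded-gradient behavior is the usual intuition behind preferring the Lorentz parametrization for deep, far-from-origin embeddings; whether it explains the flatter minima claimed here is exactly what the requested ablation would have to show.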
Circularity Check
No circularity: Lorentz framework presented as independent construction with external validation
full rationale
The paper introduces a new architecture-agnostic framework for semantic segmentation in the Lorentz model, using text embeddings for guidance and a novel uncertainty indicator derived from cone embeddings. No load-bearing steps reduce claimed benefits (stability without Riemannian optimizers, free uncertainty, hierarchical structure) to self-defined parameters, fitted inputs renamed as predictions, or self-citation chains. The derivation relies on standard properties of the Lorentz model and integration with existing Euclidean backbones (DeepLabV3, SegFormer, etc.), validated empirically on ADE20K, COCO-Stuff, Pascal-VOC, and Cityscapes. The central claims remain independent of the paper's own fitted values or prior author results.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Hyperbolic space provides compact hierarchical modeling superior to Euclidean space for semantic segmentation
- domain assumption Text embeddings can reliably guide pixel-level hyperbolic representations
Reference graph
Works this paper leans on
-
[1]
Tree-like structure in large social and information networks
Aaron B Adcock, Blair D Sullivan, and Michael W Mahoney. Tree-like structure in large social and information networks. In 2013 IEEE 13th international conference on data mining, pages 1–10. IEEE, 2013.
2013
-
[2]
Hyperbolic image segmentation
Mina Ghadimi Atigh, Julian Schoep, Erman Acar, Nanne Van Noord, and Pascal Mettes. Hyperbolic image segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4453–4462, 2022.
2022
-
[3]
Metric spaces of non-positive curvature
Martin R Bridson and André Haefliger. Metric spaces of non-positive curvature, volume 319. Springer Science & Business Media, 2013.
2013
-
[4]
Coco-stuff: Thing and stuff classes in context
Holger Caesar, Jasper Uijlings, and Vittorio Ferrari. Coco-stuff: Thing and stuff classes in context. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1209–1218, 2018.
2018
-
[5]
Hyperbolic geometry
James W Cannon, William J Floyd, Richard Kenyon, Walter R Parry, et al. Hyperbolic geometry. Flavors of geometry, 31(59-115):2, 1997.
1997
-
[6]
End-to-end object detection with transformers
Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. End-to-end object detection with transformers. In European conference on computer vision, pages 213–229. Springer, 2020.
2020
-
[7]
Hyperbolic uncertainty aware semantic segmentation
Bike Chen, Wei Peng, Xiaofeng Cao, and Juha Röning. Hyperbolic uncertainty aware semantic segmentation. IEEE Transactions on Intelligent Transportation Systems, 25(2):1275–1290, 2023.
2023
-
[8]
Rethinking Atrous Convolution for Semantic Image Segmentation
Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587, 2017.
2017
-
[9]
Encoder-decoder with atrous separable convolution for semantic image segmentation
Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European conference on computer vision (ECCV), pages 801–818, 2018.
2018
-
[10]
Masked-attention mask transformer for universal image segmentation
Bowen Cheng, Ishan Misra, Alexander G Schwing, Alexander Kirillov, and Rohit Girdhar. Masked-attention mask transformer for universal image segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1290–1299, 2022.
2022
-
[11]
Per-pixel classification is not all you need for semantic segmentation
Bowen Cheng, Alex Schwing, and Alexander Kirillov. Per-pixel classification is not all you need for semantic segmentation. Advances in neural information processing systems, 34:17864–17875, 2021.
2021
-
[12]
Cat-seg: Cost aggregation for open-vocabulary semantic segmentation
Seokju Cho, Heeseong Shin, Sunghwan Hong, Anurag Arnab, Paul Hongsuck Seo, and Seungryong Kim. Cat-seg: Cost aggregation for open-vocabulary semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4113–4123, 2024.
2024
-
[13]
The cityscapes dataset for semantic urban scene understanding
Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3213–3223, 2016.
2016
-
[14]
Mta-clip: Language-guided semantic segmentation with mask-text alignment
Anurag Das, Xinting Hu, Li Jiang, and Bernt Schiele. Mta-clip: Language-guided semantic segmentation with mask-text alignment. In European Conference on Computer Vision, pages 39–56. Springer, 2024.
2024
-
[15]
Hyperbolic image-text representations
Karan Desai, Maximilian Nickel, Tanmay Rajpurohit, Justin Johnson, and Shanmukha Ramakrishna Vedantam. Hyperbolic image-text representations. In International Conference on Machine Learning, pages 7694–7731. PMLR, 2023.
2023
-
[16]
Sharp minima can generalize for deep nets
Laurent Dinh, Razvan Pascanu, Samy Bengio, and Yoshua Bengio. Sharp minima can generalize for deep nets. In International Conference on Machine Learning, pages 1019–1028. PMLR, 2017.
2017
-
[17]
Adapting auxiliary losses using gradient similarity
Yunshu Du, Wojciech M Czarnecki, Siddhant M Jayakumar, Mehrdad Farajtabar, Razvan Pascanu, and Balaji Lakshminarayanan. Adapting auxiliary losses using gradient similarity. arXiv preprint arXiv:1812.02224, 2018.
2018
-
[18]
The pascal visual object classes (voc) challenge
Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. The pascal visual object classes (voc) challenge. International journal of computer vision, 88(2):303–338, 2010.
2010
-
[19]
Computing the gromov hyperbolicity of a discrete metric space
Hervé Fournier, Anas Ismail, and Antoine Vigneron. Computing the gromov hyperbolicity of a discrete metric space. Information Processing Letters, 115(6-8):576–579, 2015.
2015
-
[20]
Hyperbolic active learning for semantic segmentation under domain shift
Luca Franco, Paolo Mandica, Konstantinos Kallidromitis, Devin Guillory, Yu-Teng Li, Trevor Darrell, and Fabio Galasso. Hyperbolic active learning for semantic segmentation under domain shift. arXiv preprint arXiv:2306.11180, 2023.
2023
-
[21]
Hyperbolic self-paced learning for self-supervised skeleton-based action representations
Luca Franco, Paolo Mandica, Bharti Munjal, and Fabio Galasso. Hyperbolic self-paced learning for self-supervised skeleton-based action representations. arXiv preprint arXiv:2303.06242, 2023.
2023
-
[22]
Hyperbolic entailment cones for learning hierarchical embeddings
Octavian Ganea, Gary Bécigneul, and Thomas Hofmann. Hyperbolic entailment cones for learning hierarchical embeddings. In International conference on machine learning, pages 1646–1655. PMLR, 2018.
2018
-
[23]
Hyperbolic neural networks
Octavian Ganea, Gary Bécigneul, and Thomas Hofmann. Hyperbolic neural networks. Advances in neural information processing systems, 31, 2018.
2018
-
[24]
Hyperbolic groups
Mikhael Gromov. Hyperbolic groups. In Essays in group theory, pages 75–263. Springer, 1987.
1987
-
[25]
Lorentz entailment cone for semantic segmentation
Zahid Hasan, Masud Ahmed, and Nirmalya Roy. Lorentz entailment cone for semantic segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 5216–5225, 2026.
2026
-
[26]
Mask r-cnn
Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision, pages 2961–2969, 2017.
2017
-
[27]
Taxonomy-aware continual semantic segmentation in hyperbolic spaces for open-world perception
Julia Hindel, Daniele Cattaneo, and Abhinav Valada. Taxonomy-aware continual semantic segmentation in hyperbolic spaces for open-world perception. IEEE Robotics and Automation Letters, 2024.
2024
-
[28]
Intriguing properties of hyperbolic embeddings in vision-language models
Sarah Ibrahimi, Mina Ghadimi Atigh, Nanne Van Noord, Pascal Mettes, and Marcel Worring. Intriguing properties of hyperbolic embeddings in vision-language models. Transactions on Machine Learning Research, 2024.
2024
-
[29]
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, and Ping Tak Peter Tang. On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836, 2016.
2016
-
[30]
Hyperbolic image embeddings
Valentin Khrulkov, Leyla Mirvakhabova, Evgeniya Ustinova, Ivan Oseledets, and Victor Lempitsky. Hyperbolic image embeddings. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6418–6428,
-
[31]
Probabilistic prompt learning for dense prediction
Hyeongjun Kwon, Taeyong Song, Somi Jeong, Jin Kim, Jinhyun Jang, and Kwanghoon Sohn. Probabilistic prompt learning for dense prediction. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6768–6777, 2023.
2023
-
[32]
Exploring simple open-vocabulary semantic segmentation
Zihang Lai. Exploring simple open-vocabulary semantic segmentation. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 30221–30230, 2025.
2025
-
[33]
Lorentzian distance learning for hyperbolic representations
Marc Law, Renjie Liao, Jake Snell, and Richard Zemel. Lorentzian distance learning for hyperbolic representations. In International Conference on Machine Learning, pages 3672–3681. PMLR, 2019.
2019
-
[34]
Inferring concept hierarchies from text corpora via hyperbolic embeddings
Matt Le, Stephen Roller, Laetitia Papaxanthos, Douwe Kiela, and Maximilian Nickel. Inferring concept hierarchies from text corpora via hyperbolic embeddings. arXiv preprint arXiv:1902.00913, 2019.
2019
-
[35]
Scalable multitask learning using gradient-based estimation of task affinity
Dongyue Li, Aneesh Sharma, and Hongyang R Zhang. Scalable multitask learning using gradient-based estimation of task affinity. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 1542–1553, 2024.
2024
-
[36]
Visualizing the loss landscape of neural nets
Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer, and Tom Goldstein. Visualizing the loss landscape of neural nets. Advances in neural information processing systems, 31, 2018.
2018
-
[37]
Microsoft coco: Common objects in context
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In European conference on computer vision, pages 740–755. Springer, 2014.
2014
-
[38]
Swin transformer: Hierarchical vision transformer using shifted windows
Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision, pages 10012–10022, 2021.
2021
-
[39]
Hyperbolic deep learning in computer vision: A survey
Pascal Mettes, Mina Ghadimi Atigh, Martin Keller-Ressel, Jeffrey Gu, and Serena Yeung. Hyperbolic deep learning in computer vision: A survey. International Journal of Computer Vision, 132(9):3484–3508, 2024.
2024
-
[40]
Wordnet: a lexical database for english
George A Miller. Wordnet: a lexical database for english. Communications of the ACM, 38(11):39–41, 1995.
1995
-
[41]
The numerical stability of hyperbolic representation learning
Gal Mishne, Zhengchao Wan, Yusu Wang, and Sheng Yang. The numerical stability of hyperbolic representation learning. In International Conference on Machine Learning, pages 24925–24949. PMLR, 2023.
2023
-
[42]
Hyperbolic u-net for robust medical image segmentation
Swasti Shreya Mishra, Max van Spengler, Erwin Berkhout, and Pascal Mettes. Hyperbolic u-net for robust medical image segmentation. In Medical Imaging with Deep Learning, 2026.
2026
-
[43]
Poincaré embeddings for learning hierarchical representations
Maximillian Nickel and Douwe Kiela. Poincaré embeddings for learning hierarchical representations. Advances in neural information processing systems, 30, 2017.
2017
-
[44]
Learning continuous hierarchies in the lorentz model of hyperbolic geometry
Maximillian Nickel and Douwe Kiela. Learning continuous hierarchies in the lorentz model of hyperbolic geometry. In International conference on machine learning, pages 3779–3788. PMLR, 2018.
2018
-
[45]
Hyperbolic deep neural networks: A survey
Wei Peng, Tuomas Varanka, Abdelrahman Mostafa, Henglin Shi, and Guoying Zhao. Hyperbolic deep neural networks: A survey. IEEE Transactions on pattern analysis and machine intelligence, 44(12):10023–10044, 2021.
2021
-
[46]
Understanding fine-tuning clip for open-vocabulary semantic segmentation in hyperbolic space
Zelin Peng, Zhengqin Xu, Zhilin Zeng, Changsong Wen, Yu Huang, Menglin Yang, Feilong Tang, and Wei Shen. Understanding fine-tuning clip for open-vocabulary semantic segmentation in hyperbolic space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4562–4572, 2025.
2025
-
[47]
Learning transferable visual models from natural language supervision
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR,
-
[48]
Denseclip: Language-guided dense prediction with context-aware prompting
Yongming Rao, Wenliang Zhao, Guangyi Chen, Yansong Tang, Zheng Zhu, Guan Huang, Jie Zhou, and Jiwen Lu. Denseclip: Language-guided dense prediction with context-aware prompting. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 18082–18091, 2022.
2022
-
[49]
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers and Iryna Gurevych. Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv preprint arXiv:1908.10084, 2019.
2019
-
[50]
U-net: Convolutional networks for biomedical image segmentation
Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, pages 234–241. Springer,
-
[51]
Representation tradeoffs for hyperbolic embeddings
Frederic Sala, Chris De Sa, Albert Gu, and Christopher Ré. Representation tradeoffs for hyperbolic embeddings. In International conference on machine learning, pages 4460–4469. PMLR, 2018.
2018
-
[52]
Low distortion delaunay embedding of trees in hyperbolic plane
Rik Sarkar. Low distortion delaunay embedding of trees in hyperbolic plane. In International symposium on graph drawing, pages 355–366. Springer, 2011.
2011
-
[53]
Order-embeddings of images and language
Ivan Vendrov, Ryan Kiros, Sanja Fidler, and Raquel Urtasun. Order-embeddings of images and language. arXiv preprint arXiv:1511.06361, 2015.
2015
-
[54]
Spot: Better frozen model adaptation through soft prompt transfer
Tu Vu, Brian Lester, Noah Constant, Rami Al-Rfou, and Daniel Cer. Spot: Better frozen model adaptation through soft prompt transfer. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5039–5059, 2022.
2022
-
[55]
Vision transformer adapter-based hyperbolic embeddings for multi-lesion segmentation in diabetic retinopathy
Zijian Wang, Haimei Lu, Haixin Yan, Hongxing Kan, and Li Jin. Vision transformer adapter-based hyperbolic embeddings for multi-lesion segmentation in diabetic retinopathy. Scientific reports, 13(1):11178, 2023.
2023
-
[56]
Flattening the parent bias: Hierarchical semantic segmentation in the poincaré ball
Simon Weber, Bar Zöngür, Nikita Araslanov, and Daniel Cremers. Flattening the parent bias: Hierarchical semantic segmentation in the poincaré ball. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 28223–28232, 2024.
2024
-
[57]
Unsupervised discovery of the long-tail in instance segmentation using hierarchical self-supervision
Zhenzhen Weng, Mehmet Giray Ogut, Shai Limonchik, and Serena Yeung. Unsupervised discovery of the long-tail in instance segmentation using hierarchical self-supervision. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2603–2612, 2021.
2021
-
[58]
Image-text co-decomposition for text-supervised semantic segmentation
Ji-Jia Wu, Andy Chia-Hao Chang, Chieh-Yu Chuang, Chun-Pei Chen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Yung-Yu Chuang, and Yen-Yu Lin. Image-text co-decomposition for text-supervised semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 26794–26803, 2024.
2024
-
[59]
Semantic projection network for zero-and few-label semantic segmentation
Yongqin Xian, Subhabrata Choudhury, Yang He, Bernt Schiele, and Zeynep Akata. Semantic projection network for zero-and few-label semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8256–8265, 2019.
2019
-
[60]
Segformer: Simple and efficient design for semantic segmentation with transformers
Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M Alvarez, and Ping Luo. Segformer: Simple and efficient design for semantic segmentation with transformers. Advances in neural information processing systems, 34:12077–12090,
-
[61]
Hyperbolic fine-tuning for large language models
Menglin Yang, Aosong Feng, Bo Xiong, Jihong Liu, Irwin King, and Rex Ying. Hyperbolic fine-tuning for large language models. arXiv preprint arXiv:2410.04010, 2024.
2024
-
[62]
A simple framework for text-supervised semantic segmentation
Muyang Yi, Quan Cui, Hao Wu, Cheng Yang, Osamu Yoshie, and Hongtao Lu. A simple framework for text-supervised semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7071–7080, 2023.
2023
-
[63]
Gradient surgery for multi-task learning
Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, and Chelsea Finn. Gradient surgery for multi-task learning. Advances in neural information processing systems, 33:5824–5836, 2020.
2020
-
[64]
Ifseg: Image-free semantic segmentation via vision-language model
Sukmin Yun, Seong Hyeon Park, Paul Hongsuck Seo, and Jinwoo Shin. Ifseg: Image-free semantic segmentation via vision-language model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2967–2977, 2023.
2023
-
[65]
Open vocabulary scene parsing
Hang Zhao, Xavier Puig, Bolei Zhou, Sanja Fidler, and Antonio Torralba. Open vocabulary scene parsing. In Proceedings of the IEEE International Conference on Computer Vision, pages 2002–2010, 2017.
2017
-
[66]
Semantic understanding of scenes through the ade20k dataset
Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso, and Antonio Torralba. Semantic understanding of scenes through the ade20k dataset. International Journal of Computer Vision, 127(3):302–321, 2019.
2019