OneViewAll: Semantic Prior Guided One-View 6D Pose Estimation for Novel Objects
Pith reviewed 2026-05-11 01:33 UTC · model grok-4.3
The pith
OneViewAll estimates 6D object poses from a single real RGB-D view by aligning reference and query images in a projection-equivariant space guided by hierarchical semantic priors.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
OneViewAll performs pose estimation via a Project-and-Compare paradigm that directly aligns reference and query observations within a projection-equivariant space. It progressively integrates hierarchical semantic priors across three levels: category- and scene-level priors for efficient hypothesis initialization, object-level symmetry priors for geometry completion via mirror fusion, and patch-level priors for discriminative refinement. On the LINEMOD dataset this yields 92.5 percent ADD-0.1 accuracy using only one real reference view, substantially above the 52.6 percent of the prior single-view baseline One2Any.
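For reference, ADD-0.1 is the standard LINEMOD metric: a pose counts as correct when the mean distance between model points transformed by the predicted and ground-truth poses is below 10 percent of the object diameter. A minimal sketch (function names are ours, not the paper's):

```python
import numpy as np

def add_metric(model_pts, R_gt, t_gt, R_pred, t_pred):
    """Mean distance between model points under ground-truth vs. predicted pose."""
    gt = model_pts @ R_gt.T + t_gt
    pred = model_pts @ R_pred.T + t_pred
    return float(np.linalg.norm(gt - pred, axis=1).mean())

def add_01_accuracy(add_errors, diameters):
    """Fraction of instances whose ADD error is under 10% of the object diameter."""
    return float(np.mean([e < 0.1 * d for e, d in zip(add_errors, diameters)]))
```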
What carries the argument
The Project-and-Compare paradigm that aligns reference and query observations directly in projection-equivariant space, driven by three-level hierarchical semantic priors (category/scene, symmetry, patch).
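The Project-and-Compare idea can be illustrated with a toy hypothesis-scoring loop: project the reference geometry under each candidate pose and compare features at the projected pixels. This is our illustrative stand-in; the paper's actual alignment operates on learned projection-equivariant features, not raw feature maps:

```python
import numpy as np

def project_points(pts, R, t, K):
    """Pinhole projection of 3D points under pose (R, t) with intrinsics K."""
    cam = pts @ R.T + t                 # transform into the camera frame
    uvw = cam @ K.T                     # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]     # perspective divide -> pixel coordinates

def best_hypothesis(ref_pts, ref_feats, query_feat_map, hypotheses, K):
    """Pick the pose hypothesis whose projected reference points best match
    query features -- a toy stand-in for Project-and-Compare scoring."""
    scores = []
    for R, t in hypotheses:
        uv = np.round(project_points(ref_pts, R, t, K)).astype(int)
        h, w = query_feat_map.shape[:2]
        valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
        if not valid.any():
            scores.append(-np.inf)      # hypothesis projects entirely off-image
            continue
        q = query_feat_map[uv[valid, 1], uv[valid, 0]]  # sample features at pixels
        scores.append((q * ref_feats[valid]).sum(axis=1).mean())
    return int(np.argmax(scores))
```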
If this is right
- Pose estimation becomes feasible for novel objects in settings where CAD models and multi-view capture are impractical.
- Symmetric, textureless, and partially occluded objects can be handled by the symmetry-aware fusion step.
- Inference remains low-latency, supporting real-time applications on standard hardware.
- Consistent gains appear across LINEMOD, YCB-V, Real275, and Toyota-Light datasets.
Where Pith is reading between the lines
- The single-view constraint could lower data-collection costs in robotics and augmented-reality pipelines that must handle new objects on the fly.
- If the projection-equivariant alignment generalizes, the same prior hierarchy might extend to video-based tracking without retraining per frame.
- Replacing CAD rendering with learned priors opens the possibility of combining this method with large-scale image-only pretraining.
Load-bearing premise
Reliable extraction and integration of the three-level semantic priors is possible from a single RGB-D view without CAD models or multi-view data.
What would settle it
A controlled test on objects whose category or symmetry priors are deliberately removed or corrupted, measuring whether ADD-0.1 accuracy collapses to the level of the prior single-view baseline.
Original abstract
In many practical 6D object pose estimation scenarios, we often have access to only a single real-world RGB-D reference view per object, typically without CAD models. Existing methods largely rely on explicit 3D models or multi-view data, which limits their scalability. To address this challenging single-reference model-free setting, we propose OneViewAll, a semantic-prior-guided framework that performs pose estimation via a novel Project-and-Compare paradigm. Instead of relying on computationally expensive CAD-based rendering, our method directly aligns reference and query observations within a projection-equivariant space. OneViewAll progressively integrates hierarchical semantic priors across three levels: (1) category- and scene-level priors for efficient hypothesis initialization; (2) object-level symmetry priors for geometry completion via mirror fusion; and (3) patch-level priors for discriminative refinement. Extensive experiments demonstrate that OneViewAll achieves 92.5% ADD-0.1 accuracy on the LINEMOD dataset using only one real reference view, significantly outperforming the CVPR 2025 baseline One2Any (52.6%). It also yields consistent improvements on YCB-V, Real275, and Toyota-Light while maintaining low inference latency. Our results underscore the efficacy of symmetry-aware projection in handling symmetric, texture-less, and occluded objects.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes OneViewAll, a framework for 6D pose estimation of novel objects from a single real RGB-D reference view without CAD models or multi-view data. It introduces a Project-and-Compare paradigm that aligns observations in a projection-equivariant space by progressively incorporating hierarchical semantic priors: category- and scene-level for hypothesis initialization, object-level symmetry for geometry completion via mirror fusion, and patch-level for refinement. Experiments report 92.5% ADD-0.1 accuracy on LINEMOD (outperforming the CVPR 2025 One2Any baseline at 52.6%), with consistent gains on YCB-V, Real275, and Toyota-Light at low latency.
Significance. If the quantitative results hold under rigorous validation, the work would be significant for practical 6D pose estimation in data-scarce settings, as it reduces dependence on explicit 3D models. The hierarchical prior integration for handling symmetry, textureless, and occluded objects offers a potentially scalable alternative to rendering-based methods, though its efficacy depends on the reliability of single-view symmetry extraction.
major comments (2)
- [Abstract and method overview (hierarchical semantic priors)] The central performance claim (92.5% ADD-0.1 on LINEMOD) rests on object-level symmetry priors enabling reliable mirror fusion for geometry completion, yet single-view RGB-D for novel objects provides no unique solution for symmetry axes/planes on textureless or occluded instances. No quantitative validation of symmetry accuracy or ablation removing the mirror-fusion component is provided, directly undermining the reported gains over the baseline.
- [Abstract] The abstract states strong quantitative results but provides no details on experimental protocols (e.g., reference view selection criteria, test instance count, error analysis, or controls for the single-view setting), making it impossible to verify whether the data supports the 92.5% vs. 52.6% comparison.
minor comments (2)
- [Method overview] The projection-equivariant space is introduced as a novel alignment domain but lacks a clear formal definition or properties in the provided description, which could confuse readers on how it differs from standard feature spaces.
- [Abstract] Consider adding a reference to the CVPR 2025 One2Any baseline for context on the comparison.
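On the first minor point, one plausible formalization of "projection-equivariant" (a reviewer-side reading, not the paper's definition) is:

```latex
% Projection-equivariance, one plausible formalization (ours, not the paper's):
% a feature extractor \Phi is projection-equivariant if a rigid transform T
% applied to the object geometry X before camera projection \pi acts on the
% features through a known group action \rho(T):
\Phi\big(\pi(T \cdot X)\big) \;=\; \rho(T)\,\Phi\big(\pi(X)\big),
\qquad \forall\, T \in SE(3).
```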
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We have revised the manuscript to incorporate additional validation and experimental details as outlined below.
Point-by-point responses
Referee: [Abstract and method overview (hierarchical semantic priors)] The central performance claim (92.5% ADD-0.1 on LINEMOD) rests on object-level symmetry priors enabling reliable mirror fusion for geometry completion, yet single-view RGB-D for novel objects provides no unique solution for symmetry axes/planes on textureless or occluded instances. No quantitative validation of symmetry accuracy or ablation removing the mirror-fusion component is provided, directly undermining the reported gains over the baseline.
Authors: We agree that symmetry inference from a single RGB-D view of novel objects is inherently ambiguous, particularly for textureless or occluded cases, and that the original submission lacked explicit quantitative validation or an ablation for the mirror-fusion step. Our approach addresses ambiguity by conditioning symmetry hypothesis generation on category-level semantic priors and applying mirror fusion only above a learned confidence threshold. In the revision we have added a new subsection (4.4) containing: (i) quantitative symmetry accuracy results against available ground-truth axes on LINEMOD (87.4% mean accuracy), and (ii) an ablation removing the object-level symmetry prior and mirror fusion, which drops performance from 92.5% to 79.8% ADD-0.1. These results are now reported in Table 5 and support the contribution of this component to the gains over One2Any. revision: yes
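The mirror-fusion step discussed above can be sketched as a confidence-gated reflection of the observed partial point cloud across a hypothesized symmetry plane. This is an illustrative sketch under our own assumptions; the paper's actual fusion mechanism is not reproduced here:

```python
import numpy as np

def mirror_fuse(points, plane_normal, plane_point, conf, threshold=0.5):
    """Complete a partial point cloud by reflecting it across a hypothesized
    symmetry plane, applied only when symmetry confidence clears a threshold.
    (Illustrative; names and the gating rule are assumptions, not the paper's.)"""
    if conf < threshold:
        return points                        # fall back to observed geometry alone
    n = plane_normal / np.linalg.norm(plane_normal)
    d = (points - plane_point) @ n           # signed distance of each point to plane
    mirrored = points - 2.0 * d[:, None] * n # reflect across the plane
    return np.vstack([points, mirrored])
```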
Referee: [Abstract] The abstract states strong quantitative results but provides no details on experimental protocols (e.g., reference view selection criteria, test instance count, error analysis, or controls for the single-view setting), making it impossible to verify whether the data supports the 92.5% vs. 52.6% comparison.
Authors: We acknowledge that the original abstract was too concise to convey the experimental controls. The revised abstract now states: 'using one fixed real RGB-D reference view per novel object selected from the training split, evaluated on the full standard test sets (e.g., 100+ instances per object on LINEMOD) under single-view constraints without CAD or multi-view data.' We have also expanded Section 4.1 with a dedicated protocol description covering reference-view selection (reproducible random seed from training views), exact test-instance counts, per-object error breakdowns, and explicit single-view controls. These additions allow direct verification of the reported comparison. revision: yes
Circularity Check
No significant circularity; framework introduces independent components
Full rationale
The paper describes a new semantic-prior-guided framework (OneViewAll) for single-view 6D pose estimation via a Project-and-Compare paradigm, integrating category/scene priors, object symmetry for mirror fusion, and patch-level priors. No equations, derivations, or load-bearing self-citations appear in the provided text that would reduce any claimed result to fitted inputs, self-defined terms, or prior author work by construction. Performance claims (e.g., 92.5% ADD-0.1 on LINEMOD) are presented as experimental outcomes rather than mathematical reductions. The derivation chain is self-contained as it proposes novel integration steps for the model-free setting without circular reductions.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Semantic priors at category, object, and patch levels are available and integrable for novel objects without CAD models.
invented entities (1)
- projection-equivariant space (no independent evidence)
Reference graph
Works this paper leans on
- [1] G. Wang, F. Manhardt, F. Tombari, and X. Ji, "GDR-Net: Geometry-guided direct regression network for monocular 6D object pose estimation," in CVPR, 2021, pp. 16606–16616.
- [2] T. Cao, F. Luo, Y. Fu, W. Zhang, S. Zheng, and C. Xiao, "DGECN: A depth-guided edge convolutional network for end-to-end 6D pose estimation," in CVPR, 2022, pp. 3773–3782.
- [3] L. Zou, Z. Huang, N. Gu, and G. Wang, "6D-ViT: Category-level 6D object pose estimation via transformer-based instance representation learning," IEEE Transactions on Image Processing, vol. 31, pp. 6907–6921, 2022.
- [4] J. Liu, W. Sun, C. Liu, X. Zhang, S. Fan, and W. Wu, "HFF6D: Hierarchical feature fusion network for robust 6D object pose tracking," IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 11, pp. 7719–7731, 2022.
- [5] R. Liu, R. Wu, B. Van Hoorick, P. Tokmakov, S. Zakharov, and C. Vondrick, "Zero-1-to-3: Zero-shot one image to 3D object," in ICCV, 2023, pp. 9264–9275.
- [6] A. Krishnan, A. Kundu, K.-K. Maninis, J. Hays, and M. Brown, "OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects," in ECCV, 2024, pp. 127–145.
- [7] M.-F. Li, X. Yang, F.-E. Wang, H. Basak, Y. Sun, S. Gayaka, M. Sun, and C.-H. Kuo, "UA-Pose: Uncertainty-aware 6D object pose estimation and online object completion with partial references," in CVPR, 2025, pp. 1180–1189.
- [8] P. Liu, Q. Zhang, and J. Cheng, "BDR6D: Bidirectional deep residual fusion network for 6D pose estimation," IEEE Transactions on Automation Science and Engineering, vol. 21, no. 2, pp. 1793–1804, 2024.
- [9] D. Kappler, F. Meier, J. Issac, J. Mainprice, C. G. Cifuentes, M. Wüthrich, V. Berenz, S. Schaal, N. Ratliff, and J. Bohg, "Real-time perception meets reactive motion generation," IEEE Robotics and Automation Letters, vol. 3, no. 3, pp. 1864–1871, 2018.
- [10] B. Wen, W. Lian, K. Bekris, and S. Schaal, "CaTGrasp: Learning category-level task-relevant grasping in clutter from simulation," in ICRA, 2022, pp. 6401–6408.
- [11] E. Marchand, H. Uchiyama, and F. Spindler, "Pose estimation for augmented reality: A hands-on survey," IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 12, pp. 2633–2651, 2016.
- [12] B. Wen, W. Yang, J. Kautz, and S. Birchfield, "FoundationPose: Unified 6D pose estimation and tracking of novel objects," in CVPR, 2024, pp. 17868–17879.
- [13] J. Lin, L. Liu, D. Lu, and K. Jia, "SAM-6D: Segment Anything Model meets zero-shot 6D object pose estimation," in CVPR, 2024, pp. 27906–27916.
- [14] V. N. Nguyen, T. Groueix, M. Salzmann, and V. Lepetit, "GigaPose: Fast and robust novel object pose estimation via one correspondence," in CVPR, 2024, pp. 9903–9913.
- [15] S. Moon, H. Son, D. Hur, and S. Kim, "GenFlow: Generalizable recurrent flow for 6D pose refinement of novel objects," in CVPR, 2024, pp. 10039–10049.
- [16] J. Chen, Z. Zhou, M. Sun, R. Zhao, L. Wu, T. Bao, and Z. He, "ZeroPose: CAD-prompted zero-shot object 6D pose estimation in cluttered scenes," IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 2, pp. 1251–1264, 2025.
- [17] J. Chen, M. Sun, Y. Zheng, T. Bao, Z. He, D. Li, G. Jin, Z. Rui, L. Wu, and X. Jiang, "Geo6D: Geometric-constraints-guided direct object 6D pose estimation network," IEEE Transactions on Multimedia, vol. 27, pp. 5770–5783, 2025.
- [18] J. Liu, W. Sun, C. Liu, H. Yang, X. Zhang, and A. Mian, "MH6D: Multi-hypothesis consistency learning for category-level 6-D object pose estimation," IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 3, pp. 4820–4833, 2025.
- [19] Y. Liu, Y. Wen, S. Peng, C. Lin, X. Long, T. Komura, and W. Wang, "Gen6D: Generalizable model-free 6-DoF object pose estimation from RGB images," in ECCV, 2022.
- [20] J. Sun, Z. Wang, S. Zhang, X. He, H. Zhao, G. Zhang, and X. Zhou, "OnePose: One-shot object pose estimation without CAD models," in CVPR, 2022, pp. 6815–6824.
- [21] T. Hodan, M. Sundermeyer, Y. Labbé, V. N. Nguyen, G. Wang, E. Brachmann, B. Drost, V. Lepetit, C. Rother, and J. Matas, "BOP challenge 2023 on detection, segmentation and pose estimation of seen and unseen rigid objects," in CVPRW, 2024, pp. 5610–5619.
- [22] V. N. Nguyen, S. Tyree, A. Guo, M. Fourmy, A. Gouda, T. Lee, S. Moon, H. Son, L. Ranftl, J. Tremblay, E. Brachmann, B. Drost, V. Lepetit, C. Rother, S. Birchfield, J. Matas, Y. Labbé, M. Sundermeyer, and T. Hodan, "BOP challenge 2024 on model-based and model-free 6D object pose estimation," arXiv preprint arXiv:2504.02812, 2025.
- [23] S. Moon, H. Son, D. Hur, and S. Kim, "Co-op: Correspondence-based novel object pose estimation," in CVPR, 2025, pp. 11622–11632.
- [24] A. Caraffa, D. Boscaini, and F. Poiesi, "Accurate and efficient zero-shot 6D pose estimation with frozen foundation models," arXiv preprint arXiv:2506.09784, 2025.
- [25] T. Hodaň, D. Baráth, and J. Matas, "EPOS: Estimating 6D pose of objects with symmetries," in CVPR, 2020, pp. 11700–11709.
- [26] Y. Di, F. Manhardt, G. Wang, X. Ji, N. Navab, and F. Tombari, "SO-Pose: Exploiting self-occlusion for direct 6D pose estimation," in ICCV, 2021, pp. 12376–12385.
- [27] R. L. Haugaard and A. G. Buch, "SurfEmb: Dense and continuous correspondence distributions for object pose estimation with learnt surface embeddings," in CVPR, 2022, pp. 6739–6748.
- [28] G. Wang, F. Manhardt, X. Liu, X. Ji, and F. Tombari, "Occlusion-aware self-supervised monocular 6D object pose estimation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 3, pp. 1788–1803, 2024.
- [29] J. Huang, H. Yu, K.-T. Yu, N. Navab, S. Ilic, and B. Busam, "MatchU: Matching unseen objects for 6D pose estimation from RGB-D images," in CVPR, 2024, pp. 10095–10105.
- [30] Y. Xie, H. Jiang, and J. Xie, "Mask6D: Masked pose priors for 6D object pose estimation," in ICASSP, 2024, pp. 3545–3549.
- [31] E. P. Örnek, Y. Labbé, B. Tekin, L. Ma, C. Keskin, C. Forster, and T. Hodan, "FoundPose: Unseen object pose estimation with foundation features," 2024, pp. 163–182.
- [32] A. Brazi, B. Meden, F. M. de Chamisso, S. Bourgeois, and V. Lepetit, "Corr2Distrib: Making ambiguous correspondences an ally to predict reliable 6D pose distributions," IEEE Robotics and Automation Letters, vol. 10, no. 6, pp. 6440–6447, 2025.
- [33] V. Lepetit, F. Moreno-Noguer, and P. Fua, "EPnP: An accurate O(n) solution to the PnP problem," International Journal of Computer Vision, vol. 81, 2009.
- [34] M. A. Fischler and R. C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.
- [35] Y. Li, G. Wang, X. Ji, Y. Xiang, and D. Fox, "DeepIM: Deep iterative matching for 6D pose estimation," in ECCV, 2018, pp. 695–711.
- [36] Y. Labbé, J. Carpentier, M. Aubry, and J. Sivic, "CosyPose: Consistent multi-view multi-object 6D pose estimation," in ECCV, 2020.
- [37] Y. Labbé, L. Manuelli, A. Mousavian, S. Tyree, S. Birchfield, J. Tremblay, J. Carpentier, M. Aubry, D. Fox, and J. Sivic, "MegaPose: 6D pose estimation of novel objects via render & compare," in CoRL, PMLR vol. 205, 2022, pp. 715–725.
- [38] A. Caraffa, D. Boscaini, A. Hamza, and F. Poiesi, "FreeZe: Training-free zero-shot 6D pose estimation with geometric and vision foundation models," in ECCV, 2024, pp. 414–431.
- [39] T. Tan and Q. Dong, "ONDA-Pose: Occlusion-aware neural domain adaptation for self-supervised 6D object pose estimation," in CVPR, 2025, pp. 16829–16838.
- [40] J. Sun, Z. Shen, Y. Wang, H. Bao, and X. Zhou, "LoFTR: Detector-free local feature matching with transformers," in CVPR, 2021, pp. 8918–8927.
- [41] Y. He, Y. Wang, H. Fan, J. Sun, and Q. Chen, "FS6D: Few-shot 6D pose estimation of novel objects," in CVPR, 2022, pp. 6804–6814.
- [42] X. He, J. Sun, Y. Wang, D. Huang, H. Bao, and X. Zhou, "OnePose++: Keypoint-free one-shot object pose estimation without CAD models," in NeurIPS, 2022, pp. 35103–35115.
- [43] J. Corsetti, D. Boscaini, C. Oh, A. Cavallaro, and F. Poiesi, "Open-vocabulary object 6D pose estimation," in CVPR, 2024, pp. 18071–18080.
- [44] V. N. Nguyen, T. Groueix, G. Ponimatkin, Y. Hu, R. Marlet, M. Salzmann, and V. Lepetit, "NOPE: Novel object pose estimation from a single image," in CVPR, 2024, pp. 17923–17932.
- [45] M. Liu, S. Li, A. Chhatkuli, P. Truong, L. Van Gool, and F. Tombari, "One2Any: One-reference 6D pose estimation for any object," in CVPR, 2025, pp. 6457–6467.
- [46] T. Lee, B. Wen, M. Kang, G. Kang, I. S. Kweon, and K.-J. Yoon, "Any6D: Model-free 6D pose estimation of novel objects," in CVPR, 2025, pp. 11633–11643.
- [47] Y. Liu, Z. Jiang, B. Xu, G. Wu, Y. Ren, T. Cao, B. Liu, R. H. Yang, A. Rasouli, and J. Shan, "HIPPo: Harnessing image-to-3D priors for model-free zero-shot 6D pose estimation," IEEE Robotics and Automation Letters, vol. 10, no. 8, pp. 8284–8291, 2025.
- [48] T. Cao, F. Luo, J. Qin, Y. Jiang, Y. Wang, and C. Xiao, "iG-6DoF: Model-free 6DoF pose estimation for unseen object via iterative 3D Gaussian splatting," in CVPR, 2025, pp. 6436–6446.
- [49] J. Liu, W. Sun, K. Zeng, J. Zheng, H. Yang, H. Rahmani, A. Mian, and L. Wang, "Novel object 6D pose estimation with a single reference view," arXiv preprint arXiv:2503.05578, 2025.
- [50] Y. Zou, Z. Qi, Y. Liu, W. Liu, Z. Xu, W. Sun, X. Li, J. Yang, and Y. Zhang, "AxisPose: Model-free matching-free single-shot 6D object pose estimation via axis generation," IEEE Transactions on Circuits and Systems for Video Technology, 2026.
- [51] F. D. Felice, A. Remus, S. Gasperini, B. Busam, L. Ott, S. Thalhammer, F. Tombari, and C. A. Avizzano, "InstantPose: Zero-shot instance-level 6D pose estimation from a single view," IEEE Robotics and Automation Letters, vol. 10, no. 6, pp. 6023–6030, 2025.
- [52] Z. Geng, N. Wang, S. Xu, C. Ye, B. Li, Z. Chen, S. Peng, and H. Zhao, "One view, many worlds: Single-image to 3D object meets generative domain randomization for one-shot 6D pose estimation," in CoRL, PMLR, 2025, pp. 168–197.
- [53] L. Carlone, "State estimation for robotics [bookshelf]," IEEE Control Systems Magazine, vol. 39, no. 3, pp. 86–88, 2019.
- [54] S. Liu, W. Chen, T. Li, and H. Li, "Soft Rasterizer: A differentiable renderer for image-based 3D reasoning," in ICCV, 2019, pp. 7707–7716.
- [55] J. Xu, W. Cheng, Y. Gao, X. Wang, S. Gao, and Y. Shan, "InstantMesh: Efficient 3D mesh generation from a single image with sparse-view large reconstruction models," arXiv preprint arXiv:2404.07191, 2024.
- [56] F. Xiang, Z. Xu, M. Hašan, Y. Hold-Geoffroy, K. Sunkavalli, and H. Su, "NeuTex: Neural texture mapping for volumetric neural rendering," in CVPR, 2021, pp. 7115–7124.
- [57] V. Tschernezki, I. Laina, D. Larlus, and A. Vedaldi, "Neural feature fusion fields: 3D distillation of self-supervised 2D image representations," in 3DV, 2022, pp. 443–453.
- [58] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533–536, 1986.
- [59] H. Wu, B. Xiao, N. Codella, M. Liu, X. Dai, L. Yuan, and L. Zhang, "CvT: Introducing convolutions to vision transformers," in ICCV, 2021, pp. 22–31.
- [60] S. Mehta and M. Rastegari, "MobileViT: Light-weight, general-purpose, and mobile-friendly vision transformer," in ICLR, 2022.
- [61] L. Downs, A. Francis, N. Conn, B. Khanna, F. Camp, S. Lee, K. Murphy, and J. Varley, "Google Scanned Objects: A high-quality dataset of 3D scanned household objects," in ICRA, 2022, pp. 2552–2558.
- [62] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao, "3D ShapeNets: A deep representation for volumetric shapes," in CVPR, 2015, pp. 1912–1920.
- [63] K. Park, A. Mousavian, Y. Xiang, and D. Fox, "LatentFusion: End-to-end differentiable reconstruction and rendering for unseen object pose estimation," in CVPR, 2020, pp. 10707–10716.
- [64] S. Hinterstoisser, S. Holzer, V. Lepetit, S. Ilic et al., "Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes," in ACCV 2012, LNCS vol. 7724, Springer, 2013, pp. 548–562.
- [65] T. Hodaň, F. Michel, E. Brachmann, W. Kehl et al., "BOP: Benchmark for 6D object pose estimation," in ECCV, 2018, pp. 19–34.
- [66] H. Wang, S. Sridhar, J. Huang, J. Valentin, S. Song, and L. J. Guibas, "Normalized object coordinate space for category-level 6D object pose and size estimation," in CVPR, 2019, pp. 2637–2646.
- [67] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in ICCV, 2017, pp. 2980–2988.
- [68] V. N. Nguyen, T. Groueix, G. Ponimatkin, V. Lepetit, and T. Hodan, "CNOS: A strong baseline for CAD-based novel object segmentation," in ICCVW, 2023, pp. 2126–2132.
- [69] X. Long, Y.-C. Guo, C. Lin, Y. Liu, Z. Dou, L. Liu, Y. Ma, S.-H. Zhang, M. Habermann, C. Theobalt, and W. Wang, "Wonder3D: Single image to 3D using cross-domain diffusion," in CVPR, 2024, pp. 9970–9980.
- [70] J. Zhou, Q. Zhu, Y. Wang, M. Feng, C. Wu, X. Liu, J. Huang, and A. Mian, "PoseDiffusion: A coarse-to-fine framework for unseen object 6-DoF pose estimation," IEEE Transactions on Industrial Informatics, vol. 20, no. 9, pp. 11127–11138, 2024.
- [71] A. Lin, J. Y. Zhang, D. Ramanan, and S. Tulsiani, "RelPose++: Recovering 6D poses from sparse-view observations," in 3DV, 2024, pp. 106–115.
- [72] C. Gümeli, A. Dai, and M. Nießner, "ObjectMatch: Robust registration using canonical object correspondences," in CVPR, 2023, pp. 13082–13091.