Geometric Gradient Rectification for Safe Open-Set Semi-Supervised Learning
Pith reviewed 2026-06-26 04:58 UTC · model grok-4.3
The pith
Geometric gradient rectification allows safer use of unlabeled data in open-set semi-supervised learning by preventing gradient conflicts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that projecting auxiliary gradients so that they do not oppose the supervised gradient within a rectified coordinate block, while keeping their orthogonal components, improves both closed-set generalization and open-set robustness in open-set semi-supervised learning. The method is presented as a plug-in framework that stabilizes learning when unlabeled data contains outliers.
What carries the argument
Geometric Gradient Rectification (GGR), which uses the supervised gradient as an anchor to project conflicting auxiliary gradients onto an admissible region in gradient space.
Load-bearing premise
The supervised gradient acts as a reliable anchor, and the orthogonal parts of auxiliary gradients provide beneficial signals without creating new problems after projection.
What would settle it
Adding GGR to the representative OSSL baselines on the CIFAR and ImageNet benchmarks and finding that, in most settings, it fails to improve or worsens in-distribution classification accuracy or out-of-distribution detection relative to the unmodified baseline would undercut the central claim.
Original abstract
Open-set semi-supervised learning aims to leverage unlabeled data that may contain out-of-distribution outliers while maintaining performance on in-distribution classes. Existing methods mainly follow two paradigms: filtering suspicious samples or incorporating unlabeled objectives with soft weighting. We argue that both face a common trade-off: aggressive filtering can discard informative but hard ID samples, whereas utilization can introduce auxiliary gradients that conflict with supervised learning when pseudo labels are wrong. We therefore shift the focus from sample selection to gradient-level control. We propose Geometric Gradient Rectification (GGR), a plug-in framework that uses the supervised gradient as an anchor and projects conflicting auxiliary gradients onto an admissible region in gradient space. This makes the applied auxiliary update first-order non-opposing within the rectified coordinate block while preserving orthogonal components that may still carry useful representation signals. We further extend GGR with subspace-aware rectification to stabilize the anchor under noisy mini-batch gradients. Experiments on CIFAR and ImageNet benchmarks show that GGR improves representative OSSL baselines in most settings and yields gains in both closed-set generalization and open-set robustness. Code will be available at https://github.com/JiaheChen2002/GGR.
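The phrase "first-order non-opposing" can be unpacked with a short Taylor expansion. The block below is a reader's sketch, not the paper's own derivation: the symbols \theta (parameters in the rectified block), \eta (step size), \mathcal{L}_{\mathrm{sup}} (supervised loss), g_{\mathrm{sup}} (its gradient) and \tilde{g}_{\mathrm{aux}} (the auxiliary gradient after projection) are notation assumed here for illustration.

```latex
% Assumed notation: g_sup = \nabla_\theta L_sup(\theta); \tilde{g}_aux is the rectified auxiliary gradient.
\[
\mathcal{L}_{\mathrm{sup}}\bigl(\theta - \eta\,\tilde{g}_{\mathrm{aux}}\bigr)
  = \mathcal{L}_{\mathrm{sup}}(\theta)
  - \eta\, g_{\mathrm{sup}}^{\top}\tilde{g}_{\mathrm{aux}}
  + O(\eta^{2}),
\qquad
g_{\mathrm{sup}} = \nabla_{\theta}\mathcal{L}_{\mathrm{sup}}(\theta).
\]
% If the rectified gradient satisfies the non-opposing condition
\[
g_{\mathrm{sup}}^{\top}\tilde{g}_{\mathrm{aux}} \ge 0,
\]
% then the first-order change in the supervised loss caused by the auxiliary step is non-positive:
\[
-\,\eta\, g_{\mathrm{sup}}^{\top}\tilde{g}_{\mathrm{aux}} \le 0
\quad \text{for } \eta > 0 .
\]
```

Under these assumptions the guarantee is only first-order: the O(\eta^{2}) term is not controlled, and the argument says nothing about whether the retained orthogonal component is useful signal or noise.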
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Geometric Gradient Rectification (GGR) as a plug-in framework for open-set semi-supervised learning. It identifies trade-offs in existing filtering and utilization paradigms, then proposes projecting auxiliary gradients onto an admissible region anchored by the supervised gradient so that the applied update is first-order non-opposing within the rectified block while retaining orthogonal components. A subspace-aware extension is added to stabilize the anchor under noisy mini-batches. Experiments on CIFAR and ImageNet benchmarks are reported to show gains over representative OSSL baselines in both closed-set generalization and open-set robustness.
Significance. If the geometric projection can be shown to preserve useful representation signals without introducing new conflicts, the method would supply a gradient-space alternative to sample-level heuristics and could be adopted as a modular component in future OSSL pipelines. The plug-in design and promised code release are positive for reproducibility and adoption.
major comments (2)
- [Abstract / §3] Abstract and §3 (method description): the projection is presented at a high level without the explicit formulation of the admissible region, the definition of the coordinate block, or the precise projection operator. Without these equations it is impossible to verify that the operation is parameter-free or that orthogonal components are guaranteed to remain useful rather than noise.
- [§4] §4 (experiments): the claim that GGR improves both closed-set and open-set metrics rests on comparisons that lack ablation controls separating the contribution of the projection itself from other implementation choices (e.g., subspace estimation, learning-rate scaling); the reported gains could therefore be confounded.
minor comments (2)
- [Abstract] Notation for the supervised gradient anchor and the orthogonal complement should be introduced once and used consistently; currently the abstract mixes “anchor” and “rectified coordinate block” without a clear mapping.
- [§3.2] The subspace-aware extension is mentioned but its algorithmic complexity and additional hyperparameters are not quantified; a short complexity table or pseudocode line would clarify the overhead.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major point below and will incorporate the suggested clarifications and controls in the revised version.
- Referee: [Abstract / §3] Abstract and §3 (method description): the projection is presented at a high level without the explicit formulation of the admissible region, the definition of the coordinate block, or the precise projection operator. Without these equations it is impossible to verify that the operation is parameter-free or that orthogonal components are guaranteed to remain useful rather than noise.
Authors: We agree that the current description would benefit from explicit equations. In the revision we will add the precise definitions: the admissible region is the half-space {g_aux | g_sup · g_aux ≥ 0} restricted to the coordinate block of auxiliary parameters; the coordinate block is the subset of dimensions corresponding to the auxiliary loss; and the projection operator is the standard orthogonal projection onto this region (subtracting the component along g_sup when the dot product is negative). These additions will confirm the method is parameter-free and that orthogonal components are retained by construction of the projection. We will expand §3 with these equations and briefly reference them in the abstract. revision: yes
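The projection described in this response can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' released code: the function names `dot` and `rectify`, the `eps` guard for a near-zero anchor, and the treatment of both gradients as flat lists covering the same coordinate block are all hypothetical choices made here.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rectify(g_sup: Sequence[float], g_aux: Sequence[float],
            eps: float = 1e-12) -> List[float]:
    """Project g_aux onto the half-space {g : g_sup . g >= 0}.

    Assumption: both vectors are flattened gradients restricted to the
    same coordinate block. `eps` is a hypothetical numerical guard, not
    a parameter named in the source.
    """
    inner = dot(g_sup, g_aux)
    sq_norm = dot(g_sup, g_sup)
    # Already non-opposing, or the anchor is (numerically) zero: no change.
    if inner >= 0.0 or sq_norm <= eps:
        return list(g_aux)
    # Conflict: subtract the component of g_aux that lies along g_sup.
    scale = inner / sq_norm
    return [a - scale * s for s, a in zip(g_sup, g_aux)]


# Conflicting case: the component along g_sup is removed, the orthogonal part is kept.
print(rectify([1.0, 0.0], [-2.0, 3.0]))  # [0.0, 3.0]
# Non-conflicting case: the auxiliary gradient is returned unchanged.
print(rectify([1.0, 0.0], [2.0, 3.0]))   # [2.0, 3.0]
```

In the conflicting case the result has zero inner product with the anchor, so it sits on the boundary of the half-space. The sketch has no tunable hyperparameter beyond the numerical guard, which is consistent with the parameter-free description above, but it does not show that the retained orthogonal component is beneficial.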
- Referee: [§4] §4 (experiments): the claim that GGR improves both closed-set and open-set metrics rests on comparisons that lack ablation controls separating the contribution of the projection itself from other implementation choices (e.g., subspace estimation, learning-rate scaling); the reported gains could therefore be confounded.
Authors: We acknowledge that isolating the projection's contribution requires additional controls. The revised manuscript will include new ablation experiments that apply the rectification operator while holding subspace estimation and learning-rate scaling fixed, as well as variants with and without the subspace-aware extension. These results will be added to §4 to demonstrate that performance gains are attributable to the geometric rectification. revision: yes
Circularity Check
No significant circularity detected
The paper introduces Geometric Gradient Rectification (GGR) as a plug-in method that anchors on the supervised gradient and projects auxiliary gradients to avoid conflicts while retaining orthogonal components. No equations, derivations, or self-citations are shown that reduce the claimed performance gains to a fitted quantity, self-definition, or load-bearing prior result from the same authors. The central claim rests on the geometric projection construction and empirical results on CIFAR/ImageNet, which are independent of the inputs and do not collapse by construction. This is the common case of a self-contained method proposal.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The supervised gradient serves as a reliable anchor for defining an admissible region in gradient space.