
arxiv: 2606.26973 · v1 · pith:WRDQNBLTnew · submitted 2026-06-25 · 💻 cs.CV · cs.LG

Geometric Gradient Rectification for Safe Open-Set Semi-Supervised Learning

Pith reviewed 2026-06-26 04:58 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords open-set semi-supervised learning, gradient rectification, out-of-distribution, auxiliary gradients, pseudo labels, representation learning, CIFAR, ImageNet

The pith

Geometric gradient rectification allows safer use of unlabeled data in open-set semi-supervised learning by preventing gradient conflicts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Open-set semi-supervised learning faces a trade-off: filtering unlabeled samples risks discarding informative in-distribution examples, while using them through auxiliary objectives can introduce gradients that oppose supervised learning when pseudo-labels are wrong. The paper addresses this at the gradient level rather than the sample level. It introduces Geometric Gradient Rectification, which treats the supervised gradient as an anchor and projects conflicting auxiliary gradients into a region where they no longer oppose it. Components orthogonal to the anchor, which may still carry useful representation signal, are preserved. Experiments on CIFAR and ImageNet show improvements over representative baselines in most settings, in both closed-set accuracy and open-set robustness.

Core claim

The paper establishes that projecting auxiliary gradients to be non-opposing to the supervised gradient within a rectified coordinate block, while keeping orthogonal components, leads to better performance in open-set semi-supervised learning. The method is presented as a plug-in framework that stabilizes learning when unlabeled data contains outliers.

What carries the argument

Geometric Gradient Rectification (GGR), which uses the supervised gradient as an anchor to project conflicting auxiliary gradients onto an admissible region in gradient space.
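The anchoring idea can be illustrated with a small numeric sketch. It assumes the admissible region is the half-space of vectors with a non-negative inner product against the supervised gradient, as the simulated rebuttal further down describes it. The names `rectify`, `dot`, and `eps` are hypothetical, gradients are flattened into plain lists, and this is not the authors' released implementation.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rectify(g_aux: Sequence[float], g_sup: Sequence[float],
            eps: float = 1e-12) -> List[float]:
    """Project g_aux onto the half-space {g : <g, g_sup> >= 0}.

    Assumption: only the component of g_aux that opposes g_sup is
    removed; the part orthogonal to g_sup is left untouched.
    """
    inner = dot(g_aux, g_sup)
    sq_norm = dot(g_sup, g_sup)
    # No conflict, or a vanishing anchor: return the gradient unchanged.
    if inner >= 0.0 or sq_norm < eps:
        return list(g_aux)
    scale = inner / sq_norm
    return [a - scale * s for a, s in zip(g_aux, g_sup)]


g_sup = [1.0, 0.0]    # supervised (anchor) gradient
g_aux = [-1.0, 2.0]   # auxiliary gradient that opposes the anchor
g_rect = rectify(g_aux, g_sup)
print(g_rect)  # [0.0, 2.0]: opposing part removed, orthogonal part kept
```

In this toy case the auxiliary gradient loses its component along the anchor and keeps its orthogonal component, which is the behaviour the paragraph above attributes to GGR.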

Load-bearing premise

The supervised gradient acts as a reliable anchor, and the orthogonal parts of auxiliary gradients provide beneficial signals without creating new problems after projection.

What would settle it

The central claim would be undermined if GGR, applied to a benchmark dataset, failed to improve or worsened either in-distribution classification accuracy or out-of-distribution detection relative to the baseline it is plugged into.

Figures

Figures reproduced from arXiv: 2606.26973 by Hongxia Xu, Jiahe Chen, Jian Wu, Jiaying He, Jintai Chen, Qian Shao, Qiyuan Chen.

Figure 1: Left: OSSL dilemma. A trade-off between robustness and coverage that can lead to either feature starvation or noise domination. Right: Gradient geometry. Density of cosine similarity between unlabeled gradients and the oracle supervised direction. OOD: mostly orthogonal noise; Hard ID: misclassified samples induce adversarial gradients; Rectified: GGR projects conflicting components into an anchor-aligned …
Figure 2: Effect of subspace-aware rectification on CIFAR-100 across varying subspace dimensions d. We compare OSR and CSR projection variants under two label budgets. In summary, VLR is an efficient default, while subspace anchoring can provide more stable behavior and temporal smoothing. The orthogonal variant offers steadier open-set improvements across dimensions, whereas the signed variant can yield slightly hi…
Figure 3: Conflict diagnostics during training. Tracking the occurrence and impact of gradient conflicts between the baseline and our GGR framework. … damage over time. The baseline accumulates substantially more regret from counterproductive updates, whereas GGR remains nearly flat. Together, these diagnostics are consistent with the view that geometric rectification suppresses harmful update directions before t…
Original abstract

Open-set semi-supervised learning aims to leverage unlabeled data that may contain out-of-distribution outliers while maintaining performance on in-distribution classes. Existing methods mainly follow two paradigms: filtering suspicious samples or incorporating unlabeled objectives with soft weighting. We argue that both face a common trade-off: aggressive filtering can discard informative but hard ID samples, whereas utilization can introduce auxiliary gradients that conflict with supervised learning when pseudo labels are wrong. We therefore shift the focus from sample selection to gradient-level control. We propose Geometric Gradient Rectification (GGR), a plug-in framework that uses the supervised gradient as an anchor and projects conflicting auxiliary gradients onto an admissible region in gradient space. This makes the applied auxiliary update first-order non-opposing within the rectified coordinate block while preserving orthogonal components that may still carry useful representation signals. We further extend GGR with subspace-aware rectification to stabilize the anchor under noisy mini-batch gradients. Experiments on CIFAR and ImageNet benchmarks show that GGR improves representative OSSL baselines in most settings and yields gains in both closed-set generalization and open-set robustness. Code will be available at https://github.com/JiaheChen2002/GGR.
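The phrase "first-order non-opposing" can be unpacked with a one-line Taylor expansion. This is a sketch under assumed notation that does not appear in the abstract: \theta for the parameters in the rectified block, \eta for the step size, \mathcal{L}_{sup} for the supervised loss, g_{sup} = \nabla_\theta \mathcal{L}_{sup}(\theta) for the anchor, and \tilde{g}_{aux} for the auxiliary gradient after rectification.

```latex
\mathcal{L}_{\mathrm{sup}}(\theta - \eta\,\tilde{g}_{\mathrm{aux}})
  = \mathcal{L}_{\mathrm{sup}}(\theta)
  - \eta\,\langle g_{\mathrm{sup}},\, \tilde{g}_{\mathrm{aux}} \rangle
  + O(\eta^{2}),
\qquad
\langle g_{\mathrm{sup}},\, \tilde{g}_{\mathrm{aux}} \rangle \ge 0
\;\Longrightarrow\;
\mathcal{L}_{\mathrm{sup}}(\theta - \eta\,\tilde{g}_{\mathrm{aux}})
  \le \mathcal{L}_{\mathrm{sup}}(\theta) + O(\eta^{2}).
```

Under these assumptions the guarantee is local. It constrains only the linear term, so curvature or a large step size can still let an auxiliary update raise the supervised loss.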

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Geometric Gradient Rectification (GGR) as a plug-in framework for open-set semi-supervised learning. It identifies trade-offs in existing filtering and utilization paradigms, then proposes projecting auxiliary gradients onto an admissible region anchored by the supervised gradient so that the applied update is first-order non-opposing within the rectified block while retaining orthogonal components. A subspace-aware extension is added to stabilize the anchor under noisy mini-batches. Experiments on CIFAR and ImageNet benchmarks are reported to show gains over representative OSSL baselines in both closed-set generalization and open-set robustness.

Significance. If the geometric projection can be shown to preserve useful representation signals without introducing new conflicts, the method would supply a gradient-space alternative to sample-level heuristics and could be adopted as a modular component in future OSSL pipelines. The plug-in design and promised code release are positive for reproducibility and adoption.

major comments (2)
  1. [Abstract / §3, method description] The projection is presented at a high level, without an explicit formulation of the admissible region, a definition of the coordinate block, or the precise projection operator. Without these equations it is impossible to verify that the operation is parameter-free or that the retained orthogonal components are useful signal rather than noise.
  2. [§4, experiments] The claim that GGR improves both closed-set and open-set metrics rests on comparisons that lack ablations isolating the projection itself from other implementation choices (e.g., subspace estimation, learning-rate scaling); the reported gains could therefore be confounded.
minor comments (2)
  1. [Abstract] Notation for the supervised gradient anchor and the orthogonal complement should be introduced once and used consistently; currently the abstract mixes “anchor” and “rectified coordinate block” without a clear mapping.
  2. [§3.2] The subspace-aware extension is mentioned but its algorithmic complexity and additional hyperparameters are not quantified; a short complexity table or pseudocode line would clarify the overhead.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below and will incorporate the suggested clarifications and controls in the revised version.

  1. Referee: [Abstract / §3, method description] The projection is presented at a high level, without an explicit formulation of the admissible region, a definition of the coordinate block, or the precise projection operator. Without these equations it is impossible to verify that the operation is parameter-free or that the retained orthogonal components are useful signal rather than noise.

    Authors: We agree that the current description would benefit from explicit equations. In the revision we will add the precise definitions: the admissible region is the half-space {g_aux | g_sup · g_aux ≥ 0} restricted to the coordinate block of auxiliary parameters; the coordinate block is the subset of dimensions corresponding to the auxiliary loss; and the projection operator is the standard orthogonal projection onto this region (subtracting the component along g_sup when the dot product is negative). These additions will confirm the method is parameter-free and that orthogonal components are retained by construction of the projection. We will expand §3 with these equations and briefly reference them in the abstract. revision: yes

  2. Referee: [§4, experiments] The claim that GGR improves both closed-set and open-set metrics rests on comparisons that lack ablations isolating the projection itself from other implementation choices (e.g., subspace estimation, learning-rate scaling); the reported gains could therefore be confounded.

    Authors: We acknowledge that isolating the projection's contribution requires additional controls. The revised manuscript will include new ablation experiments that apply the rectification operator while holding subspace estimation and learning-rate scaling fixed, as well as variants with and without the subspace-aware extension. These results will be added to §4 to demonstrate that performance gains are attributable to the geometric rectification. revision: yes
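The operator described in the first response can be written in closed form. This is a sketch of the standard half-space projection as the rebuttal words it, not an equation taken from the paper. The symbol \Pi and the assumption g_{sup} \neq 0 are introduced here for illustration, with both gradients restricted to the rectified coordinate block.

```latex
\Pi(g_{\mathrm{aux}})
  = g_{\mathrm{aux}}
  - \frac{\min\!\bigl(0,\ \langle g_{\mathrm{aux}}, g_{\mathrm{sup}} \rangle\bigr)}
         {\lVert g_{\mathrm{sup}} \rVert^{2}}\; g_{\mathrm{sup}},
\qquad
\langle \Pi(g_{\mathrm{aux}}),\, g_{\mathrm{sup}} \rangle
  = \max\!\bigl(0,\ \langle g_{\mathrm{aux}}, g_{\mathrm{sup}} \rangle\bigr) \ge 0 .
```

Because the correction is a multiple of g_{sup}, the part of g_{aux} orthogonal to g_{sup} is unchanged. The formula shows that the orthogonal component is retained, but not that it is useful, so that part of the referee's concern remains an empirical question.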

Circularity Check

0 steps flagged

No significant circularity detected


The paper introduces Geometric Gradient Rectification (GGR) as a plug-in method that anchors on the supervised gradient and projects auxiliary gradients to avoid conflicts while retaining orthogonal components. No equations, derivations, or self-citations are shown that reduce the claimed performance gains to a fitted quantity, self-definition, or load-bearing prior result from the same authors. The central claim rests on the geometric projection construction and empirical results on CIFAR/ImageNet, which are independent of the inputs and do not collapse by construction. This is the common case of a self-contained method proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based solely on the abstract; the central mechanism rests on the unstated assumption that a supervised gradient anchor can be reliably identified and that projection preserves useful orthogonal components.

axioms (1)
  • Domain assumption: The supervised gradient serves as a reliable anchor for defining an admissible region in gradient space.
    Invoked to justify the projection step that makes auxiliary updates non-opposing.


