Radial Basis Function Networks as Projection Heads in Self-Supervised Learning
Pith reviewed 2026-06-26 14:53 UTC · model grok-4.3
The pith
Radial basis function networks can replace MLP projection heads in self-supervised learning while supplying a label-free metric for backbone quality.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In self-supervised learning pipelines, replacing the conventional MLP projection head with a radial basis function network maintains competitive downstream performance on image classification tasks. The RBFN's interpretable centers and shape parameters directly support computation of Scale-Normalized Separation, a metric that exhibits strong to very strong correlation with logistic regression metrics and therefore functions as a reliable label-free indicator of backbone representation quality.
What carries the argument
Radial basis function network (RBFN) projection head with three Gaussian-activated layers, from whose learned centers and shapes Scale-Normalized Separation is computed as a label-free quality metric.
If this is right
- RBFN heads with three Gaussian layers serve as drop-in replacements that match MLP performance in MoCo, SimCLR, BYOL, SwAV, and SimSiam.
- SNS derived solely from RBFN parameters acts as a reliable proxy for backbone quality without requiring labeled data.
- The approach avoids the computational waste of training then discarding an MLP head.
- The released Open Images V7-derived dataset enables further reproducible studies of representation quality metrics.
Where Pith is reading between the lines
- SNS could be monitored continuously during SSL training to detect representation collapse without pausing for labeled probes.
- The same RBFN construction might be tested in non-image domains such as audio or graph SSL to check whether the correlation pattern persists.
- Because the centers are explicit, one could inspect which regions of the representation space the backbone emphasizes most strongly.
Load-bearing premise
The centers and shape parameters learned by the RBFN during training capture backbone representation quality in a manner that generalizes beyond the specific training runs and datasets examined.
What would settle it
Compute SNS on RBFN heads trained with a new architecture or dataset outside the five SSL methods and four datasets tested, then measure its correlation with linear probing accuracy; a collapse to weak or negative correlation would refute the proxy claim.
Figures
read the original abstract
Self-supervised learning (SSL) typically relies on a backbone encoder followed by a small multilayer perceptron (MLP) projection head, which is conventionally discarded after training, while backbone quality is assessed via costly linear probing on labeled data. We argue that this approach including discarding the projector is rather computationally wasteful. Instead, we propose replacing the MLP head with a radial basis function network (RBFN), whose interpretable center and shape parameters can be exploited to judge representation quality without labels or a separate classifier. To this end, we introduce Scale-Normalized Separation (SNS), a novel label-free quality metric derived solely from the kernel centers and shapes learned during training. Across five canonical SSL architectures (MoCo, SimCLR, BYOL, SwAV and SimSiam) and four image classification datasets, we show that RBFN projection heads are competitive drop-in replacements for standard MLP projectors. We recommend constructing them with three RBF layers activated by the Gaussian radial basis function. Moreover, SNS exhibits strong to very strong positive correlation with established logistic regression metrics, demonstrating that a trained RBFN projector can act as a reliable proxy for backbone representation quality. We additionally publish a novel PyTorch compatible image classification dataset based on Google's Open Images V7 to facilitate reproducible research into representation learning.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes replacing standard MLP projection heads with radial basis function networks (RBFNs) in self-supervised learning frameworks (MoCo, SimCLR, BYOL, SwAV, SimSiam). It introduces Scale-Normalized Separation (SNS), a label-free metric derived from the learned RBF centers and shape parameters, as a proxy for backbone representation quality. Empirical results across four image classification datasets claim that RBFN heads (recommended with three layers and Gaussian activation) are competitive drop-in replacements for MLPs, with SNS exhibiting strong to very strong positive correlations to logistic regression probing accuracies. The work also releases a new PyTorch-compatible image classification dataset derived from Open Images V7.
Significance. If the empirical claims hold under rigorous validation, the work could reduce computational waste in SSL pipelines by retaining and exploiting the projection head for evaluation rather than discarding it, while providing an interpretable, label-free alternative to linear probing. The SNS metric and the released dataset represent concrete contributions to reproducibility and efficiency in representation learning.
major comments (2)
- [Abstract / Experimental results] Abstract and experimental results: the central claim of competitive performance across five SSL architectures and four datasets, plus strong SNS-logistic regression correlations, is reported without details on training hyperparameters, number of runs, error bars, statistical significance tests, or controls for post-hoc selection. This leaves the empirical support for the drop-in replacement and proxy-metric claims only moderately substantiated.
- [SNS metric definition] SNS definition and evaluation: SNS is computed directly from the RBF centers and shape parameters that are optimized as part of the same SSL training objective, creating dependence on the learned parameters. The manuscript presents SNS as an independent proxy, but additional validation (e.g., sensitivity analysis or comparison to external benchmarks) is needed to confirm it generalizes beyond the specific training runs.
minor comments (1)
- [Method / Recommendations] The recommendation to use three RBF layers with Gaussian activation should be accompanied by an ablation study or justification if not already included, to clarify sensitivity to these architectural choices.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address each major comment below, clarifying our position and outlining revisions where appropriate to strengthen the manuscript.
read point-by-point responses
-
Referee: [Abstract / Experimental results] Abstract and experimental results: the central claim of competitive performance across five SSL architectures and four datasets, plus strong SNS-logistic regression correlations, is reported without details on training hyperparameters, number of runs, error bars, statistical significance tests, or controls for post-hoc selection. This leaves the empirical support for the drop-in replacement and proxy-metric claims only moderately substantiated.
Authors: We agree that expanded reporting of experimental details will improve rigor. In the revised manuscript we will add full hyperparameter tables, specify the number of independent runs, include error bars or standard deviations, report statistical significance where relevant, and discuss controls for post-hoc selection. These additions will provide stronger substantiation while preserving the existing empirical trends across architectures and datasets. revision: yes
-
Referee: [SNS metric definition] SNS definition and evaluation: SNS is computed directly from the RBF centers and shape parameters that are optimized as part of the same SSL training objective, creating dependence on the learned parameters. The manuscript presents SNS as an independent proxy, but additional validation (e.g., sensitivity analysis or comparison to external benchmarks) is needed to confirm it generalizes beyond the specific training runs.
Authors: SNS is explicitly derived from the learned RBF parameters, which is the intended mechanism for obtaining a label-free signal; we will revise the text to avoid any implication of full independence from the training process. To address the request for further validation, we will incorporate a sensitivity analysis (perturbations to centers and shapes) and comparisons against additional external benchmarks in the revision. revision: partial
Circularity Check
No significant circularity
full rationale
The paper's central claims are empirical: RBFN heads are shown competitive with MLP heads across five SSL methods and four datasets via standard training and evaluation protocols, and SNS is introduced as a metric computed from the learned RBF parameters with its utility demonstrated by observed correlations to logistic regression accuracy. No derivation reduces a result to its inputs by construction, no self-citation chain supports a uniqueness claim, and no fitted parameter is relabeled as an independent prediction. The SNS definition is explicit about its dependence on training outputs, and the correlation evidence is external to that definition.
Axiom & Free-Parameter Ledger
free parameters (2)
- number of RBF layers
- RBF shape and center parameters
axioms (1)
- domain assumption RBFN can be substituted for MLP projection heads without materially changing the learned backbone representations under standard SSL objectives.
invented entities (1)
-
Scale-Normalized Separation (SNS)
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Agrawal, K.K., Mondal, A.K., Ghosh, A., Richards, B.A.:α-ReQ: Assessing rep- resentation quality by measuring eigenspectrum decay. In: Proc. International Conference on Neural Information Processing Systems. NIPS ’22, Curran Asso- ciates Inc., Red Hook, NY, USA (2022), https://dl.acm.org/doi/10.5555/3600270. 3601551
-
[2]
Amirian, M., Schwenker, F.: Radial Basis Function Networks for Convolutional Neural Networks to Learn Similarity Distance Metric and Improve Interpretabil- ity.IEEEAccess8,123087–123097(2020).https://doi.org/10.1109/ACCESS.2020. 3007337
-
[3]
Bardes, A., Ponce, J., LeCun, Y.: VICReg: Variance-Invariance-Covariance Reg- ularization for Self-Supervised Learning. In: Proc. International Conference on Learning Representations, ICLR. OpenReview.net (2022). https://doi.org/10. 48550/arXiv.2105.04906
Pith/arXiv arXiv 2022
-
[4]
In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S
Ben-Shaul,I.,Shwartz-Ziv,R.,Galanti,T.,Dekel,S.,LeCun,Y.:ReverseEngineer- ing Self-Supervised Learning. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. (eds.) Proc. International Conference on Neural Information Processing Systems (NeurIPS) (2023). https://doi.org/10.48550/arXiv.2305.15614
-
[5]
Transactions on Machine Learning Research2023(2023)
Bordes, F., Balestriero, R., Garrido, Q., Bardes, A., Vincent, P.: Guillotine Reg- ularization: Why removing layers is needed to improve generalization in Self- Supervised Learning. Transactions on Machine Learning Research2023(2023). https://doi.org/10.48550/arXiv.2206.13378
-
[6]
In: Advances in Neural Informa- tion Processing Systems
Bromley, J., Guyon, I., LeCun, Y., Säckinger, E., Shah, R.: Signature Verification using a ”Siamese” Time Delay Neural Network. In: Advances in Neural Informa- tion Processing Systems. vol. 6 (1993), https://dl.acm.org/doi/10.5555/2987189. 2987282
-
[7]
Complex Systems2(3) (1988), https://www.complex-systems.com/ abstracts/v02_i03_a05/, accessed: 2026-06-02
Broomhead, D.S., Lowe, D.: Multivariable Functional Interpolation and Adap- tive Networks. Complex Systems2(3) (1988), https://www.complex-systems.com/ abstracts/v02_i03_a05/, accessed: 2026-06-02
1988
-
[8]
Caron, M., Misra, I., Mairal, J., Goyal, P., Bojanowski, P., Joulin, A.: Unsuper- vised Learning of Visual Features by Contrasting Cluster Assignments. In: Proc. International Conference on Neural Information Processing Systems. NIPS ’20, Curran Associates Inc., Red Hook, NY, USA (2020), https://dl.acm.org/doi/10. 5555/3495724.3496555
arXiv 2020
-
[9]
2021 IEEE/CVF International Conference on Computer Vision (ICCV) , author=
Caron, M., Touvron, H., Misra, I., Jégou, H., Mairal, J., Bojanowski, P., Joulin, A.: Emerging Properties in Self-Supervised Vision Transformers. In: 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Mon- treal, QC, Canada, October 10-17, 2021. pp. 9630–9640. IEEE (2021). https: //doi.org/10.1109/ICCV48922.2021.00951
-
[10]
Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A Simple Framework for Con- trastive Learning of Visual Representations. In: Proceedings of the 37th Inter- national Conference on Machine Learning. ICML’20, JMLR.org (2020), https: //dl.acm.org/doi/10.5555/3524938.3525087
-
[11]
Chen, T., Kornblith, S., Swersky, K., Norouzi, M., Hinton, G.: Big Self-Supervised Models are Strong Semi-Supervised Learners. In: Proc. International Conference on Neural Information Processing Systems. NIPS ’20, Curran Associates Inc., Red Hook, NY, USA (2020), https://dl.acm.org/doi/10.5555/3495724.3497589
-
[12]
Chen, X., Fan, H., Girshick, R.B., He, K.: Improved Baselines with Momentum Contrastive Learning. CoRRabs/2003.04297(2020). https://doi.org/10.48550/ arXiv.2003.04297 18 A. Schliebitz et al
Pith/arXiv arXiv 2003
-
[13]
SurFree: a fast surrogate-free black-box attack,
Chen, X., He, K.: Exploring Simple Siamese Representation Learning. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 15745–15753 (2021). https://doi.org/10.1109/CVPR46437.2021.01549
-
[14]
Chun-Hsiao Yeh, Y.C.: IN100pytorch: PyTorch Implementation: Training ResNets on ImageNet-100 (2022), https://github.com/danielchyeh/ImageNet-100-Pytorch, accessed: 2026-06-02
2022
-
[15]
Cubuk, E.D., Zoph, B., Mané, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learn- ing Augmentation Strategies From Data. In: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 113–123 (2019). https: //doi.org/10.1109/CVPR.2019.00020
-
[16]
Cubuk, E.D., Zoph, B., Shlens, J., Le, Q.V.: RandAugment: Practical automated dataaugmentationwithareducedsearchspace.In:2020IEEE/CVFConferenceon Computer Vision and Pattern Recognition Workshops (CVPRW). pp. 3008–3017 (2020). https://doi.org/10.1109/CVPRW50498.2020.00359
-
[17]
Deep Residual Learning for Image Recognition
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: A Large- Scale Hierarchical Image Database. In: Proc. IEEE Conference on Computer Vi- sion and Pattern Recognition. pp. 248–255 (2009). https://doi.org/10.1109/CVPR. 2009.5206848
-
[18]
SurFree: a fast surrogate-free black-box attack,
Deng, W., Zheng, L.: Are Labels Always Necessary for Classifier Accuracy Eval- uation? In: Proc. IEEE Conference on Computer Vision and Pattern Recog- nition, CVPR. pp. 15069–15078. Computer Vision Foundation / IEEE (2021). https://doi.org/10.1109/CVPR46437.2021.01482
-
[19]
Garrido, Q., Balestriero, R., Najman, L., LeCun, Y.: RankMe: Assessing the Down- stream Performance of Pretrained Self-supervised Representations by Their Rank. In: Proc. International Conference on Machine Learning. ICML’23, JMLR.org (2023), http://dl.acm.org/doi/10.5555/3618408.3618848
-
[20]
Google LLC: Open Images V7 (2022), https://storage.googleapis.com/ openimages/web/factsfigures_v7.html, accessed: 2026-06-02
2022
-
[21]
Grill, J.B., Strub, F., Altché, F., Tallec, C., Richemond, P.H., Buchatskaya, E., Doersch,C.,Pires,B.A.,Guo,Z.D.,Azar,M.G.,Piot,B.,Kavukcuoglu,K.,Munos, R., Valko, M.: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. In: Proc. International Conference on Neural Information Processing Systems. NIPS ’20, Curran Associates Inc., Red Ho...
-
[22]
Gupta, K., Ajanthan, T., van den Hengel, A., Gould, S.: Understanding and Improving the Role of Projection Head in Self-Supervised Learning. CoRR abs/2212.11491(2022). https://doi.org/10.48550/arXiv.2212.11491
-
[23]
In: IEEE/CVF Conference on Computer Vision and Pattern Recognition
He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
-
[24]
Hendrycks, D., Mu, N., Cubuk, E.D., Zoph, B., Gilmer, J., Lakshminarayanan, B.: AugMix: A Simple Data Processing Method to Improve Robustness and Un- certainty. In: Proc. International Conference on Learning Representations, ICLR. OpenReview.net (2020). https://doi.org/10.48550/arXiv.1912.02781
-
[25]
IEEE Transactions on Neural Networks4(1), 156–159 (1993)
Jang, J.S., Sun, C.T.: Functional Equivalence Between Radial Basis Function Net- works and Fuzzy Inference Systems. IEEE Transactions on Neural Networks4(1), 156–159 (1993). https://doi.org/10.1109/72.182710
-
[26]
Jing, L., Vincent, P., LeCun, Y., Tian, Y.: Understanding Dimensional Collapse in Contrastive Self-supervised Learning. In: Proc. International Conference on Learn- ing Representations, ICLR. OpenReview.net (2022). https://doi.org/10.48550/ arXiv.2110.09348 RBFNs as Projection Heads in Self-Supervised Learning 19
arXiv 2022
-
[27]
Johnson, D.D., Hanchi, A.E., Maddison, C.J.: Contrastive Learning Can Find An Optimal Basis For Approximately View-Invariant Functions. In: Proc. Interna- tional Conference on Learning Representations, ICLR. OpenReview.net (2023). https://doi.org/10.48550/arXiv.2210.01883
-
[28]
Kornblith, S., Shlens, J., Le, Q.V.: Do Better ImageNet Models Transfer Better? In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2656–2666 (2019). https://doi.org/10.1109/CVPR.2019.00277
-
[29]
Lee, J., Bahri, Y., Novak, R., Schoenholz, S.S., Pennington, J., Sohl-Dickstein, J.: Deep Neural Networks as Gaussian Processes. In: Proc. International Conference on Learning Representations, ICLR. OpenReview.net (2018). https://doi.org/10. 48550/arXiv.1711.00165
Pith/arXiv arXiv 2018
-
[30]
Li, Y., Pogodin, R., Sutherland, D.J., Gretton, A.: Self-Supervised Learning with Kernel Dependence Maximization. In: Proc. International Conference on Neural Information Processing Systems. NIPS ’21, Curran Associates Inc., Red Hook, NY, USA (2021), http://dl.acm.org/doi/10.5555/3540261.3541451
-
[31]
Lundberg, S.M., Lee, S.I.: A Unified Approach to Interpreting Model Predictions. In: Proc. International Conference on Neural Information Processing Systems. p. 4768–4777. NIPS’17, Curran Associates Inc., Red Hook, NY, USA (2017), https: //dl.acm.org/doi/10.5555/3295222.3295230
-
[32]
Ma, J., Hu, T., Wang, W.: Deciphering the Projection Head: Representation Eval- uation Self-supervised Learning. In: Proc. International Joint Conference on Arti- ficial Intelligence. IJCAI ’24 (2024). https://doi.org/10.24963/ijcai.2024/522
-
[33]
Constructive Approximation2(1), 11–22 (1986)
Micchelli, C.A.: Interpolation of Scattered Data: Distance Matrices and Condition- ally Positive Definite Functions. Constructive Approximation2(1), 11–22 (1986). https://doi.org/10.1007/BF01893414
-
[34]
Nandam, S.R., Atito, S., Feng, Z., Kittler, J., Awais, M.: Investigating Self- Supervised Methods for Label-Efficient Learning. Int. J. Comput. Vision133(7), 4522–4537 (Mar 2025). https://doi.org/10.1007/s11263-025-02397-4
-
[35]
van den Oord, A., Li, Y., Vinyals, O.: Representation Learning with Contrastive Predictive Coding. CoRRabs/1807.03748(2018). https://doi.org/10.48550/ arXiv.1807.03748
Pith/arXiv arXiv 2018
-
[36]
Orr, M.J.L.: Introduction to Radial Basis Function Networks. Tech. rep., Centre for Cognitive Science, University of Edinburgh (April 1996), https://faculty.cc.gatech. edu/~isbell/tutorials/rbf-intro.pdf, accessed: 2026-06-05
1996
-
[37]
Neural Computation3(2), 246–257 (06 1991)
Park, J., Sandberg, I.W.: Universal Approximation Using Radial-Basis-Function Networks. Neural Computation3(2), 246–257 (06 1991). https://doi.org/10.1162/ neco.1991.3.2.246
1991
-
[38]
Parulekar, A., Collins, L., Shanmugam, K., Mokhtari, A., Shakkottai, S.: InfoNCE Loss Provably Learns Cluster-Preserving Representations. In: Neu, G., Rosasco, L. (eds.) Proc. International Conference on Learning Theory. Proceedings of Machine Learning Research, vol. 195, pp. 1914–1961. PMLR (12–15 Jul 2023). https://doi. org/10.48550/arXiv.2302.07920
-
[39]
Note on regression and inheritance in the case of two parents , author =
Pearson, K.: VII. Note on Regression and Inheritance in the Case of Two Parents. Proceedings of the Royal Society of London58(347-352), 240–242 (12 1895). https: //doi.org/10.1098/rspl.1895.0041
-
[40]
Schliebitz et al
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet Large Scale Visual Recognition Challenge 2013 (ILSVRC2013) (2013), https: //www.image-net.org/challenges/LSVRC/2013/index, accessed: 2026-06-02 20 A. Schliebitz et al
2013
-
[41]
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV)115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y
-
[42]
Russo, A.: Pytorch RBF Layer (2021), https://github.com/rssalessio/ PytorchRBFLayer, accessed: 2026-06-02
2021
-
[43]
Schliebitz, A., Tapken, H., Atzmueller, M.: The OpenImagesV7-100 Dataset (Dec 2025), https://github.com/andreas-schliebitz/open-images-v7-100, accessed: 2026-06-02
2025
-
[44]
Song, Z., Su, X., Wang, J., Qiang, W., Zheng, C., Sun, F.: Towards the Sparseness of Projection Head in Self-Supervised Learning. CoRRabs/2307.08913(2023). https://doi.org/10.48550/arXiv.2307.08913
-
[45]
The American Journal of Psychology15(1), 72–101 (1904), http://www.jstor.org/ stable/1412159
Spearman, C.: The Proof and Measurement of Association between Two Things. The American Journal of Psychology15(1), 72–101 (1904), http://www.jstor.org/ stable/1412159
arXiv 1904
-
[46]
Susmelj, I., Heller, M., Wirth, P., Prescott, J., Ebner, M., et al.: Lightly, https: //github.com/lightly-ai/lightly, accessed: 2026-06-02
2026
-
[47]
Thilak, V., Huang, C., Saremi, O., Dinh, L., Goh, H., Nakkiran, P., Susskind, J.M., Littwin, E.: LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures. In: Proc. International Conference on Learning Representations, ICLR. OpenReview.net (2024). https://doi.org/10.48550/arXiv.2312.04000
-
[48]
Tian, Y., Krishnan, D., Isola, P.: Contrastive Multiview Coding. In: Proc Euro- peanConferenceonComputerVision(ECCV).p.776–794.Springer-Verlag,Berlin, Heidelberg (2020). https://doi.org/10.1007/978-3-030-58621-8_45
-
[49]
TorchVision maintainers and contributors: TorchVision: PyTorch’s Computer Vi- sion library (Nov 2016), https://github.com/pytorch/vision, accessed: 2026-06-02
2016
-
[50]
Wang, T., Isola, P.: Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere. In: Proc. International Confer- ence on Machine Learning. ICML’20, JMLR.org (2020), https://dl.acm.org/doi/ 10.5555/3524938.3525859
-
[51]
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep Kernel Learning. In: Gretton, A., Robert, C.C. (eds.) Proc. International Conference on Artificial In- telligence and Statistics. vol. 51, pp. 370–378. PMLR, Cadiz, Spain (09–11 May 2016). https://doi.org/10.48550/arXiv.1511.02222
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1511.02222 2016
-
[52]
Artificial Intelligence Review59(4), 112 (Feb 2026)
Wittscher, L.: A survey on design choices for self-supervised learning in computer vision. Artificial Intelligence Review59(4), 112 (Feb 2026). https://doi.org/10. 1007/s10462-026-11506-9
2026
-
[53]
Xue, Y., Gan, E., Ni, J., Joshi, S., Mirzasoleiman, B.: Investigating the Benefits of Projection Head for Representation Learning. In: Proc. International Confer- ence on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net (2024). https://doi.org/10.48550/arXiv.2403.11391
-
[54]
Zadeh, P.H., Hosseini, R., Sra, S.: Deep-RBF Networks Revisited: Robust Classifi- cation with Rejection. CoRRabs/1812.03190(2018). https://doi.org/10.48550/ arXiv.1812.03190
Pith/arXiv arXiv 2018
-
[55]
Proceedings of Machine Learning Research, vol
Zbontar, J., Jing, L., Misra, I., LeCun, Y., Deny, S.: Barlow Twins: Self-Supervised LearningviaRedundancyReduction.In:Meila,M.,Zhang,T.(eds.)Proceedingsof the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 139, pp. 12310–12320. PMLR (18–24 Jul 2021). https: //doi.org/10.48550/arXiv.2103.03230
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.