
arxiv: 2307.13962 · v1 · pith:3IGG5FHGnew · submitted 2023-07-26 · 💻 cs.LG · cs.AI

Understanding Deep Neural Networks via Linear Separability of Hidden Layers

classification 💻 cs.LG cs.AI
keywords linear · separability · hidden · network · deep · degree · layer · networks

In this paper, we measure the linear separability of hidden layer outputs to study the characteristics of deep neural networks. In particular, we first propose Minkowski-difference-based linear separability measures (MD-LSMs) to evaluate the degree of linear separability of two point sets. Then, we demonstrate that there is a synchronicity between the linear separability degree of hidden layer outputs and the network training performance, i.e., if the updated weights enhance the linear separability degree of hidden layer outputs, the updated network achieves better training performance, and vice versa. Moreover, we study the effect of the activation function and the network size (including width and depth) on the linear separability of hidden layers. Finally, we conduct numerical experiments to validate our findings on some popular deep networks, including multilayer perceptron (MLP), convolutional neural network (CNN), deep belief network (DBN), ResNet, VGGNet, AlexNet, vision transformer (ViT) and GoogLeNet.
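
To make the Minkowski-difference idea concrete, here is a minimal Python sketch. It is not the paper's exact MD-LSM definition, which the abstract does not spell out. It relies only on a standard fact: two finite point sets A and B are strictly linearly separable exactly when some direction w satisfies w·(a − b) > 0 for every pair (a, b). The function names, the mean-difference default for w, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def minkowski_difference(a, b):
    """Return all pairwise differences a_i - b_j, shape (len(a) * len(b), dim)."""
    return (a[:, None, :] - b[None, :, :]).reshape(-1, a.shape[1])


def separability_score(a, b, w=None):
    """Fraction of Minkowski-difference vectors strictly on the positive side of w.

    A score of 1.0 means the hyperplane direction w separates a from b.
    The default w (difference of class means) is a heuristic assumption,
    not an optimized direction and not the paper's procedure.
    """
    if w is None:
        w = a.mean(axis=0) - b.mean(axis=0)
    diffs = minkowski_difference(a, b)
    return float(np.mean(diffs @ w > 0))


# Two well-separated clusters: every difference vector points the same way.
separable_a = np.array([[2.0, 2.0], [3.0, 2.5], [2.5, 3.0]])
separable_b = np.array([[-2.0, -2.0], [-3.0, -2.5], [-2.5, -3.0]])
score_separable = separability_score(separable_a, separable_b)

# XOR-style layout: no single direction puts all differences on one side.
xor_a = np.array([[1.0, 1.0], [-1.0, -1.0]])
xor_b = np.array([[1.0, -1.0], [-1.0, 1.0]])
score_xor = separability_score(xor_a, xor_b, w=np.array([1.0, 1.0]))

print(score_separable, score_xor)
```

In practice one would apply such a score to the outputs of a hidden layer for two classes and search over w rather than fix it; the fixed directions here only keep the sketch short.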



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Pretraining Objective Matters in Extreme Low-Data FGVC: A Backbone-Controlled Study

    cs.CV · 2026-05 · unverdicted · novelty 4.0

    Supervised and contrastive pretraining yield stronger linear separability than masked reconstruction or self-distillation on a three-class emerald grading task, with reconstruction improving under nonlinear probes.