Statistical mechanics of sparse generalization and model selection

Alejandro Lage-Castellanos; Andrea Pagnani; Martin Weigt

arXiv: 0907.3241 · v1 · pith:VU4EKYV6new · submitted 2009-07-18 · cond-mat.dis-nn · cond-mat.stat-mech · physics.data-an

keywords: dilution, model, approach, frequently, generalization, implement, learning, naive

One of the crucial tasks in many inference problems is the extraction of sparse information out of a given number of high-dimensional measurements. In machine learning, this is frequently achieved using, as a penalty term, the $L_p$ norm of the model parameters, with $p\leq 1$ for efficient dilution. Here we propose a statistical-mechanics analysis of the problem in the setting of perceptron memorization and generalization. Using a replica approach, we are able to evaluate the relative performance of naive dilution (obtained by learning without dilution, followed by applying a threshold to the model parameters), $L_1$ dilution (which is frequently used in convex optimization) and $L_0$ dilution (which is optimal but computationally hard to implement). Whereas both $L_p$ diluted approaches clearly outperform the naive approach, we find a small region where $L_0$ works almost perfectly and strongly outperforms the simpler-to-implement $L_1$ dilution.
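
To make the distinction between naive dilution and $L_1$ dilution concrete, here is a minimal numerical sketch. It is not the replica calculation of the paper. The logistic loss, the proximal-gradient solver, the "keep the k largest weights" threshold rule, the problem sizes, and all function names are assumptions chosen for illustration only. $L_0$ dilution is left out because, as the abstract notes, it is computationally hard to implement.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1: shrink every entry toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def logistic_grad(w, X, y):
    """Gradient of the mean logistic loss log(1 + exp(-y * (X @ w)))."""
    margins = y * (X @ w)
    # 1 / (1 + exp(m)) written as 0.5 * (1 - tanh(m / 2)) for numerical stability
    coeff = -y * 0.5 * (1.0 - np.tanh(0.5 * margins))
    return X.T @ coeff / len(y)

def fit(X, y, l1=0.0, lr=0.1, steps=500):
    """Proximal gradient descent; l1=0 reduces to learning without dilution."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = soft_threshold(w - lr * logistic_grad(w, X, y), lr * l1)
    return w

def naive_dilution(w, k):
    """Keep the k largest-magnitude weights and set all others to zero."""
    pruned = np.zeros_like(w)
    if k <= 0:
        return pruned
    keep = np.argsort(np.abs(w))[-k:]
    pruned[keep] = w[keep]
    return pruned

# Hypothetical toy problem: a sparse "teacher" perceptron labels random inputs.
rng = np.random.default_rng(0)
n_features, n_samples, n_active = 50, 200, 5
teacher = np.zeros(n_features)
teacher[:n_active] = 1.0
X = rng.standard_normal((n_samples, n_features))
y = np.sign(X @ teacher)

# Naive dilution: learn without any penalty, then threshold afterwards.
w_naive = naive_dilution(fit(X, y), n_active)
# L_1 dilution: the penalty acts during learning (strength 0.05 is arbitrary).
w_l1 = fit(X, y, l1=0.05)
```

The two resulting weight vectors can then be compared with `teacher`, for example through their supports or their overlap with it. The outcome depends on the penalty strength, the sample size, and the random seed, so this sketch illustrates the two procedures rather than reproducing the paper's quantitative results.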
