{"paper":{"title":"Deep Learning as Neural Low-Degree Filtering: A Spectral Theory of Hierarchical Feature Learning","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Neural Low-Degree Filtering models deep learning as an explicit iterative spectral process in which each layer selects features by maximal low-degree correlation to the label.","cross_cats":["cond-mat.dis-nn","stat.ML"],"primary_cat":"cs.LG","authors_text":"Florent Krzakala, Hugo Tabanelli, Luca Arnaboldi, Matteo Vilucchio, Yatin Dandi","submitted_at":"2026-05-13T14:44:06Z","abstract_excerpt":"Understanding how deep neural networks learn useful internal representations from data remains a central open problem in the theory of deep learning. We introduce Neural Low-Degree Filtering (Neural LoFi), a stylized limit of gradient-based training in which hierarchical feature learning becomes an explicit iterative spectral procedure. In this limit, the dynamics at each layer decouple: given the current representation, the next layer selects directions with maximal accessible low-degree correlation to the label. This yields a tractable surrogate mechanism for deep learning, together with a n"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Neural LoFi provides a mathematically explicit framework for studying multi-layer feature learning beyond the lazy regime. 
It predicts how representations are selected layer by layer, explains how emergence of concepts arises with given sample complexity, and gives a concrete mechanism by which depth progressively constructs new features from old ones through low-degree compositionality.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The assumption that, in the stylized limit of gradient-based training, the dynamics at each layer decouple so that the next layer can independently select directions with maximal accessible low-degree correlation to the label.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Neural LoFi models deep learning as layer-wise spectral filtering that selects maximal low-degree correlations, yielding a tractable surrogate for hierarchical representation learning beyond the lazy regime.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Neural Low-Degree Filtering models deep learning as an explicit iterative spectral process in which each layer selects features by maximal low-degree correlation to the label.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"98eee99d1edb500da12c14d05e1dfb95a62ef266eef7b6320f1955424f3909fb"},"source":{"id":"2605.13612","kind":"arxiv","version":1},"verdict":{"id":"da530274-bb01-41b7-a323-694891d498b7","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-14T19:35:40.881672Z","strongest_claim":"Neural LoFi provides a mathematically explicit framework for studying multi-layer feature learning beyond the lazy regime. 
It predicts how representations are selected layer by layer, explains how emergence of concepts arises with given sample complexity, and gives a concrete mechanism by which depth progressively constructs new features from old ones through low-degree compositionality.","one_line_summary":"Neural LoFi models deep learning as layer-wise spectral filtering that selects maximal low-degree correlations, yielding a tractable surrogate for hierarchical representation learning beyond the lazy regime.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The assumption that, in the stylized limit of gradient-based training, the dynamics at each layer decouple so that the next layer can independently select directions with maximal accessible low-degree correlation to the label.","pith_extraction_headline":"Neural Low-Degree Filtering models deep learning as an explicit iterative spectral process in which each layer selects features by maximal low-degree correlation to the label."},"references":{"count":123,"sample":[{"doi":"","year":2015,"title":"Deep learning. Nature, 521(7553):436–444, 2015","work_id":"8c42ff53-c495-4b0d-8fa1-03b2d8f9af31","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2020,"title":"The unreasonable effectiveness of deep learning in artificial intelligence","work_id":"36377418-d23c-43fe-973c-011bd3d00571","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2014,"title":"Visualizing and understanding convolutional networks","work_id":"60453a30-3d6f-49ac-a8e6-80fb37b14549","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2014,"title":"How transferable are features in deep neural networks? Advances in Neural Information Processing Systems, 27","work_id":"827b6bdf-60c2-460c-bcc4-a153256c22ad","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2024,"title":"The Platonic Representation 
Hypothesis","work_id":"950baa06-36a7-4010-a959-7304fb1ce08b","ref_index":5,"cited_arxiv_id":"2405.07987","is_internal_anchor":false}],"resolved_work":123,"snapshot_sha256":"47144f6f9da2afba6d4424581cdb272f38fae383f0e6efd7a4ea4e2735efa856","internal_anchors":1},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}