{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2020:Y4U6JS2SCK7JI5UFDPZPUQJZBV","short_pith_number":"pith:Y4U6JS2S","schema_version":"1.0","canonical_sha256":"c729e4cb5212be9476851bf2fa41390d6396ba71a78601f5e49c5282ead7108e","source":{"kind":"arxiv","id":"2008.02217","version":3},"attestation_state":"computed","paper":{"title":"Hopfield Networks is All You Need","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"A modern Hopfield network with continuous states has an update rule identical to the attention mechanism in transformers.","cross_cats":["cs.CL","cs.LG","stat.ML"],"primary_cat":"cs.NE","authors_text":"Bernhard Sch\\\"afl, David Kreil, Geir Kjetil Sandve, G\\\"unter Klambauer, Hubert Ramsauer, Johannes Brandstetter, Johannes Lehner, Lukas Gruber, Markus Holzleitner, Michael Kopp, Michael Widrich, Milena Pavlovi\\'c, Philipp Seidl, Sepp Hochreiter, Thomas Adler, Victor Greiff","submitted_at":"2020-07-16T17:52:37Z","abstract_excerpt":"We introduce a modern Hopfield network with continuous states and a corresponding update rule. The new Hopfield network can store exponentially (with the dimension of the associative space) many patterns, retrieves the pattern with one update, and has exponentially small retrieval errors. It has three types of energy minima (fixed points of the update): (1) global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points which store a single pattern. The new update rule is equivalent to the attention mechanism used in transformers."},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":true},"canonical_record":{"source":{"id":"2008.02217","kind":"arxiv","version":3},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.NE","submitted_at":"2020-07-16T17:52:37Z","cross_cats_sorted":["cs.CL","cs.LG","stat.ML"],"title_canon_sha256":"aa4c7031940ca65cfd28ae9a8ea533d9d267004f60a6c8e30a5d154df26fd115","abstract_canon_sha256":"1ba47ba7d21390be306b536cbf58dc0b7573200d24913aacb8f369457c24a29d"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:38:14.916504Z","signature_b64":"30DQMrpkR+M9K4/VQpJn5Ec2+9LM7D59CbzmOeogkhi25smE139MIb5IvEaR8dg4kPx5QOKiVLt7jYKKRajkDg==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"c729e4cb5212be9476851bf2fa41390d6396ba71a78601f5e49c5282ead7108e","last_reissued_at":"2026-05-17T23:38:14.915710Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:38:14.915710Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"Hopfield Networks is All You Need","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"A modern Hopfield network with continuous states has an update rule identical to the attention mechanism in transformers.","cross_cats":["cs.CL","cs.LG","stat.ML"],"primary_cat":"cs.NE","authors_text":"Bernhard Sch\\\"afl, David Kreil, Geir Kjetil Sandve, G\\\"unter Klambauer, Hubert Ramsauer, Johannes Brandstetter, Johannes Lehner, Lukas Gruber, Markus Holzleitner, Michael Kopp, Michael Widrich, Milena Pavlovi\\'c, Philipp Seidl, Sepp Hochreiter, Thomas Adler, Victor Greiff","submitted_at":"2020-07-16T17:52:37Z","abstract_excerpt":"We introduce a modern Hopfield network with continuous states and a corresponding update rule. The new Hopfield network can store exponentially (with the dimension of the associative space) many patterns, retrieves the pattern with one update, and has exponentially small retrieval errors. It has three types of energy minima (fixed points of the update): (1) global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points which store a single pattern. The new update rule is equivalent to the attention mechanism used in transformers."},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"The new update rule is equivalent to the attention mechanism used in transformers. This equivalence enables a characterization of the heads of transformer models.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the continuous-state Hopfield dynamics remain stable and useful when inserted as layers inside large-scale gradient-trained networks without introducing new optimization difficulties or losing the claimed exponential capacity.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Modern Hopfield networks store exponentially many patterns, retrieve them in one update, and have an update rule equivalent to transformer attention, enabling new Hopfield layers that improve results on multiple instance learning and drug design tasks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"A modern Hopfield network with continuous states has an update rule identical to the attention mechanism in transformers.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"df5a9d04f451a58029b1e9fd0662399c2464ad204daa43fb69b8a542cc7fe7e2"},"source":{"id":"2008.02217","kind":"arxiv","version":3},"verdict":{"id":"312fcfdc-3e37-41eb-8fa8-490232bda227","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-17T05:42:06.521473Z","strongest_claim":"The new update rule is equivalent to the attention mechanism used in transformers. This equivalence enables a characterization of the heads of transformer models.","one_line_summary":"Modern Hopfield networks store exponentially many patterns, retrieve them in one update, and have an update rule equivalent to transformer attention, enabling new Hopfield layers that improve results on multiple instance learning and drug design tasks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the continuous-state Hopfield dynamics remain stable and useful when inserted as layers inside large-scale gradient-trained networks without introducing new optimization difficulties or losing the claimed exponential capacity.","pith_extraction_headline":"A modern Hopfield network with continuous states has an update rule identical to the attention mechanism in transformers."},"references":{"count":300,"sample":[{"doi":"","year":null,"title":"Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics , series =","work_id":"8891cd09-c3eb-4a49-a352-fb2a2a140b86","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"van den Oord and Y","work_id":"eed854c9-3f89-4de0-83eb-7a3c3ce42be0","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"A Simple Framework for Contrastive Learning of Visual Representations , author=. CoRR , volume=","work_id":"126ebef3-deef-4d5f-a06d-39ad99864787","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Advances in Neural Information Processing Systems , pages=","work_id":"6728043f-811c-4eaf-a145-c1e9aa5983e6","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Layer normalization , author=. ArXiv , eprint=","work_id":"36e95ebd-db7d-44bd-ba96-957c6b972c28","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":300,"snapshot_sha256":"df35e7cb11a16074e15833902468eae4b86d2ec7e4dd0c82b86113fd97744a4d","internal_anchors":9},"formal_canon":{"evidence_count":3,"snapshot_sha256":"4f43c03e7ad6d31f44a8ede4cc7a0fae6baaee1ea8543a589d0d3ab30ccfd4f5"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2008.02217","created_at":"2026-05-17T23:38:14.915835+00:00"},{"alias_kind":"arxiv_version","alias_value":"2008.02217v3","created_at":"2026-05-17T23:38:14.915835+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2008.02217","created_at":"2026-05-17T23:38:14.915835+00:00"},{"alias_kind":"pith_short_12","alias_value":"Y4U6JS2SCK7J","created_at":"2026-05-18T12:33:33.725879+00:00"},{"alias_kind":"pith_short_16","alias_value":"Y4U6JS2SCK7JI5UF","created_at":"2026-05-18T12:33:33.725879+00:00"},{"alias_kind":"pith_short_8","alias_value":"Y4U6JS2S","created_at":"2026-05-18T12:33:33.725879+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":23,"internal_anchor_count":23,"sample":[{"citing_arxiv_id":"2509.04154","citing_title":"Robust Filter Attention: Self-Attention as Precision-Weighted State Estimation","ref_index":70,"is_internal_anchor":true},{"citing_arxiv_id":"2510.27258","citing_title":"Higher-order Linear Attention","ref_index":12,"is_internal_anchor":true},{"citing_arxiv_id":"2511.01202","citing_title":"Forget BIT, It is All about TOKEN: Towards Semantic Information Theory for LLMs","ref_index":98,"is_internal_anchor":true},{"citing_arxiv_id":"2511.13053","citing_title":"Self-Organization and Spectral Mechanism of Attractor Landscapes in High-Capacity Kernel Hopfield Networks","ref_index":4,"is_internal_anchor":true},{"citing_arxiv_id":"2509.26645","citing_title":"TTT3R: 3D Reconstruction as Test-Time Training","ref_index":60,"is_internal_anchor":true},{"citing_arxiv_id":"2512.14400","citing_title":"GRAFT: Grid-Aware Load Forecasting with Multi-Source Textual Alignment and Fusion","ref_index":27,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12836","citing_title":"Discrete Stochastic Localization for Non-autoregressive Generation","ref_index":20,"is_internal_anchor":true},{"citing_arxiv_id":"2604.02789","citing_title":"Dense Associative Memory with biased patterns: a Replica Symmetric analysis","ref_index":27,"is_internal_anchor":true},{"citing_arxiv_id":"2605.10970","citing_title":"Context-Gated Associative Retrieval: From Theory to Transformers","ref_index":29,"is_internal_anchor":true},{"citing_arxiv_id":"2604.25481","citing_title":"Emergent Self-Attention from Astrocyte-Gated Associative Memory Dynamics","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2604.26082","citing_title":"How is gene-regulatory evolution affected by cell-to-cell variability?","ref_index":56,"is_internal_anchor":true},{"citing_arxiv_id":"2604.15113","citing_title":"HyperSpace: A Generalized Framework for Spatial Encoding in Hyperdimensional Representations","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2605.08143","citing_title":"HoReN: Normalized Hopfield Retrieval for Large-Scale Sequential Model Editing","ref_index":25,"is_internal_anchor":true},{"citing_arxiv_id":"2604.22992","citing_title":"Efficient Image Annotation via Semi-Supervised Object Segmentation with Label Propagation","ref_index":17,"is_internal_anchor":true},{"citing_arxiv_id":"2605.05189","citing_title":"Sharp Capacity Thresholds in Linear Associative Memory: From Winner-Take-All to Listwise Retrieval","ref_index":11,"is_internal_anchor":true},{"citing_arxiv_id":"2604.12811","citing_title":"Algorithmic Analysis of Dense Associative Memory: Finite-Size Guarantees and Adversarial Robustness","ref_index":7,"is_internal_anchor":true},{"citing_arxiv_id":"2604.08150","citing_title":"FlowEqProp: Training Flow Matching Generative Models with Gradient Equilibrium Propagation","ref_index":10,"is_internal_anchor":true},{"citing_arxiv_id":"2604.08204","citing_title":"Introducing Echo Networks for Computational Neuroevolution","ref_index":12,"is_internal_anchor":true},{"citing_arxiv_id":"2604.05042","citing_title":"Energy-Based Dynamical Models for Neurocomputation, Learning, and Optimization","ref_index":37,"is_internal_anchor":true},{"citing_arxiv_id":"2604.04514","citing_title":"SuperLocalMemory V3.3: The Living Brain -- Biologically-Inspired Forgetting, Cognitive Quantization, and Multi-Channel Retrieval for Zero-LLM Agent Memory Systems","ref_index":21,"is_internal_anchor":true},{"citing_arxiv_id":"2604.15113","citing_title":"HyperSpace: A Generalized Framework for Spatial Encoding in Hyperdimensional Representations","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2604.15121","citing_title":"SRMU: Relevance-Gated Updates for Streaming Hyperdimensional Memories","ref_index":15,"is_internal_anchor":true},{"citing_arxiv_id":"2604.16839","citing_title":"HeLa-Mem: Hebbian Learning and Associative Memory for LLM Agents","ref_index":4,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":3,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/Y4U6JS2SCK7JI5UFDPZPUQJZBV","json":"https://pith.science/pith/Y4U6JS2SCK7JI5UFDPZPUQJZBV.json","graph_json":"https://pith.science/api/pith-number/Y4U6JS2SCK7JI5UFDPZPUQJZBV/graph.json","events_json":"https://pith.science/api/pith-number/Y4U6JS2SCK7JI5UFDPZPUQJZBV/events.json","paper":"https://pith.science/paper/Y4U6JS2S"},"agent_actions":{"view_html":"https://pith.science/pith/Y4U6JS2SCK7JI5UFDPZPUQJZBV","download_json":"https://pith.science/pith/Y4U6JS2SCK7JI5UFDPZPUQJZBV.json","view_paper":"https://pith.science/paper/Y4U6JS2S","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2008.02217&json=true","fetch_graph":"https://pith.science/api/pith-number/Y4U6JS2SCK7JI5UFDPZPUQJZBV/graph.json","fetch_events":"https://pith.science/api/pith-number/Y4U6JS2SCK7JI5UFDPZPUQJZBV/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/Y4U6JS2SCK7JI5UFDPZPUQJZBV/action/timestamp_anchor","attest_storage":"https://pith.science/pith/Y4U6JS2SCK7JI5UFDPZPUQJZBV/action/storage_attestation","attest_author":"https://pith.science/pith/Y4U6JS2SCK7JI5UFDPZPUQJZBV/action/author_attestation","sign_citation":"https://pith.science/pith/Y4U6JS2SCK7JI5UFDPZPUQJZBV/action/citation_signature","submit_replication":"https://pith.science/pith/Y4U6JS2SCK7JI5UFDPZPUQJZBV/action/replication_record"}},"created_at":"2026-05-17T23:38:14.915835+00:00","updated_at":"2026-05-17T23:38:14.915835+00:00"}