{"work":{"id":"fdd07939-8060-4e9e-bd06-ff125eae86ef","openalex_id":null,"doi":null,"arxiv_id":"2304.05969","raw_key":null,"title":"Localizing Model Behavior with Path Patching","authors":null,"authors_text":"Nicholas Goldowsky-Dill, Chris MacLeod, Lucas Sato, Aryaman Arora","year":2023,"venue":"cs.LG","abstract":"Localizing behaviors of neural networks to a subset of the network's components or a subset of interactions between components is a natural first step towards analyzing network mechanisms and possible failure modes. Existing work is often qualitative and ad-hoc, and there is no consensus on the appropriate way to evaluate localization claims. We introduce path patching, a technique for expressing and quantitatively testing a natural class of hypotheses expressing that behaviors are localized to a set of paths. We refine an explanation of induction heads, characterize a behavior of GPT-2, and open source a framework for efficiently running similar experiments.","external_url":"https://arxiv.org/abs/2304.05969","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-05-22T07:44:42.748639+00:00","pith_arxiv_id":"2304.05969","created_at":"2026-05-09T06:25:45.378771+00:00","updated_at":"2026-05-22T07:44:42.748639+00:00","title_quality_ok":true,"display_title":"Localizing Model Behavior with Path Patching","render_title":"Localizing Model Behavior with Path Patching"},"hub":{"state":{"work_id":"fdd07939-8060-4e9e-bd06-ff125eae86ef","tier":"hub","tier_reason":"10+ Pith inbound or 1,000+ external citations","pith_inbound_count":24,"external_cited_by_count":null,"distinct_field_count":5,"first_pith_cited_at":"2023-09-27T21:53:56+00:00","last_pith_cited_at":"2026-05-21T16:55:27+00:00","author_build_status":"not_needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"not_needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-06-04T08:17:27.643191+00:00","tier_text":"hub"},"tier":"hub","role_counts":[{"context_role":"background","n":3},{"context_role":"method","n":3}],"polarity_counts":[{"context_polarity":"background","n":3},{"context_polarity":"use_method","n":3}],"runs":{},"summary":{},"graph":{},"authors":[]}}