Editing Models with Task Arithmetic
Pith reviewed 2026-05-13 08:05 UTC · model grok-4.3
The pith
Task vectors steer pre-trained models by adding, subtracting, and combining directions in weight space.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A task vector is obtained by subtracting the weights of a pre-trained model from the weights of the same model after fine-tuning on a given task. Arithmetic on these vectors steers behavior: negation reduces accuracy on the target task, addition improves accuracy on multiple tasks simultaneously, and when tasks satisfy an analogy of the form A is to B as C is to D, the combination of three vectors raises performance on the fourth task without any training data from it.
What carries the argument
Task vectors, defined as the weight difference between a fine-tuned model and its pre-trained base, function as linear directions in parameter space. They combine via addition and negation to alter task performance.
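The definition can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: checkpoints are assumed to be dictionaries mapping made-up parameter names to flat lists of floats, and the helper names `task_vector` and `apply_task_vector`, as well as the `coef` argument, are invented for this example.

```python
from typing import Dict, List

# Assumption: a checkpoint is a dict of parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def task_vector(pretrained: Weights, finetuned: Weights) -> Weights:
    """Return finetuned - pretrained, parameter by parameter."""
    if pretrained.keys() != finetuned.keys():
        raise ValueError("checkpoints must share the same parameter names")
    return {
        name: [f - p for p, f in zip(pretrained[name], finetuned[name])]
        for name in pretrained
    }


def apply_task_vector(pretrained: Weights, tau: Weights, coef: float = 1.0) -> Weights:
    """Return pretrained + coef * tau (coef is a hypothetical scaling knob)."""
    return {
        name: [p + coef * t for p, t in zip(pretrained[name], tau[name])]
        for name in pretrained
    }


# Toy checkpoints with made-up parameter names and values.
theta_pre = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
theta_fine = {"layer.weight": [1.5, 1.0], "layer.bias": [0.75]}

tau = task_vector(theta_pre, theta_fine)
restored = apply_task_vector(theta_pre, tau, coef=1.0)
```

In this toy case, applying the vector with `coef=1.0` recovers the fine-tuned weights; other values move the model along the same direction by a different amount.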
If this is right
- Negating a task vector lowers performance on its associated task while leaving performance on unrelated tasks largely unchanged.
- Adding several task vectors raises performance on each of the corresponding tasks at the same time.
- Vector combinations derived from task analogies improve accuracy on a fourth task even when no examples from that task are used.
- The same arithmetic operations apply across different model architectures and data modalities in the reported experiments.
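The negation and addition operations in the list above can be sketched as follows. The helper names, the toy values, and the choice of scaling coefficients are assumptions made for illustration; the sketch only shows the arithmetic, not any measured effect on accuracy.

```python
from typing import Dict, List

# Assumption: weights and task vectors are dicts of name -> flat list of floats.
Weights = Dict[str, List[float]]


def add_vectors(*taus: Weights) -> Weights:
    """Element-wise sum of task vectors that share the same parameter names."""
    first = taus[0]
    return {
        name: [sum(vals) for vals in zip(*(tau[name] for tau in taus))]
        for name in first
    }


def edit(pretrained: Weights, tau: Weights, coef: float) -> Weights:
    """Return pretrained + coef * tau; a negative coef negates the vector."""
    return {
        name: [p + coef * t for p, t in zip(pretrained[name], tau[name])]
        for name in pretrained
    }


# Toy values, chosen only so the arithmetic is easy to follow.
theta_pre = {"w": [1.0, 2.0]}
tau_a = {"w": [0.5, 0.0]}
tau_b = {"w": [0.0, -0.25]}

# Negation: step against task A's direction to reduce performance on task A.
forget_a = edit(theta_pre, tau_a, coef=-1.0)

# Addition: step along the sum of both directions, scaled by a coefficient.
multi_task = edit(theta_pre, add_vectors(tau_a, tau_b), coef=0.5)
```

Both edits start from the same pre-trained weights, so they need no further training once the task vectors exist.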
Where Pith is reading between the lines
- A library of pre-computed task vectors could allow quick assembly of custom models by selecting and combining desired directions.
- Negating vectors linked to biased or undesired behaviors offers a route to debiasing without new labeled data.
- The method suggests that task adaptations may remain modular enough to support sequential additions or removals of capabilities.
Load-bearing premise
Directions in weight space for different tasks add together with little destructive interference.
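One rough way to probe this premise is to measure how aligned two task vectors are. The cosine-similarity check below is an assumption of this sketch rather than a test prescribed by the summary above, and a value near zero suggests, but does not guarantee, low interference. The helper names and toy vectors are hypothetical.

```python
import math
from typing import Dict, List

# Assumption: a task vector is a dict of parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def flatten(tau: Weights) -> List[float]:
    """Concatenate all parameters in a fixed (sorted-name) order."""
    return [value for name in sorted(tau) for value in tau[name]]


def cosine_similarity(tau_a: Weights, tau_b: Weights) -> float:
    """Cosine of the angle between two flattened task vectors."""
    a, b = flatten(tau_a), flatten(tau_b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors: tau_b is orthogonal to tau_a, tau_c points the opposite way.
tau_a = {"w": [1.0, 0.0], "b": [0.0]}
tau_b = {"w": [0.0, 2.0], "b": [0.0]}
tau_c = {"w": [-2.0, 0.0], "b": [0.0]}

sim_ab = cosine_similarity(tau_a, tau_b)
sim_ac = cosine_similarity(tau_a, tau_c)
```

Strongly negative similarity, as between `tau_a` and `tau_c` here, is the case where adding two vectors would most directly cancel.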
What would settle it
Run the analogy experiment on a held-out task and observe that the combined vector yields no accuracy gain over the plain pre-trained model.
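The check can be sketched as a small harness. The combination rule used here, the vector for C plus the difference between the vectors for B and A, is an assumption of this sketch, since the summary above does not spell out the formula. The `evaluate` callable, the toy accuracy function, and all values are hypothetical stand-ins for a real held-out evaluation.

```python
from typing import Callable, Dict, List

# Assumption: weights and task vectors are dicts of name -> flat list of floats.
Weights = Dict[str, List[float]]


def analogy_vector(tau_a: Weights, tau_b: Weights, tau_c: Weights) -> Weights:
    """Estimate a vector for task D as tau_c + (tau_b - tau_a)."""
    return {
        name: [c + (b - a) for a, b, c in zip(tau_a[name], tau_b[name], tau_c[name])]
        for name in tau_c
    }


def analogy_gain(
    pretrained: Weights,
    tau_a: Weights,
    tau_b: Weights,
    tau_c: Weights,
    evaluate: Callable[[Weights], float],
    coef: float = 1.0,
) -> float:
    """Task-D accuracy of the edited model minus that of the pre-trained model."""
    tau_d = analogy_vector(tau_a, tau_b, tau_c)
    edited = {
        name: [p + coef * t for p, t in zip(pretrained[name], tau_d[name])]
        for name in pretrained
    }
    return evaluate(edited) - evaluate(pretrained)


def toy_accuracy(weights: Weights) -> float:
    """Hypothetical evaluator: accuracy peaks when w[1] equals 2.0."""
    return 1.0 - min(1.0, abs(weights["w"][1] - 2.0))


theta_pre = {"w": [0.0, 0.5]}
tau_a = {"w": [1.0, 0.0]}
tau_b = {"w": [1.0, 1.0]}
tau_c = {"w": [0.0, 0.5]}

gain = analogy_gain(theta_pre, tau_a, tau_b, tau_c, toy_accuracy)
```

With a real evaluator, a gain indistinguishable from zero on a genuinely held-out task would count against the claim.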
Original abstract
Changing how pre-trained models behave (e.g., improving their performance on a downstream task or mitigating biases learned during pre-training) is a common practice when developing machine learning systems. In this work, we propose a new paradigm for steering the behavior of neural networks, centered around task vectors. A task vector specifies a direction in the weight space of a pre-trained model, such that movement in that direction improves performance on the task. We build task vectors by subtracting the weights of a pre-trained model from the weights of the same model after fine-tuning on a task. We show that these task vectors can be modified and combined together through arithmetic operations such as negation and addition, and the behavior of the resulting model is steered accordingly. Negating a task vector decreases performance on the target task, with little change in model behavior on control tasks. Moreover, adding task vectors together can improve performance on multiple tasks at once. Finally, when tasks are linked by an analogy relationship of the form "A is to B as C is to D", combining task vectors from three of the tasks can improve performance on the fourth, even when no data from the fourth task is used for training. Overall, our experiments with several models, modalities and tasks show that task arithmetic is a simple, efficient and effective way of editing models.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces task vectors, defined as the difference between the weights of a model fine-tuned on a task and the weights of the corresponding pre-trained model. It demonstrates that these vectors support arithmetic operations such as negation (which decreases performance on the target task) and addition (which can improve performance on multiple tasks simultaneously). For tasks related by an analogy of the form 'A is to B as C is to D', the paper shows that combining task vectors from three tasks can improve performance on the fourth task without using any data from it. Experiments are reported across multiple models, modalities, and tasks.
Significance. If the empirical results hold, the work provides a simple, efficient method for editing pre-trained models without full retraining or access to task data in some cases. The multi-task addition and analogy-based editing results are particularly notable, as they suggest a form of weight-space compositionality that could reduce the need for task-specific fine-tuning. The experiments across models and modalities lend concrete support to the central claims.
major comments (2)
- [Experiments] Experiments section: the scaling coefficients used for vector addition are selected post-hoc for each reported result; this choice directly affects the magnitude of the claimed gains and should be accompanied by a sensitivity analysis or default selection rule to avoid the appearance of tuning to the test set.
- [Analogy experiments] Analogy experiments (the fourth-task improvement results): error bars or multiple random seeds are not reported for all gains; without them it is difficult to assess whether the observed improvements on the held-out task are statistically reliable or could be explained by variance in the base fine-tuning runs.
minor comments (2)
- [§3] Notation for task vectors should be introduced once with a clear equation (e.g., τ = θ_fine − θ_pre) and then used consistently; occasional redefinition in later sections reduces readability.
- [Figures] Several figures would benefit from explicit annotation of the scaling coefficient value used in each plotted curve.
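A single statement of the notation along the lines the referee requests might look like the following. The first equation repeats the definition given in the comment above; the edited weights and the scaling coefficient, written here as \theta_{\text{new}} and \lambda, are symbols introduced for this sketch and may not match the paper's own notation.

```latex
\tau = \theta_{\text{fine}} - \theta_{\text{pre}},
\qquad
\theta_{\text{new}} = \theta_{\text{pre}} + \lambda \sum_{i} \tau_i
```

Under this convention, negation corresponds to a single vector with \lambda < 0, and multi-task addition to a sum over several vectors with \lambda > 0.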
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work and the constructive comments. We address each major comment below.
- Referee: [Experiments] Experiments section: the scaling coefficients used for vector addition are selected post-hoc for each reported result; this choice directly affects the magnitude of the claimed gains and should be accompanied by a sensitivity analysis or default selection rule to avoid the appearance of tuning to the test set.
  Authors: We agree that the scaling coefficients warrant additional justification. In the manuscript, coefficients were chosen based on validation performance for each combination, following common practice for such methods. To strengthen the presentation, we will add a sensitivity analysis in the revised version showing performance across a range of coefficients (e.g., 0.0 to 2.0) for the primary multi-task and analogy results. We will also state a default rule of using coefficient 1.0 when no validation data is available.
  Revision: yes
- Referee: [Analogy experiments] Analogy experiments (the fourth-task improvement results): error bars or multiple random seeds are not reported for all gains; without them it is difficult to assess whether the observed improvements on the held-out task are statistically reliable or could be explained by variance in the base fine-tuning runs.
  Authors: We acknowledge the value of reporting variability for assessing reliability. The original experiments used single runs primarily due to computational cost. In the revision, we will rerun the analogy experiments with at least three random seeds, report mean performance with standard deviation error bars, and confirm that the observed gains remain statistically reliable.
  Revision: yes
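The two promised additions, a coefficient sweep scored on validation data and a mean with standard deviation over seeds, can be sketched as below. Everything here is illustrative: the grid step of 0.25, the toy validation function, and the per-seed scores are made-up placeholders, and the helper names are hypothetical.

```python
import statistics
from typing import Callable, List, Sequence, Tuple


def sweep_coefficients(
    validate: Callable[[float], float],
    coefs: Sequence[float],
) -> Tuple[float, List[Tuple[float, float]]]:
    """Score every coefficient on validation data; return the best and the full curve."""
    curve = [(c, validate(c)) for c in coefs]
    best_coef, _ = max(curve, key=lambda item: item[1])
    return best_coef, curve


def summarize_seeds(scores: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of per-seed scores."""
    return statistics.mean(scores), statistics.stdev(scores)


def toy_validation_accuracy(coef: float) -> float:
    """Hypothetical stand-in for a validation run; peaks at coef = 0.75."""
    return max(0.0, 1.0 - (coef - 0.75) ** 2)


# Grid from 0.0 to 2.0 in steps of 0.25 (the step size is an assumption).
grid = [i * 0.25 for i in range(9)]
best_coef, curve = sweep_coefficients(toy_validation_accuracy, grid)

# Placeholder scores from three seeds, not real results.
mean_acc, std_acc = summarize_seeds([0.5, 0.75, 1.0])
```

Reporting the whole `curve` rather than only `best_coef` is what makes the sensitivity visible, and selecting on validation data keeps the test set out of the choice.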
Circularity Check
No significant circularity detected in derivation chain
The paper defines task vectors explicitly as the difference between fine-tuned and pre-trained weights, then demonstrates their arithmetic properties (negation, addition, and analogy-based combinations) through direct empirical evaluation on held-out test sets across multiple models and tasks. No equations or claims reduce a 'prediction' to a fitted parameter by construction, and the central results (including analogy editing without fourth-task data) are measured independently rather than derived tautologically from the inputs. There are no load-bearing self-citations, uniqueness theorems, or ansatzes that collapse the argument to prior author work. The work is self-contained as an experimental paradigm with falsifiable measurements.
Axiom & Free-Parameter Ledger
free parameters (1)
- scaling coefficient for vector addition
axioms (1)
- domain assumption: Task directions in parameter space are sufficiently linear and additive for the tested models and tasks.
invented entities (1)
- task vector: no independent evidence
Forward citations
Cited by 34 Pith papers
-
Defenses at Odds: Measuring and Explaining Defense Conflicts in Large Language Models
Sequential LLM defense deployment leads to risk exacerbation in 38.9% of cases due to anti-aligned updates in shared critical layers, addressed by conflict-guided layer freezing.
-
Crafting Reversible SFT Behaviors in Large Language Models
LCDD creates sparse carriers for SFT behaviors that SFT-Eraser can reverse, with ablations showing the sparse structure enables causal control.
-
Discovering Physical Directions in Weight Space: Composing Neural PDE Experts
Fine-tuning neural PDE operators to regime endpoints reveals a physical direction in weight space that CCM uses to compose accurate merged models for new or extrapolated regimes from metadata or short prefixes.
-
Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling
DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
-
Good Agentic Friends Do Not Just Give Verbal Advice: They Can Update Your Weights
TFlow enables multi-agent LLMs to collaborate via transient low-rank LoRA perturbations derived from sender activations, yielding up to 8.5 accuracy gains and 83% token reduction versus text-based baselines on Qwen3-4...
-
CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models
Capability vectors extracted from parameter differences between standard and auxiliary-finetuned VLA models can be merged into pretrained weights to match auxiliary-training performance while reducing computational ov...
-
Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models
CoVUBench is the first benchmark framework for evaluating multimodal copyright unlearning in LVLMs via synthetic data, systematic variations, and a dual protocol for forgetting efficacy and utility preservation.
-
Generalizing the Geometry of Model Merging Through Frechet Averages
Model merging is reframed as Fréchet averaging on manifolds whose geometry respects architectural symmetries, generalizing Fisher merging and enabling better LoRA merges.
-
Generalizing the Geometry of Model Merging Through Frechet Averages
Model merging is generalized as Fréchet averaging on symmetry-invariant manifolds, containing Fisher merging as a special case and offering a new approach for LoRA adapters.
-
Atomic-Probe Governance for Skill Updates in Compositional Robot Policies
A cross-version swap protocol reveals dominant skills that swing composition success by up to 50 percentage points, and an atomic probe with selective revalidation governs updates at lower cost than always re-testing ...
-
Differentially Private Model Merging
Post-processing via random selection or linear combination generates differentially private models for arbitrary privacy parameters from pre-trained models on the same dataset.
-
Exploring Language-Agnosticity in Function Vectors: A Case Study in Machine Translation
Translation function vectors extracted from English to one target language improve correct token ranking for translations to multiple other unseen target languages in decoder-only multilingual LLMs.
-
One Model to Translate Them All? A Journey to Mount Doom for Multilingual Model Merging
Merging fine-tuned models for multilingual translation fails because fine-tuning redistributes language-specific neurons rather than sharpening them, increasing representational divergence in output-generating layers.
-
Internalized Reasoning for Long-Context Visual Document Understanding
A synthetic pipeline creates and internalizes reasoning traces in VLMs for long-context visual document understanding, with a 32B model surpassing a 235B model on MMLongBenchDoc and showing 12.4x fewer output tokens.
-
Refusal in Language Models Is Mediated by a Single Direction
Refusal in language models is mediated by a single direction in residual stream activations that can be erased to disable safety or added to elicit refusal.
-
Scalable Token-Level Hallucination Detection in Large Language Models
TokenHD uses a scalable data synthesis engine and importance-weighted training to create token-level hallucination detectors that work on free-form text and scale from 0.6B to 8B parameters, outperforming larger reaso...
-
Experience Sharing in Mutual Reinforcement Learning for Heterogeneous Language Models
Mutual Reinforcement Learning allows heterogeneous LLMs to exchange experience through mechanisms like Peer Rollout Pooling, Cross-Policy GRPO Advantage Sharing, and Success-Gated Transfer, with outcome-level sharing ...
-
UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors
UniVidX unifies diverse video generation tasks into one conditional diffusion model using stochastic condition masking, decoupled gated LoRAs, and cross-modal self-attention.
-
Atomic-Probe Governance for Skill Updates in Compositional Robot Policies
Empirical study on robosuite tasks reveals a dominant-skill effect in compositions and shows that an atomic probe approximates full revalidation for skill updates at much lower cost.
-
Separable Expert Architecture: Toward Privacy-Preserving LLM Personalization via Composable Adapters and Deletable User Proxies
A separable expert architecture uses base models, LoRA adapters, and deletable per-user proxies to enable privacy-preserving personalization and deterministic unlearning in LLMs.
-
AlignCultura: Towards Culturally Aligned Large Language Models?
Align-Cultura introduces the CULTURAX dataset and shows that culturally fine-tuned LLMs improve joint HHH scores by 4-6%, cut cultural failures by 18%, and gain 10-12% efficiency with minimal leakage.
-
Train Separately, Merge Together: Modular Post-Training with Mixture-of-Experts
BAR trains independent domain experts via separate mid-training, SFT, and RL pipelines then composes them with a MoE router to match monolithic retraining performance at lower cost and without catastrophic forgetting.
-
PivotMerge: Bridging Heterogeneous Multimodal Pre-training via Post-Alignment Model Merging
PivotMerge merges heterogeneous multimodal pre-trained models via shared-space decomposition to filter conflicts and layer-wise weights based on alignment contributions, outperforming baselines on multimodal benchmarks.
-
Weight Patching: Toward Source-Level Mechanistic Localization in LLMs
Weight Patching localizes capabilities to specific parameter modules in LLMs by replacing weights from a behavior-specialized model into a base model and validating recovery via a vector-anchor interface, revealing a ...
-
WIN-U: Woodbury-Informed Newton-Unlearning as a retain-free Machine Unlearning Framework
WIN-U delivers a retain-free unlearning update that approximates the gold-standard retrained model via a Woodbury-informed Newton step using only forget-set curvature information.
-
The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment
The Master Key Hypothesis states that capabilities are low-dimensional directions transferable across models through linear subspace alignment, with UNLOCK demonstrating gains such as 12.1% accuracy improvement on MAT...
-
Analytic Drift Resister for Non-Exemplar Continual Graph Learning
ADR achieves theoretically zero-forgetting class-incremental graph learning by combining backpropagation adaptation with ridge-regression-based layer-wise merging of GNN linear transformations.
-
GeoStack: A Framework for Quasi-Abelian Knowledge Composition in VLMs
GeoStack composes multiple domain experts into VLMs with preserved base knowledge and O(1) inference time via geometric stacking and a weight-folding property.
-
UNSEEN: A Cross-Stack LLM Unlearning Defense against AR-LLM Social Engineering Attacks
UNSEEN combines AR access control, LLM unlearning to suppress profiles, and agent guardrails to defend against AR-LLM social engineering attacks, tested in a 60-person user study with 360 conversations.
-
HiP-LoRA: Budgeted Spectral Plasticity for Robust Low-Rank Adaptation
HiP-LoRA decomposes LoRA updates into principal and residual spectral channels with a singular-value-weighted stability budget to reduce forgetting and interference during foundation model adaptation.
-
MAny: Merge Anything for Multimodal Continual Instruction Tuning
MAny addresses dual-forgetting in multimodal continual instruction tuning via CPM and LPM merging strategies, delivering up to 8.57% accuracy gains on UCIT benchmarks without additional training.
-
FREE-Switch: Frequency-based Dynamic LoRA Switch for Style Transfer
FREE-Switch dynamically switches LoRA adapters using frequency importance per diffusion step and adds semantic alignment to reduce content drift when merging specialized image generators.
-
SHIFT: Steering Hidden Intermediates in Flow Transformers
SHIFT learns and applies steering vectors to selected layers and timesteps in DiT models to suppress concepts, shift styles, or bias objects while keeping image quality and prompt adherence intact.
-
MOMO: Mars Orbital Model Foundation Model for Mars Orbital Applications
MOMO merges sensor-specific models from three Mars orbital instruments at matched validation loss stages to form a foundation model that outperforms ImageNet, Earth observation, sensor-specific, and supervised baselin...
Reference graph
Works this paper leans on
-
[1]
Task2vec: Task embedding for meta-learning
Alessandro Achille, Michael Lam, Rahul Tewari, Avinash Ravichandran, Subhransu Maji, Charless C Fowlkes, Stefano Soatto, and Pietro Perona. Task2vec: Task embedding for meta-learning. In International Conference on Computer Vision (ICCV) , 2019. https: //arxiv.org/abs/1902.03545
-
[2]
K., Hayase, J., and Srinivasa, S
Samuel K Ainsworth, Jonathan Hayase, and Siddhartha Srinivasa. Git re-basin: Merging mod- els modulo permutation symmetries, 2022. https://arxiv.org/abs/2209.04836
-
[3]
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katie Millican, Malcolm Reynolds, et al. Flamingo: a visual language model for few-shot learning, 2022. https://arxiv.org/abs/2204.14198
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[4]
A General Language Assistant as a Laboratory for Alignment
Amanda Askell, Yuntao Bai, Anna Chen, Dawn Drain, Deep Ganguli, Tom Henighan, Andy Jones, Nicholas Joseph, Ben Mann, Nova DasSarma, et al. A general language assistant as a laboratory for alignment, 2021. https://arxiv.org/abs/2112.00861
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[5]
The second pascal recognising textual entailment challenge
Roy Bar-Haim, Ido Dagan, Bill Dolan, Lisa Ferro, Danilo Giampiccolo, Bernardo Magnini, and Idan Szpektor. The second pascal recognising textual entailment challenge. In II PASCAL challenge, 2006
work page 2006
-
[6]
The fifth pascal recognizing textual entailment challenge
Luisa Bentivogli, Peter Clark, Ido Dagan, and Danilo Giampiccolo. The fifth pascal recognizing textual entailment challenge. In TAC, 2009. https://cris.fbk.eu/handle/11582/ 5351
work page 2009
-
[7]
Loss sur- face simplexes for mode connecting volumes and fast ensembling
Gregory Benton, Wesley Maddox, Sanae Lotfi, and Andrew Gordon Gordon Wilson. Loss sur- face simplexes for mode connecting volumes and fast ensembling. InInternational Conference on Machine Learning (ICML), 2021. https://arxiv.org/abs/2102.13042
-
[8]
Nuanced metrics for measuring unintended bias with real data for text classification
Daniel Borkan, Lucas Dixon, Jeffrey Sorensen, Nithum Thain, and Lucy Vasserman. Nuanced metrics for measuring unintended bias with real data for text classification. In Companion Proceedings of the 2019 World Wide Web Conference, 2019. https://arxiv.org/abs/ 1903.04561
-
[9]
Language Models are Few-Shot Learners
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-V oss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott G...
work page internal anchor Pith review Pith/arXiv arXiv 2020
-
[10]
Remote sensing image scene classification: Benchmark and state of the art
Gong Cheng, Junwei Han, and Xiaoqiang Lu. Remote sensing image scene classification: Benchmark and state of the art. Proceedings of the Institute of Electrical and Electronics En- gineers (IEEE), 2017. https://ieeexplore.ieee.org/abstract/document/ 7891544
work page 2017
-
[11]
Fusing finetuned models for better pretraining,
Leshem Choshen, Elad Venezian, Noam Slonim, and Yoav Katz. Fusing finetuned models for better pretraining, 2022. https://arxiv.org/abs/2204.03044. 10 Published as a conference paper at ICLR 2023
-
[12]
Describing textures in the wild
Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Sammy Mohamed, and Andrea Vedaldi. Describing textures in the wild. In Conference on Computer Vision and Pattern Recogni- tion (CVPR), 2014. https://openaccess.thecvf.com/content_cvpr_2014/ html/Cimpoi_Describing_Textures_in_2014_CVPR_paper.html
work page 2014
-
[13]
A deep neural network’s loss surface contains every low-dimensional pattern, 2019
Wojciech Marian Czarnecki, Simon Osindero, Razvan Pascanu, and Max Jaderberg. A deep neural network’s loss surface contains every low-dimensional pattern, 2019. https: //arxiv.org/abs/1912.07559
-
[14]
The PASCAL recognising textual entailment challenge
Ido Dagan, Oren Glickman, and Bernardo Magnini. The pascal recognising textual entailment challenge. In Machine Learning Challenges Workshop, 2005. https://link.springer. com/chapter/10.1007/11736790_9
-
[15]
Editing Factual Knowledge in Language Models, September 2021
Nicola De Cao, Wilker Aziz, and Ivan Titov. Editing factual knowledge in language models. In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021. https: //arxiv.org/abs/2104.08164
-
[16]
Imagenet: A large- scale hierarchical image database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large- scale hierarchical image database. In Conference on Computer Vision and Pattern Recog- nition (CVPR), 2009. https://ieeexplore.ieee.org/abstract/document/ 5206848
work page 2009
-
[17]
BERT: Pre-training of deep bidirectional transformers for language understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2019. https: //aclanthology.org/N19-1423
work page 2019
- [18]
-
[19]
William B. Dolan and Chris Brockett. Automatically constructing a corpus of sentential para- phrases. In International Workshop on Paraphrasing, 2005. https://aclanthology. org/I05-5002
work page 2005
-
[20]
Cold fusion: Collaborative descent for distributed multitask finetuning, 2022
Shachar Don-Yehiya, Elad Venezian, Colin Raffel, Noam Slonim, Yoav Katz, and Leshem Choshen. Cold fusion: Collaborative descent for distributed multitask finetuning, 2022. https://arxiv.org/abs/2212.01378
-
[21]
Essentially no barriers in neural network energy landscape
Felix Draxler, Kambis Veschgini, Manfred Salmhofer, and Fred Hamprecht. Essentially no barriers in neural network energy landscape. InInternational Conference on Machine Learning (ICML), 2018. https://arxiv.org/abs/1803.00885
-
[22]
How do humans sketch objects? ACM Trans- actions on graphics (TOG), 2012
Mathias Eitz, James Hays, and Marc Alexa. How do humans sketch objects? ACM Trans- actions on graphics (TOG), 2012. https://dl.acm.org/doi/10.1145/2185520. 2185540
-
[23]
arXiv preprint arXiv:2110.06296 , year=
Rahim Entezari, Hanie Sedghi, Olga Saukh, and Behnam Neyshabur. The role of permutation invariance in linear mode connectivity of neural networks. In International Conference on Learning Representations (ICLR), 2022. https://arxiv.org/abs/2110.06296
-
[24]
Fabbri, Irene Li, Tianwei She, Suyi Li, and Dragomir R
Alexander R. Fabbri, Irene Li, Tianwei She, Suyi Li, and Dragomir R. Radev. Multi-news: a large-scale multi-document summarization dataset and abstractive hierarchical model, 2019. https://arxiv.org/abs/1906.01749
-
[25]
Deep ensembles: A loss landscape perspective, 2019
Stanislav Fort, Huiyi Hu, and Balaji Lakshminarayanan. Deep ensembles: A loss landscape perspective, 2019. https://arxiv.org/abs/1912.02757
-
[26]
Stanislav Fort, Gintare Karolina Dziugaite, Mansheej Paul, Sepideh Kharaghani, Daniel M Roy, and Surya Ganguli. Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the neural tangent kernel. In Advances in Neural Information Processing Systems (NeurIPS) , 2020. https://arxiv.org/abs/2010. 15110. 11...
work page 2020
-
[27]
Linear mode connectivity and the lottery ticket hypothesis
Jonathan Frankle, Gintare Karolina Dziugaite, Daniel Roy, and Michael Carbin. Linear mode connectivity and the lottery ticket hypothesis. In International Conference on Machine Learning (ICML), 2020. https://proceedings.mlr.press/v119/frankle20a. html
work page 2020
-
[28]
Loss surfaces, mode connectivity, and fast ensembling of dnns
Timur Garipov, Pavel Izmailov, Dmitrii Podoprikhin, Dmitry Vetrov, and Andrew Gordon Wilson. Loss surfaces, mode connectivity, and fast ensembling of dnns. In Advances in Neural Information Processing Systems (NeurIPS) , 2018. https://arxiv.org/abs/1802. 10026
work page 2018
-
[29]
Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, and Noah A. Smith. Re- alToxicityPrompts: Evaluating neural toxic degeneration in language models. In Find- ings of the Association for Computational Linguistics: EMNLP 2020 , 2020. https: //aclanthology.org/2020.findings-emnlp.301
work page 2020
-
[30]
Mor Geva, Avi Caciularu, Guy Dar, Paul Roit, Shoval Sadde, Micah Shlain, Bar Tamir, and Yoav Goldberg. Lm-debugger: An interactive tool for inspection and intervention in transformer-based language models, 2022. https://arxiv.org/abs/2204.12130
-
[31]
The third pascal recog- nizing textual entailment challenge
Danilo Giampiccolo, Bernardo Magnini, Ido Dagan, and Bill Dolan. The third pascal recog- nizing textual entailment challenge. In ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, 2007. https://aclanthology.org/W07-1401/
work page 2007
-
[32]
Improving alignment of dialogue agents via targeted human judgements, 2022
Amelia Glaese, Nat McAleese, Maja Trebacz, John Aslanides, Vlad Firoiu, Timo Ewalds, Mari- beth Rauh, Laura Weidinger, Martin Chadwick, Phoebe Thacker, Lucy Campbell-Gillingham, Jonathan Uesato, Po-Sen Huang, Ramona Comanescu, Fan Yang, Abigail See, Sumanth Dathathri, Rory Greig, Charlie Chen, Doug Fritz, Jaume Sanchez Elias, Richard Green, Sona Mokra, Ni...
work page 2022
-
[33]
Karan Goel, Albert Gu, Yixuan Li, and Christopher R ´e. Model patching: Closing the sub- group performance gap with data augmentation, 2020.https://arxiv.org/abs/2008. 06775
work page 2020
-
[34]
Eternal sunshine of the spotless net: Selective forgetting in deep networks
Aditya Golatkar, Alessandro Achille, and Stefano Soatto. Eternal sunshine of the spotless net: Selective forgetting in deep networks. In Conference on Computer Vision and Pattern Recognition (CVPR), 2020. https://arxiv.org/abs/1911.04933
-
[35]
Laura Hanu and Unitary team. Detoxify, 2020. https://github.com/unitaryai/ detoxify
work page 2020
-
[36]
EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification
Patrick Helber, Benjamin Bischke, Andreas Dengel, and Damian Borth. Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. Journal of Selected Topics in Applied Earth Observations and Remote Sensing , 2019. https:// arxiv.org/abs/1709.00029
work page Pith review arXiv 2019
-
[37]
Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, and Dawn Song. Natural adversarial examples. In Conference on Computer Vision and Pattern Recognition (CVPR), 2021
work page 2021
-
[38]
Training Compute-Optimal Large Language Models
Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, et al. Training compute-optimal large language models, 2022. https://arxiv.org/abs/ 2203.15556
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[39]
Patching open-vocabulary models by interpolating weights
Gabriel Ilharco, Mitchell Wortsman, Samir Yitzhak Gadre, Shuran Song, Hannaneh Hajishirzi, Simon Kornblith, Ali Farhadi, and Ludwig Schmidt. Patching open-vocabulary models by interpolating weights. In Advances in Neural Information Processing Systems (NeurIPS), 2022. https://arXiv.org/abs/2208.05592. 12 Published as a conference paper at ICLR 2023
-
[40]
Averaging weights leads to wider optima and better generalization
Pavel Izmailov, Dmitrii Podoprikhin, Timur Garipov, Dmitry Vetrov, and Andrew Gordon Wilson. Averaging weights leads to wider optima and better generalization. In Conference on Uncertainty in Artificial Intelligence (UAI), 2018. https://arxiv.org/abs/1803. 05407
work page 2018
-
[41]
Neural tangent kernel: Convergence and generalization in neural networks
Arthur Jacot, Franck Gabriel, and Cl´ement Hongler. Neural tangent kernel: Convergence and generalization in neural networks. In Advances in Neural Information Processing Systems (NeurIPS), 2018. https://arxiv.org/abs/1806.07572
-
[42]
Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V Le, Yunhsuan Sung, Zhen Li, and Tom Duerig. Scaling up visual and vision-language representa- tion learning with noisy text supervision. In International Conference on Machine Learning (ICML), 2021. https://arxiv.org/abs/2102.05918
-
[43]
Linear connectivity reveals generalization strategies, 2022
Jeevesh Juneja, Rachit Bansal, Kyunghyun Cho, Jo ˜ao Sedoc, and Naomi Saphra. Linear connectivity reveals generalization strategies, 2022. https://arxiv.org/abs/2205. 12411/
work page 2022
-
[44]
In conversation with artificial intelligence: aligning language models with human values, 2022
Atoosa Kasirzadeh and Iason Gabriel. In conversation with artificial intelligence: aligning language models with human values, 2022. https://arxiv.org/abs/2209.00731
-
[45]
UNIFIEDQA: Crossing format boundaries with a single QA system
Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, and Hannaneh Hajishirzi. UNIFIEDQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics (EMNLP) , 2020. https: //aclanthology.org/2020.findings-emnlp.171
work page 2020
-
[46]
QASC: A dataset for question answering via sentence composition, 2020
Tushar Khot, Peter Clark, Michal Guerquin, Peter Jansen, and Ashish Sabharwal. QASC: A dataset for question answering via sentence composition, 2020. https://arxiv.org/abs/1910.11473v2
-
[47]
3D object representations for fine-grained categorization
Jonathan Krause, Michael Stark, Jia Deng, and Li Fei-Fei. 3D object representations for fine-grained categorization. In International Conference on Computer Vision Workshops (ICCVW), 2013. https://www.cv-foundation.org/openaccess/content_iccv_workshops_2013/W19/html/Krause_3D_Object_Representations_2013_ICCV_paper.html
-
[49]
Explaining landscape connectivity of low-cost solutions for multilayer nets
Rohith Kuditipudi, Xiang Wang, Holden Lee, Yi Zhang, Zhiyuan Li, Wei Hu, Rong Ge, and Sanjeev Arora. Explaining landscape connectivity of low-cost solutions for multilayer nets. In Advances in Neural Information Processing Systems (NeurIPS), 2019. https://arxiv.org/abs/1906.06247
-
[50]
RACE: Large-scale ReAding comprehension dataset from examinations
Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, and Eduard Hovy. RACE: Large-scale ReAding comprehension dataset from examinations. In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017. https://aclanthology.org/D17-1082
-
[51]
Transforming task representations to perform novel tasks
Andrew K Lampinen and James L McClelland. Transforming task representations to perform novel tasks. Proceedings of the National Academy of Sciences, 2020
-
[52]
The MNIST database of handwritten digits, 1998
Yann LeCun. The MNIST database of handwritten digits, 1998. http://yann.lecun.com/exdb/mnist/
-
[53]
The power of scale for parameter-efficient prompt tuning
Brian Lester, Rami Al-Rfou, and Noah Constant. The power of scale for parameter-efficient prompt tuning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 3045–3059, Online and Punta Cana, Dominican Republic, November 2021. Association for Computational Linguistics. doi: 10.18653/v1/2021.emnlp-main.243. URL https://aclanthology.org/2021.emnlp-main.243
-
[55]
Datasets: A community library for natural language processing
Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, Abhishek Thakur, Patrick von Platen, Suraj Patil, Julien Chaumond, Mariama Drame, Julien Plu, Lewis Tunstall, Joe Davison, Mario Šaško, Gunjan Chhablani, Bhavitvya Malik, Simon Brandeis, Teven Le Scao, Victor Sanh, Canwen Xu, Nicolas Patry, Angelina McMillan-Major, Philipp Schmid, Sylvain Gugge...
-
[56]
Visualizing the loss landscape of neural nets
Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer, and Tom Goldstein. Visualizing the loss landscape of neural nets. In Advances in Neural Information Processing Systems (NeurIPS), 2018. https://arxiv.org/abs/1712.09913
-
[58]
Margaret Li, Suchin Gururangan, Tim Dettmers, Mike Lewis, Tim Althoff, Noah A Smith, and Luke Zettlemoyer. Branch-train-merge: Embarrassingly parallel training of expert language models, 2022. https://arxiv.org/abs/2208.03306
-
[59]
CommonGen: A constrained text generation challenge for generative commonsense reasoning
Bill Yuchen Lin, Wangchunshu Zhou, Ming Shen, Pei Zhou, Chandra Bhagavatula, Yejin Choi, and Xiang Ren. CommonGen: A constrained text generation challenge for generative commonsense reasoning. In Findings of the Association for Computational Linguistics: EMNLP, 2020. https://www.aclweb.org/anthology/2020.findings-emnlp.165
-
[61]
Alisa Liu, Maarten Sap, Ximing Lu, Swabha Swayamdipta, Chandra Bhagavatula, Noah A. Smith, and Yejin Choi. DExperts: Decoding-time controlled text generation with experts and anti-experts. In Annual Meeting of the Association for Computational Linguistics (ACL), 2021. https://aclanthology.org/2021.acl-long.522
-
[62]
Decoupled weight decay regularization
Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR), 2019. URL https://openreview.net/forum?id=Bkg6RiCqY7
-
[63]
Quark: Controllable text generation with reinforced unlearning
Ximing Lu, Sean Welleck, Liwei Jiang, Jack Hessel, Lianhui Qin, Peter West, Prithviraj Ammanabrolu, and Yejin Choi. Quark: Controllable text generation with reinforced unlearning,
-
[65]
Mechanistic mode connectivity, 2022
Ekdeep Singh Lubana, Eric J Bigelow, Robert P Dick, David Krueger, and Hidenori Tanaka. Mechanistic mode connectivity, 2022. https://arxiv.org/abs/2211.08422
-
[66]
Analyzing monotonic linear interpolation in neural network loss landscapes, 2021
James Lucas, Juhan Bae, Michael R Zhang, Stanislav Fort, Richard Zemel, and Roger Grosse. Analyzing monotonic linear interpolation in neural network loss landscapes, 2021. https://arxiv.org/abs/2104.11044
-
[67]
Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. Learning word vectors for sentiment analysis. In Annual Meeting of the Association for Computational Linguistics (ACL), 2011. http://www.aclweb.org/anthology/P11-1015
-
[68]
Merging models with Fisher-weighted averaging
Michael Matena and Colin Raffel. Merging models with Fisher-weighted averaging. In Advances in Neural Information Processing Systems (NeurIPS), 2021. https://arxiv.org/abs/2111.09832
-
[69]
Comparison of the predicted and observed secondary structure of T4 phage lysozyme
Brian W Matthews. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA)-Protein Structure, 1975. https://www.sciencedirect.com/science/article/abs/pii/0005279575901099
-
[70]
Julian McAuley and Jure Leskovec. Hidden factors and hidden topics: understanding rating dimensions with review text. In ACM Conference on Recommender Systems, 2013. https://dl.acm.org/doi/10.1145/2507157.2507163
-
[71]
Pointer Sentinel Mixture Models
Stephen Merity, Caiming Xiong, James Bradbury, and Richard Socher. Pointer sentinel mixture models, 2016. https://arxiv.org/abs/1609.07843
-
[72]
MetaICL: Learning to learn in context
Sewon Min, Mike Lewis, Luke Zettlemoyer, and Hannaneh Hajishirzi. MetaICL: Learning to learn in context. In Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022. https://aclanthology.org/2022.naacl-main.201
-
[73]
Cross-task generalization via natural language crowdsourcing instructions
Swaroop Mishra, Daniel Khashabi, Chitta Baral, and Hannaneh Hajishirzi. Cross-task generalization via natural language crowdsourcing instructions. In Annual Meeting of the Association for Computational Linguistics (ACL), 2022. https://aclanthology.org/2022.acl-long.244
-
[74]
Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, and Christopher D Manning. Fast model editing at scale. In International Conference on Learning Representations (ICLR), 2021. https://arxiv.org/abs/2110.11309
-
[75]
Memory-based model editing at scale
Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D Manning, and Chelsea Finn. Memory-based model editing at scale. In International Conference on Machine Learning,
-
[77]
Fixing model bugs with natural language patches
Shikhar Murty, Christopher D Manning, Scott Lundberg, and Marco Tulio Ribeiro. Fixing model bugs with natural language patches. In ACL Workshop on Learning with Natural Language Supervision, 2022. https://openreview.net/forum?id=blJrg3WvvDV
-
[78]
Reading digits in natural images with unsupervised feature learning
Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. In Advances in Neural Information Processing Systems (NeurIPS) Workshops, 2011. https://storage.googleapis.com/pub-tools-public-publication-data/pdf/37648.pdf
-
[79]
Behnam Neyshabur, Hanie Sedghi, and Chiyuan Zhang. What is being transferred in transfer learning? In Advances in Neural Information Processing Systems (NeurIPS), 2020. https://arxiv.org/abs/2008.11687
-
[80]
Training language models to follow instructions with human feedback, 2022
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback, 2022. https://arxiv.org/abs/2203.02155