Towards Compute-Optimal Transfer Learning

Alexandre Galashov; Amal Rannen-Triki; Arthur Douillard; Dushyant Rao; Laurent Charlin; Marc'Aurelio Ranzato; Massimo Caccia; Michela Paganini; Razvan Pascanu

arxiv: 2304.13164 · v1 · pith:OGHNHP5Unew · submitted 2023-04-25 · 💻 cs.LG · cs.AI

classification: cs.LG, cs.AI

keywords: learning, models, performance, computational, pretrained, transfer, compute, efficiency

The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models, which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements to finetune or use these models can be a hindrance to their widespread use. In this study, we present a solution to this issue by proposing a simple yet effective way to trade computational efficiency for asymptotic performance, which we define as the performance a learning algorithm achieves as compute tends to infinity. Specifically, we argue that zero-shot structured pruning of pretrained models allows them to increase compute efficiency with minimal reduction in performance. We evaluate our method on the Nevis'22 continual learning benchmark, which offers a diverse set of transfer scenarios. Our results show that pruning convolutional filters of pretrained models can lead to more than 20% performance improvement in low computational regimes.
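
As a rough illustration of what structured pruning of convolutional filters involves, the NumPy sketch below removes whole output filters from one convolutional layer and then drops the matching input channels of the following layer. The abstract does not state the ranking criterion, so the L1-norm score used here is an assumption, as are the function names `prune_conv_filters` and `prune_next_layer_inputs` and the `keep_ratio` parameter. "Zero-shot" is read here as pruning the pretrained weights directly, with no retraining step inside the pruning routine. This is a sketch under those assumptions, not the authors' implementation.

```python
import numpy as np


def prune_conv_filters(weight, bias, keep_ratio):
    """Keep the output filters with the largest L1 norm (assumed criterion).

    weight: array of shape (out_channels, in_channels, kh, kw)
    bias:   array of shape (out_channels,)
    Returns the pruned weight, the pruned bias, and the kept filter indices.
    """
    out_channels = weight.shape[0]
    n_keep = max(1, int(round(out_channels * keep_ratio)))
    # One score per output filter: sum of absolute weights.
    scores = np.abs(weight).reshape(out_channels, -1).sum(axis=1)
    # Indices of the n_keep highest-scoring filters, in their original order.
    kept = np.sort(np.argsort(scores)[::-1][:n_keep])
    return weight[kept], bias[kept], kept


def prune_next_layer_inputs(next_weight, kept):
    """Drop the input channels of the next layer that fed from removed filters."""
    return next_weight[:, kept]


# Toy weights standing in for two consecutive pretrained conv layers.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(8, 3, 3, 3))
b1 = np.zeros(8)
w2 = rng.normal(size=(16, 8, 3, 3))

w1_p, b1_p, kept = prune_conv_filters(w1, b1, keep_ratio=0.5)
w2_p = prune_next_layer_inputs(w2, kept)
print(w1_p.shape, b1_p.shape, w2_p.shape)
```

Because entire filters are removed, the pruned layers are simply smaller dense tensors, so the compute saving does not depend on sparse kernels.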
