Recognition: unknown
A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks
Original abstract
Transfer and multi-task learning have traditionally focused on either a single source-target pair or very few, similar tasks. Ideally, the linguistic levels of morphology, syntax and semantics would benefit each other by being trained in a single model. We introduce a joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks. Higher layers include shortcut connections to lower-level task predictions to reflect linguistic hierarchies. We use a simple regularization term to allow for optimizing all model weights to improve one task's loss without exhibiting catastrophic interference of the other tasks. Our single end-to-end model obtains state-of-the-art or competitive results on five different tasks from tagging, parsing, relatedness, and entailment tasks.
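The abstract describes two mechanisms worth making concrete: higher task layers receive shortcut connections carrying the lower-level task's predictions, and a regularization term discourages shared weights from drifting away from values learned for earlier tasks. The sketch below is a hedged illustration of those two ideas, not the authors' implementation; the module names (LowerTask, HigherTask), the successive_reg helper, the delta coefficient, and all layer sizes are assumptions made for the example.

```python
# Hedged sketch, not the authors' code: PyTorch-style layers with invented
# names and sizes, illustrating (1) a shortcut connection that feeds the
# lower task's label predictions into the higher task's layer and
# (2) a successive-regularization penalty that keeps shared weights close
# to a snapshot taken after the lower task was trained.
import torch
import torch.nn as nn
import torch.nn.functional as F

HIDDEN, N_POS, N_CHUNK = 128, 45, 23   # illustrative sizes only


class LowerTask(nn.Module):
    """Lower linguistic level, e.g. a tagging layer."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.classifier = nn.Linear(HIDDEN, N_POS)

    def forward(self, x):
        h, _ = self.encoder(x)
        return h, self.classifier(h)            # hidden states, label scores


class HigherTask(nn.Module):
    """Higher level that sees the lower task's predictions via a shortcut."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.GRU(2 * HIDDEN + N_POS, HIDDEN, batch_first=True)
        self.classifier = nn.Linear(HIDDEN, N_CHUNK)

    def forward(self, x, lower_h, lower_logits):
        # shortcut connection: input features + lower hidden state + lower predictions
        shortcut = torch.cat([x, lower_h, lower_logits.softmax(dim=-1)], dim=-1)
        h, _ = self.encoder(shortcut)
        return self.classifier(h)


def successive_reg(named_params, snapshot, delta=1e-2):
    """Penalty pulling shared weights toward their post-lower-task values.
    `delta` is a hypothetical coefficient, not a value from the paper."""
    return delta * sum(((p - snapshot[n]) ** 2).sum() for n, p in named_params)


# Usage sketch (training of the lower task is omitted here): snapshot the
# lower task's weights, then train the higher task with its own loss plus
# the penalty that discourages interference with the lower task.
lower, higher = LowerTask(), HigherTask()
snapshot = {n: p.detach().clone() for n, p in lower.named_parameters()}
x = torch.randn(2, 7, HIDDEN)                   # (batch, tokens, features)
lower_h, pos_logits = lower(x)
chunk_logits = higher(x, lower_h, pos_logits)
targets = torch.randint(0, N_CHUNK, (2, 7))
loss = F.cross_entropy(chunk_logits.reshape(-1, N_CHUNK), targets.reshape(-1))
loss = loss + successive_reg(lower.named_parameters(), snapshot)
```

In a real setup the snapshot would be taken after the lower task had actually been trained, so the penalty measures drift from that trained state rather than from an untrained model.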
This paper has not been read by Pith yet.
Forward citations
Cited by 2 Pith papers
- Multitask Prompted Training Enables Zero-Shot Task Generalization
  Multitask fine-tuning of an encoder-decoder model on prompted datasets produces zero-shot generalization that often beats models up to 16 times larger on standard benchmarks.
- Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning
  Seq2SQL uses deep learning plus reinforcement learning to generate SQL from natural language, reaching 59.4% execution accuracy on the new WikiSQL dataset of 80k examples.
Discussion (0)