A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks
2 Pith papers cite this work.
abstract
Transfer and multi-task learning have traditionally focused on either a single source-target pair or very few, similar tasks. Ideally, the linguistic levels of morphology, syntax and semantics would benefit each other by being trained in a single model. We introduce a joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks. Higher layers include shortcut connections to lower-level task predictions to reflect linguistic hierarchies. We use a simple regularization term to allow for optimizing all model weights to improve one task's loss without exhibiting catastrophic interference with the other tasks. Our single end-to-end model obtains state-of-the-art or competitive results on five different tasks spanning tagging, parsing, relatedness, and entailment.
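The abstract names two mechanisms: shortcut connections that feed lower-level task predictions into higher layers, and a regularization term that lets all weights train on one task without catastrophic interference. The sketch below makes both concrete for the two lowest layers of the stack (POS tagging feeding chunking). It is a minimal PyTorch-style illustration under stated assumptions: the class and function names (`JointTagChunkModel`, `successive_reg`), the layer sizes, and the `delta` coefficient are placeholders, not taken from the paper or its released code.

```python
import torch
import torch.nn as nn

class JointTagChunkModel(nn.Module):
    """Illustrative two lowest layers of a JMT-style stack: the chunking
    layer receives the POS layer's prediction via a shortcut connection.
    Names and sizes are placeholders, not from the paper's code."""

    def __init__(self, vocab=10000, emb=100, hidden=100, n_pos=45, n_chunk=23):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.pos_lstm = nn.LSTM(emb, hidden, batch_first=True,
                                bidirectional=True)
        self.pos_head = nn.Linear(2 * hidden, n_pos)
        # Embeds the (soft) POS label distribution so it can be fed upward.
        self.pos_label_emb = nn.Linear(n_pos, emb, bias=False)
        # Chunking input = word embedding + lower hidden states
        # + POS label embedding (the shortcut connection).
        self.chunk_lstm = nn.LSTM(emb + 2 * hidden + emb, hidden,
                                  batch_first=True, bidirectional=True)
        self.chunk_head = nn.Linear(2 * hidden, n_chunk)

    def forward(self, tokens):
        x = self.embed(tokens)                  # (B, T, emb)
        h_pos, _ = self.pos_lstm(x)             # (B, T, 2*hidden)
        pos_logits = self.pos_head(h_pos)
        pos_label = self.pos_label_emb(torch.softmax(pos_logits, dim=-1))
        h_chunk, _ = self.chunk_lstm(torch.cat([x, h_pos, pos_label], dim=-1))
        return pos_logits, self.chunk_head(h_chunk)

def successive_reg(params, snapshot, delta=1e-2):
    # Quadratic penalty keeping weights close to the snapshot taken after
    # the previous task's training epoch, limiting catastrophic
    # interference while still letting all weights move.
    return delta * sum(((p - s) ** 2).sum() for p, s in zip(params, snapshot))
```

In use, `snapshot` would be something like `[p.detach().clone() for p in model.parameters()]`, captured after finishing the previous task's training epoch, and `successive_reg(...)` would be added to the current task's loss before backpropagation.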
verdicts
CONDITIONAL

2 representative citing papers
citing papers explorer
- Multitask Prompted Training Enables Zero-Shot Task Generalization
  Multitask fine-tuning of an encoder-decoder model on prompted datasets produces zero-shot generalization that often beats models up to 16 times larger on standard benchmarks.
- Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning
  Seq2SQL uses deep learning plus reinforcement learning to generate SQL from natural language, reaching 59.4% execution accuracy on the new WikiSQL dataset of 80k examples.