arxiv: 1709.00103 · v7 · submitted 2017-08-31 · 💻 cs.CL · cs.AI

Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

Victor Zhong, Caiming Xiong, Richard Socher

Pith reviewed 2026-05-13 16:54 UTC · model grok-4.3

keywords: natural language to SQL, SQL query generation, reinforcement learning, sequence to sequence, WikiSQL dataset, policy gradient, database querying, structured prediction

The pith

Seq2SQL translates natural language questions into SQL queries by combining structured generation with reinforcement learning rewards from database executions, reaching 59.4 percent execution accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Seq2SQL, a neural network that maps natural language questions to SQL by first using the fixed structure of SQL to shrink the space of possible outputs. For parts of a query that have no natural order, such as the conditions of a WHERE clause, it switches to policy-based reinforcement learning whose reward signal comes from actually running the generated query on the target database and checking whether the result is correct. To make this training possible at scale, the authors release WikiSQL, a collection of 80,654 question-query pairs spread across 24,241 Wikipedia tables. If the approach works, non-experts could retrieve facts from relational databases simply by asking questions in ordinary language.

Core claim

Seq2SQL is a deep neural network for translating natural language questions to corresponding SQL queries that leverages the structure of SQL queries to significantly reduce the output space of generated queries. Moreover, it uses rewards from in-the-loop query execution over the database to learn a policy to generate unordered parts of the query. By applying policy-based reinforcement learning with a query execution environment to WikiSQL, Seq2SQL outperforms attentional sequence to sequence models, improving execution accuracy from 35.9 percent to 59.4 percent and logical form accuracy from 23.4 percent to 48.3 percent.
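As a quick check on the size of the two gains, the differences can be worked out in percentage points from the figures quoted above. The symbols below are introduced here for illustration only and do not come from the paper.

```latex
\begin{align*}
\Delta_{\text{exec}} &= 59.4 - 35.9 = 23.5 \\
\Delta_{\text{lf}}   &= 48.3 - 23.4 = 24.9
\end{align*}
```

Both improvements are therefore a little under 25 points over the attentional sequence-to-sequence baseline.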

What carries the argument

Policy gradient reinforcement learning whose scalar reward is produced by executing the generated SQL query on the database and verifying whether the returned result matches the expected answer.
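A minimal sketch of such an execution-based reward, assuming an SQLite database and the simple correct/incorrect signal described above. The function names, the demo table, and the 1.0/0.0 values are illustrative assumptions, not the authors' implementation, and the paper's exact reward values may differ.

```python
import sqlite3


def run_query(conn, sql):
    """Return the result rows in a canonical order, or None if the SQL fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sort by repr so that row order and NULL values do not affect the comparison.
    return sorted(rows, key=repr)


def execution_reward(conn, generated_sql, gold_sql):
    """Scalar reward: 1.0 if both queries return the same rows, else 0.0."""
    predicted = run_query(conn, generated_sql)
    expected = run_query(conn, gold_sql)
    if predicted is None or expected is None:
        return 0.0
    return 1.0 if predicted == expected else 0.0


def make_demo_db():
    """Build a tiny in-memory table (hypothetical data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
    conn.executemany(
        "INSERT INTO players VALUES (?, ?, ?)",
        [("Ann", "Red", 12), ("Bob", "Blue", 7), ("Cy", "Red", 3)],
    )
    return conn
```

The reward compares results, not query strings, so any query that returns the expected rows is rewarded, while a malformed query or a wrong answer earns nothing.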

If this is right

Natural-language questions can be turned into executable SQL without requiring the user to know query syntax.
Unordered query elements such as WHERE conditions become learnable through execution feedback rather than exact token matching.
A dataset of eighty thousand examples is large enough to train a practical end-to-end model for this task.
Direct execution accuracy becomes the primary optimization target instead of token-level cross-entropy.
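The point about unordered elements can be illustrated with a small sketch. It assembles a query from three parts (aggregation, selected column, conditions) and shows that reordering the WHERE conditions changes the query string but not the result. The `Condition` class, `to_sql` helper, and `players` table are hypothetical names introduced for this example, and the literal quoting is deliberately naive.

```python
import sqlite3
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Condition:
    column: str
    operator: str
    value: object


def to_sql(table, aggregation, select_column, conditions):
    """Assemble SELECT [AGG](column) FROM table [WHERE c1 AND c2 ...]."""
    target = f"{aggregation}({select_column})" if aggregation else select_column
    sql = f"SELECT {target} FROM {table}"
    clauses = []
    for cond in conditions:
        # Naive quoting, fine for a demo but not safe for untrusted input.
        literal = f"'{cond.value}'" if isinstance(cond.value, str) else str(cond.value)
        clauses.append(f"{cond.column} {cond.operator} {literal}")
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql


# Hypothetical demo table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "Red", 12), ("Bob", "Blue", 7), ("Cy", "Red", 3)],
)

conditions = (Condition("team", "=", "Red"), Condition("points", ">", 5))
# The same two conditions in both possible orders.
variants = [to_sql("players", "COUNT", "name", order) for order in permutations(conditions)]
results = [conn.execute(query).fetchall() for query in variants]
```

Here `variants[0]` and `variants[1]` differ as strings, so a token-level loss would penalize one of them, yet both entries of `results` are identical, so an execution-based signal treats them as equally correct.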

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same execution-reward loop could be applied to other structured generation problems such as API calls or data-manipulation scripts.
Extending the approach to full SQL including joins and nested subqueries would require richer reward signals that still remain cheap to compute.
Execution feedback may substitute for fine-grained supervision in any setting where the correctness of an output can be verified automatically.

Load-bearing premise

Rewards obtained by executing generated queries on the database provide a sufficiently dense and stable training signal for the policy, especially for the unordered components of SQL.
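For reference, the premise can be stated in terms of the standard score-function (REINFORCE) estimator. The notation is assumed here, not taken from the paper: $y$ is a sampled output, $p_\theta$ the policy, $R(y)$ the execution reward, and $b$ an optional constant baseline.

```latex
\begin{align*}
J(\theta) &= \mathbb{E}_{y \sim p_\theta}\left[ R(y) \right] \\
\nabla_\theta J(\theta) &= \mathbb{E}_{y \sim p_\theta}\left[ R(y)\, \nabla_\theta \log p_\theta(y) \right] \\
&\approx \left( R(\hat{y}) - b \right) \nabla_\theta \log p_\theta(\hat{y}), \qquad \hat{y} \sim p_\theta
\end{align*}
```

Subtracting $b$ leaves the estimator unbiased because $\mathbb{E}_{y \sim p_\theta}\left[\nabla_\theta \log p_\theta(y)\right] = 0$. With $b = 0$ and a reward that is zero for almost every sample, most single-sample estimates are exactly zero, which is the sparsity concern behind this premise.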

What would settle it

On a fresh test set of questions and tables, if the reinforcement-learning model fails to exceed the execution accuracy of a standard attentional sequence-to-sequence baseline, the central claim is falsified.

Original abstract

A significant amount of the world's knowledge is stored in relational databases. However, the ability for users to retrieve facts from a database is limited due to a lack of understanding of query languages such as SQL. We propose Seq2SQL, a deep neural network for translating natural language questions to corresponding SQL queries. Our model leverages the structure of SQL queries to significantly reduce the output space of generated queries. Moreover, we use rewards from in-the-loop query execution over the database to learn a policy to generate unordered parts of the query, which we show are less suitable for optimization via cross entropy loss. In addition, we will publish WikiSQL, a dataset of 80654 hand-annotated examples of questions and SQL queries distributed across 24241 tables from Wikipedia. This dataset is required to train our model and is an order of magnitude larger than comparable datasets. By applying policy-based reinforcement learning with a query execution environment to WikiSQL, our model Seq2SQL outperforms attentional sequence to sequence models, improving execution accuracy from 35.9% to 59.4% and logical form accuracy from 23.4% to 48.3%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Seq2SQL shows execution-based RL can lift NL-to-SQL accuracy on a new large dataset, but the binary reward signal is sparse enough that the gains need closer checking.

Seq2SQL is mainly useful because it releases WikiSQL, a dataset of 80k question-SQL pairs over 24k Wikipedia tables. That scale is an order of magnitude larger than earlier resources, so it gives the field something concrete to train and benchmark on. The model itself adds SQL structure to the decoder and switches to policy gradients with execution accuracy as the reward for the unordered parts like WHERE clauses. On the held-out test set this produces execution accuracy of 59.4% versus 35.9% for attentional seq2seq and logical-form accuracy of 48.3% versus 23.4%. Those are real measured improvements on a new, sizable test set.

Referee Report

2 major / 2 minor

Summary. The paper proposes Seq2SQL, a neural network architecture for translating natural language questions into SQL queries. It exploits the structure of SQL to constrain the output space and applies policy-gradient reinforcement learning with rewards obtained by executing generated queries against the database, specifically to optimize the generation of unordered query components such as WHERE clauses. The authors release the WikiSQL dataset (80,654 examples over 24,241 Wikipedia tables) and report that Seq2SQL improves execution accuracy from 35.9% to 59.4% and logical-form accuracy from 23.4% to 48.3% over attentional sequence-to-sequence baselines.

Significance. If the results hold under more rigorous evaluation, the work would be significant for semantic parsing and natural-language interfaces to databases. The release of WikiSQL provides a large-scale, publicly available benchmark that is an order of magnitude larger than prior datasets, and the demonstration that execution-based RL can improve performance on unordered SQL fragments offers a concrete direction for handling components that are poorly suited to cross-entropy training.

major comments (2)

[Model / RL training] Model section (policy-gradient training): the central claim that RL with database execution rewards outperforms cross-entropy for unordered query parts rests on the assumption that the binary 0/1 execution reward supplies a usable training signal. No variance-reduction baseline, reward shaping, or curriculum is described, yet the reward is zero for any error in column, operator, or value, which is precisely the sparse regime that dominates early training. This directly affects the validity of the reported 23.5-point execution-accuracy gain.
[Experiments] Experiments section (results tables): the reported test-set accuracies are given as single point estimates without error bars, multiple random seeds, or ablation studies that isolate the RL component from the structured decoding and column/aggregation predictors. Without these controls it is impossible to attribute the improvement from 35.9% to 59.4% specifically to the policy-gradient objective rather than other modeling choices.

minor comments (2)

[Dataset] The description of the WikiSQL annotation protocol and table sampling procedure is brief; additional statistics on question complexity and column-type distribution would help readers assess dataset difficulty.
[Figures] Figure captions and axis labels in the experimental plots could be expanded to include the exact metric definitions (execution vs. logical-form accuracy) for quick reference.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on Seq2SQL. We address each major comment below and will revise the manuscript to strengthen the presentation of the RL component and experimental results.

Referee: Model section (policy-gradient training): the central claim that RL with database execution rewards outperforms cross-entropy for unordered query parts rests on the assumption that the binary 0/1 execution reward supplies a usable training signal. No variance-reduction baseline, reward shaping, or curriculum is described, yet the reward is zero for any error in column, operator, or value, which is precisely the sparse regime that dominates early training. This directly affects the validity of the reported 23.5-point execution-accuracy gain.

Authors: We agree that the binary execution reward is sparse and that the original experiments did not include an explicit variance-reduction baseline. In practice the structured output constraints (separate predictors for SELECT, AGG, and WHERE) substantially reduce the effective search space, allowing the policy gradient to provide a usable signal even without shaping or curriculum. To address the concern directly, we will add a moving-average baseline to the REINFORCE update in the revised model section, rerun the experiments, and include a short discussion of reward sparsity. This revision will make the attribution of gains to the RL objective more robust.
revision: partial
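A moving-average baseline of the kind mentioned in this response can be sketched on a toy problem. The code below is not the Seq2SQL model: it is a softmax policy over a handful of discrete candidates with a 0/1 reward, and the function names, learning rate, and decay value are all illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(reward_fn, num_actions, steps=2000, lr=0.5, decay=0.9, seed=0):
    """REINFORCE on a categorical policy with a moving-average reward baseline."""
    rng = random.Random(seed)
    logits = [0.0] * num_actions
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(num_actions), weights=probs)[0]
        reward = reward_fn(action)
        advantage = reward - baseline
        # Gradient of log softmax w.r.t. logit k is one_hot(action)[k] - probs[k].
        for k in range(num_actions):
            grad_log_prob = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * advantage * grad_log_prob
        # Exponential moving average of observed rewards.
        baseline = decay * baseline + (1.0 - decay) * reward
    return softmax(logits)
```

Once the baseline rises above zero, samples that earn no reward receive a negative advantage and are pushed down, so zero-reward samples also contribute to learning. Without a baseline they would produce no update at all.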
Referee: Experiments section (results tables): the reported test-set accuracies are given as single point estimates without error bars, multiple random seeds, or ablation studies that isolate the RL component from the structured decoding and column/aggregation predictors. Without these controls it is impossible to attribute the improvement from 35.9% to 59.4% specifically to the policy-gradient objective rather than other modeling choices.

Authors: We acknowledge that the original submission reported single-run point estimates. We will rerun all models with five random seeds, report mean and standard deviation, and add error bars to the tables. We will also insert a new ablation that trains the identical architecture (structured decoding + column/aggregation predictors) once with cross-entropy loss and once with the execution-based RL objective; the difference isolates the contribution of policy-gradient training. These additions will be included in the revised experiments section.
revision: yes

Circularity Check

0 steps flagged

No circularity; empirical claims rest on external WikiSQL dataset and database execution

The paper introduces WikiSQL as a new, hand-annotated dataset and trains Seq2SQL by policy gradients whose reward is obtained by executing generated SQL against the actual database tables. Reported execution accuracy (59.4%) and logical-form accuracy (48.3%) are measured on held-out test splits using the same external execution environment. These quantities are therefore not equivalent to any fitted parameter or self-defined quantity inside the model; they are independent observables. No load-bearing self-citation, uniqueness theorem, or ansatz imported from prior author work appears in the derivation. The central improvement over attentional seq2seq is therefore an ordinary empirical comparison rather than a reduction by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on the domain assumption that SQL queries possess reusable structural components that can be decoded separately and that execution rewards supply a usable learning signal for unordered query parts.

free parameters (1)

neural network hyperparameters
Standard deep-learning parameters whose specific values are not reported in the abstract.

axioms (2)

domain assumption: SQL queries have a fixed high-level structure that can be exploited to reduce the generation search space
Invoked to justify the structured decoder design.
domain assumption: Query execution provides a reliable scalar reward for policy optimization
Central to the reinforcement-learning component.

Forward citations

Cited by 24 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ODUTQA-MDC: A Task for Open-Domain Underspecified Tabular QA with Multi-turn Dialogue-based Clarification
cs.CL 2026-04 conditional novelty 8.0

Introduces the ODUTQA-MDC task with a 25k-pair benchmark and MAIC-TQA multi-agent framework for detecting and clarifying underspecified open-domain tabular questions via dialogue.
LEAF-SQL: Level-wise Exploration with Adaptive Fine-graining for Text-to-SQL Skeleton Prediction
cs.CL 2026-05 unverdicted novelty 7.0

LEAF-SQL uses level-wise exploration with adaptive fine-graining and dual agents to generate diverse SQL skeletons, reaching 71.6% execution accuracy on the BIRD benchmark and outperforming prior search- and skeleton-...
RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners
cs.CL 2026-04 conditional novelty 7.0

RSAT uses SFT on verified traces followed by GRPO with NLI faithfulness rewards to make 1-8B models produce verifiable table reasoning with cell citations, raising faithfulness 3.7x to 0.826.
RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners
cs.CL 2026-04 unverdicted novelty 7.0

RSAT makes 1-8B language models produce faithful table reasoning by training them to output structured steps with cell citations, using SFT followed by GRPO with an NLI-based faithfulness reward.
NL2SQLBench: A Modular Benchmarking Framework for LLM-Enabled NL2SQL Solutions
cs.DB 2026-04 conditional novelty 7.0

NL2SQLBench is a new modular benchmarking framework that evaluates LLM NL2SQL methods across three core modules on existing datasets, exposing large accuracy gaps and computational inefficiency.
LoRA: Low-Rank Adaptation of Large Language Models
cs.CL 2021-06 accept novelty 7.0

Adapting large language models by training only a low-rank decomposition BA added to frozen weight matrices matches full fine-tuning while cutting trainable parameters by orders of magnitude and adding no inference latency.
$\xi$-DPO: Direct Preference Optimization via Ratio Reward Margin
cs.LG 2026-05 unverdicted novelty 6.0

ξ-DPO rewrites the preference objective as minimizing distance to optimal margins and defines reward as a chosen-to-rejected ratio, yielding a bounded, interpretable margin ξ set directly from the initial reward-gap d...
Every Step Counts: Step-Level Credit Assignment for Tool-Integrated Text-to-SQL
cs.CL 2026-05 unverdicted novelty 6.0

FineStep adds step-level process rewards and credit assignment to tool-augmented Text-to-SQL, achieving 3.25% higher execution accuracy than GRPO on BIRD while cutting redundant tool calls.
FINER-SQL: Boosting Small Language Models for Text-to-SQL
cs.DB 2026-05 unverdicted novelty 6.0

FINER-SQL boosts 3B-parameter small language models to 67.73% and 85% execution accuracy on BIRD and Spider benchmarks via dense memory and atomic rewards in group relative policy optimization, matching larger LLMs at...
EGREFINE: An Execution-Grounded Optimization Framework for Text-to-SQL Schema Refinement
cs.DB 2026-05 unverdicted novelty 6.0

EGRefine optimizes column renamings via execution-grounded verification and view materialization to recover Text-to-SQL accuracy lost to schema naming issues while guaranteeing query equivalence.
LeGo-Code: Can Modular Curriculum Learning Advance Complex Code Generation? Insights from Text-to-SQL
cs.AI 2026-04 unverdicted novelty 6.0

Modular curriculum learning with tier-specific adapters outperforms standard fine-tuning on complex Text-to-SQL queries in Spider and BIRD benchmarks by avoiding catastrophic forgetting.
ReCoQA: A Benchmark for Tool-Augmented and Multi-Step Reasoning in Real Estate Question and Answering
cs.CL 2026-04 unverdicted novelty 6.0

ReCoQA is a new large-scale benchmark for multi-step tool-augmented reasoning in real estate QA, accompanied by the HIRE-Agent hierarchical understand-plan-execute baseline.
SQL Query Engine: A Self-Healing LLM Pipeline for Natural Language to PostgreSQL Translation
cs.DB 2026-04 unverdicted novelty 6.0

A self-healing LLM pipeline for natural language to PostgreSQL translation achieves up to 9.3 percentage point accuracy gains on benchmarks through error diagnosis and anti-regression mechanisms.
AV-SQL: Decomposing Complex Text-to-SQL Queries with Agentic Views
cs.DB 2026-04 unverdicted novelty 6.0

AV-SQL uses a pipeline of LLM agents to generate intermediate CTE views that decompose complex Text-to-SQL queries, reaching 70.38% execution accuracy on Spider 2.0.
OmniTQA: A Cost-Aware System for Hybrid Query Processing over Semi-Structured Data
cs.DB 2026-04 unverdicted novelty 6.0

OmniTQA integrates LLM semantic reasoning as a first-class query operator with classical relational operators in a cost-aware planner for hybrid structured and semi-structured data.
Natural Language Interfaces for Spatial and Temporal Databases: A Comprehensive Overview of Methods, Taxonomy, and Future Directions
cs.DB 2026-03 unverdicted novelty 6.0

A literature survey that taxonomizes methods, datasets, and evaluation practices for natural language interfaces to geospatial and temporal databases while identifying recurring trends and future directions.
SecureMCP: A Policy-Enforced LLM Data Access Framework for AIoT Systems via Model Context Protocol
cs.CR 2026-05 unverdicted novelty 5.0

SecureMCP integrates RBAC with five sequential defense modules in an MCP server to achieve 82.3% policy compliance against adversarial LLM SQL queries in AIoT while preserving execution accuracy.
SCOPE:Planning for Hybrid Querying over Clinical Trial Data
cs.CL 2026-04 unverdicted novelty 5.0

SCOPE uses explicit multi-LLM planning to improve accuracy on 1,500 hybrid reasoning questions over clinical trial tables compared to zero-shot, few-shot, CoT, and agent baselines.
A Demonstration of SQLyzr: A Platform for Fine-Grained Text-to-SQL Evaluation and Analysis
cs.DB 2026-04 unverdicted novelty 5.0

SQLyzr is a new evaluation platform that adds diverse metrics, realistic settings, query classification, and analysis features to overcome the single-score limitations of existing text-to-SQL benchmarks.
FD-NL2SQL: Feedback-Driven Clinical NL2SQL that Improves with Use
cs.CL 2026-04 unverdicted novelty 5.0

FD-NL2SQL is a feedback-driven clinical NL2SQL system that decomposes questions, retrieves exemplars via embeddings, synthesizes SQL, and expands its example bank from user edits plus logic-based mutations to improve ...
Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning
cs.CL 2026-04 unverdicted novelty 5.0

APMPO boosts average Pass@1 scores on math reasoning benchmarks by 3 points over GRPO by using an adaptive power-mean policy objective and feedback-driven clipping bounds in RLVR training.
Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs
cs.CL 2026-04 unverdicted novelty 5.0

FREIA applies free energy principles and adaptive advantage shaping to unsupervised RL, outperforming baselines by 0.5-3.5 Pass@1 points on math reasoning with a 1.5B model.
StarCoder: may the source be with you!
cs.CL 2023-05 accept novelty 5.0

StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.
Large Language Models: A Survey
cs.CL 2024-02 accept novelty 3.0

The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.
