LaMDA: Language Models for Dialog Applications
Pith reviewed 2026-05-12 03:09 UTC · model grok-4.3
The pith
Fine-tuning LaMDA on crowdworker-annotated data and giving it access to external tools markedly improves the safety and factual grounding of its dialog responses.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LaMDA models achieve stronger safety by routing candidate replies through a classifier trained on annotated examples of human values, and stronger factual grounding by consulting external knowledge sources during generation rather than relying solely on internal parameters.
What carries the argument
A safety classifier fine-tuned on crowd-annotated dialog data that filters responses for alignment with selected human values, paired with an external-tool interface that lets the model call information retrieval, translation, or calculation systems to ground its outputs.
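The filtering half of this mechanism can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `Candidate` type, the `quality` score, the `threshold` value, and the keyword-based `toy_safety_score` are all hypothetical stand-ins for the fine-tuned classifier and ranking described above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    text: str
    quality: float  # hypothetical ranking score, higher is better


def filter_and_rank(
    candidates: List[Candidate],
    safety_score: Callable[[str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Optional[Candidate]:
    """Drop candidates scored below the threshold, then pick the best survivor.

    Returns None when every candidate is filtered out.
    """
    safe = [c for c in candidates if safety_score(c.text) >= threshold]
    if not safe:
        return None
    return max(safe, key=lambda c: c.quality)


def toy_safety_score(text: str) -> float:
    """Toy stand-in for a learned classifier: flags a single keyword."""
    return 0.0 if "harmful" in text.lower() else 1.0


candidates = [
    Candidate("Here is a harmful suggestion.", quality=0.9),
    Candidate("Here is a careful, sourced answer.", quality=0.7),
    Candidate("I am not sure.", quality=0.2),
]
best = filter_and_rank(candidates, toy_safety_score)
```

The highest-quality candidate is rejected because it fails the safety check, so the second candidate is returned. Filtering before ranking means a fluent but unsafe reply can never win on quality alone.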
If this is right
- Dialog systems can filter outputs for consistency with chosen values before they reach users.
- Models can produce answers that cite or derive from retrieved sources instead of generating from memory alone.
- Targeted fine-tuning and tool access can outperform further scaling for safety and factuality.
- Education and recommendation applications show gains in helpfulness and consistency when these methods are applied.
Where Pith is reading between the lines
- The same fine-tuning and tool-use pattern could be applied to other open-ended generation tasks where alignment and verifiability matter.
- Expanding the set of external tools might allow the model to handle additional reasoning steps not covered by the current three.
- Safety metrics built on a limited illustrative value set leave room for later expansion or crowdsourced refinement.
Load-bearing premise
The selected human values for annotation and the three chosen external tools are sufficient to cover safety and factuality needs across open-ended real-world conversations.
What would settle it
Run the model on prompts involving values outside the annotated set or facts absent from the retrieval, translation, and calculator tools; if the rate of unsafe or ungrounded replies stays as high as in the base model, the claimed improvements do not hold.
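One way to make this test concrete is to compare the unsafe-reply rates of the base and fine-tuned models on the out-of-scope prompts. The sketch below uses a pooled two-proportion z statistic, which is an assumption of this example rather than a method from the paper. The counts are placeholders chosen for illustration, not measured results.

```python
import math
from typing import Tuple


def two_proportion_z(unsafe_a: int, n_a: int, unsafe_b: int, n_b: int) -> Tuple[float, float]:
    """Return (rate_a - rate_b, z) using the pooled two-proportion z statistic."""
    p_a = unsafe_a / n_a
    p_b = unsafe_b / n_b
    pooled = (unsafe_a + unsafe_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se if se > 0 else 0.0
    return p_a - p_b, z


def one_sided_p(z: float) -> float:
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Placeholder counts (NOT real data): unsafe replies out of 1000 prompts each.
diff, z = two_proportion_z(unsafe_a=120, n_a=1000, unsafe_b=115, n_b=1000)
p_value = one_sided_p(z)
```

With these placeholder counts the difference in rates is half a percentage point and the p-value is far above 0.05, which is the pattern that would count against the claimed improvement on out-of-scope prompts.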
Original abstract
We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces LaMDA, a family of Transformer-based language models with up to 137B parameters, pre-trained on 1.56T words of public dialog data and web text. It claims that scaling improves overall dialog quality but yields limited gains on safety and factual grounding. The authors demonstrate that fine-tuning a classifier on a small set of crowdworker-annotated data to filter unsafe responses, combined with enabling the model to consult external tools (information retrieval, translator, calculator), produces improvements on a safety metric derived from an illustrative set of human values and a groundedness metric. The work also includes qualitative explorations of LaMDA in education and content-recommendation domains.
Significance. If the reported gains hold under broader testing, the paper supplies a practical, scalable recipe for mitigating two persistent limitations of large dialog models. The explicit integration of external knowledge sources rather than sole reliance on parametric memory is a clear methodological contribution that later systems have adopted. The scale of the pre-training corpus and the separation of safety fine-tuning from tool-augmented decoding are additional strengths that provide a concrete baseline for subsequent research.
major comments (2)
- [Safety and factual grounding sections] Safety and factual-grounding sections: the safety metric is defined over an illustrative set of human values and the groundedness metric depends on the fixed trio of external tools. Because both the training signal and the evaluation metric are constructed from the same limited annotation set and tool interfaces, the measured improvements may be artifacts of the chosen scope rather than robust advances on the broader challenges of safety and factual grounding. An out-of-distribution test set or independently sourced value specification is needed to substantiate the central claim.
- [Results sections] Results sections: the manuscript asserts 'significant improvements' yet supplies no error bars, confidence intervals, or statistical significance tests for the safety and groundedness scores. Without these quantities it is impossible to judge whether the observed deltas exceed what could be obtained by alternative fine-tuning regimes or are reliable across random seeds.
minor comments (3)
- [Abstract] Abstract: the phrase 'less improvements' is grammatically imprecise and should be replaced by 'smaller improvements' or 'limited improvements'.
- [Methods and figures] Figure captions and tool-integration diagrams: the description of how tool calls are interleaved with generation is terse; a short pseudocode snippet or expanded caption would improve reproducibility.
- [Related work] Related-work section: several contemporaneous papers on tool-augmented language models and safety fine-tuning are not cited; adding them would better situate the contribution.
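To illustrate the kind of snippet the second minor comment asks for, here is a minimal sketch of interleaving tool calls with generation. It is an assumption-laden toy, not the paper's interface: `run_dialog_turn`, the `TOOLS` registry, the `(target, payload)` step convention, the `max_calls` budget, and `toy_model_step` are all hypothetical names introduced for this example, and only a calculator tool is included.

```python
import ast
import operator
from typing import Callable, Dict, List, Tuple

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expr: str) -> str:
    """Evaluate +, -, *, / over numeric literals without calling eval()."""
    def ev(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return str(ev(ast.parse(expr, mode="eval").body))


# Hypothetical tool registry; retrieval and translation would be added here.
TOOLS: Dict[str, Callable[[str], str]] = {"calc": calculator}
Step = Tuple[str, str]  # (tool name or "user", payload)


def run_dialog_turn(
    model_step: Callable[[List[str]], Step], user_msg: str, max_calls: int = 4
) -> str:
    """Loop: the model either calls a tool or addresses the user."""
    context = [f"user: {user_msg}"]
    for _ in range(max_calls):
        target, payload = model_step(context)
        if target == "user":
            return payload  # final reply
        tool = TOOLS.get(target)
        result = tool(payload) if tool else "unknown tool"
        context.append(f"{target}({payload}) -> {result}")  # feed result back
    return "Sorry, I could not ground an answer."  # call budget exhausted


def toy_model_step(context: List[str]) -> Step:
    """Toy policy: call the calculator once, then answer with its result."""
    if len(context) == 1:
        return ("calc", "351 * 12")
    return ("user", f"The total is {context[-1].split('-> ')[-1]}.")


reply = run_dialog_turn(toy_model_step, "What is 351 times 12?")
```

Each tool result is appended to the context before the next model step, so the final reply can be derived from a known source rather than from parametric memory. The call budget keeps a misbehaving policy from looping indefinitely.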
Simulated Author's Rebuttal
Thank you for the constructive feedback on our LaMDA manuscript. We respond to each major comment below, providing clarifications and indicating where revisions can be made to address the concerns.
Point-by-point responses
-
Referee: [Safety and factual grounding sections] Safety and factual-grounding sections: the safety metric is defined over an illustrative set of human values and the groundedness metric depends on the fixed trio of external tools. Because both the training signal and the evaluation metric are constructed from the same limited annotation set and tool interfaces, the measured improvements may be artifacts of the chosen scope rather than robust advances on the broader challenges of safety and factual grounding. An out-of-distribution test set or independently sourced value specification is needed to substantiate the central claim.
Authors: The manuscript explicitly describes the safety values as 'illustrative' and the tools as representative examples of external knowledge sources. The improvements demonstrated are specific to this setup, showing that the fine-tuning and tool-use approach can enhance performance on these metrics. We agree that the claims are scoped to the chosen annotations and tools, and we can revise the text to emphasize the illustrative nature and discuss how the framework generalizes to other value sets or tools. However, conducting new out-of-distribution evaluations would require additional crowdworker annotations and experiments not included in the current work.
revision: partial
-
Referee: [Results sections] Results sections: the manuscript asserts 'significant improvements' yet supplies no error bars, confidence intervals, or statistical significance tests for the safety and groundedness scores. Without these quantities it is impossible to judge whether the observed deltas exceed what could be obtained by alternative fine-tuning regimes or are reliable across random seeds.
Authors: We acknowledge the absence of statistical measures in the reported results. The evaluations were performed using fixed test sets derived from the annotations, and the improvements are presented as direct comparisons. In a revision, we can include error bars estimated via bootstrap resampling or multiple evaluation runs where applicable, and clarify the evaluation methodology to allow assessment of reliability. This will strengthen the presentation without altering the core findings.
revision: yes
- We cannot provide out-of-distribution test sets or independently sourced value specifications, as this would necessitate new data collection efforts beyond the scope of the presented experiments.
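The bootstrap resampling mentioned in the second response could look roughly like the sketch below. It is a percentile bootstrap over per-example binary labels (1 = response judged safe or grounded), written under assumptions: the function name, the resample count, the fixed seed, and the placeholder labels are illustrative and do not come from the manuscript.

```python
import random
from typing import List, Tuple


def bootstrap_ci(
    labels: List[int],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (point estimate, lower, upper) for the mean of binary labels."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(labels)
    point = sum(labels) / n
    # Resample the test set with replacement and recompute the score each time.
    means = sorted(sum(rng.choices(labels, k=n)) / n for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return point, lower, upper


# Placeholder labels (NOT real data): 80 of 100 responses judged safe.
labels = [1] * 80 + [0] * 20
point, lower, upper = bootstrap_ci(labels)
```

Because the test set is fixed, this interval captures sampling uncertainty over evaluation examples only. Variation across fine-tuning seeds, which the referee also raises, would need separate training runs.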
Circularity Check
No significant circularity in derivation chain
The paper's improvements in safety and factual grounding are demonstrated via fine-tuning on externally sourced crowdworker annotations for an illustrative set of human values and via consultation of independent external tools (IR system, translator, calculator). The safety metric and groundedness metric are defined against these separate annotations and known sources rather than quantities derived from the model's own outputs or fitted parameters. No load-bearing step reduces by construction to self-defined inputs, fitted subsets renamed as predictions, or self-citation chains; the experimental results remain falsifiable against the external benchmarks.
Forward citations
Cited by 44 Pith papers
-
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-thought prompting, by including intermediate reasoning steps in few-shot examples, elicits strong reasoning abilities in large language models on arithmetic, commonsense, and symbolic tasks.
-
AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents
AgentDojo introduces an extensible evaluation framework populated with realistic agent tasks and security test cases to measure prompt injection robustness in tool-using LLM agents.
-
MusicLM: Generating Music From Text
MusicLM produces coherent multi-minute 24 kHz music from text prompts using hierarchical sequence-to-sequence modeling and outperforms prior systems in quality and text adherence.
-
Combining On-Policy Optimization and Distillation for Long-Context Reasoning in Large Language Models
dGRPO merges outcome-based policy optimization with dense teacher guidance from on-policy distillation, yielding more stable long-context reasoning on the new LongBlocks synthetic dataset.
-
IRIS: Interpolative Rényi Iterative Self-play for Large Language Model Fine-Tuning
IRIS unifies self-play fine-tuning under an interpolative Rényi objective with adaptive alpha scheduling and reports better benchmark scores than baselines while surpassing full supervised fine-tuning with only 13% of...
-
QLoRA: Efficient Finetuning of Quantized LLMs
QLoRA finetunes 4-bit quantized LLMs via LoRA adapters to match full-precision performance while using far less memory, enabling 65B-scale training on single GPUs and producing Guanaco models near ChatGPT level.
-
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
PoT prompting improves numerical reasoning by having language models write programs executed by a computer instead of performing calculations in natural language chains of thought, with an average 12% gain over CoT.
-
A Generalist Agent
Gato is a multi-modal, multi-task, multi-embodiment generalist policy using one transformer network to handle text, vision, games, and robotics tasks.
-
OPT: Open Pre-trained Transformer Language Models
OPT releases open decoder-only transformers up to 175B parameters that match GPT-3 performance at one-seventh the carbon cost, along with code and training logs.
-
Flamingo: a Visual Language Model for Few-Shot Learning
Flamingo models reach new state-of-the-art few-shot results on image and video tasks by bridging frozen vision and language models with cross-attention layers trained on interleaved web-scale data.
-
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
SayCan combines an LLM's high-level semantic knowledge with robot skill value functions to select only feasible actions, enabling completion of abstract natural-language instructions on a real mobile manipulator.
-
Guaranteed Jailbreaking Defense via Disrupt-and-Rectify Smoothing
DR-Smoothing introduces a disrupt-then-rectify prompt processing scheme into smoothing defenses, delivering tight theoretical bounds on success probability against both token- and prompt-level jailbreaks.
-
Response Time Enhances Alignment with Heterogeneous Preferences
Response times modeled as drift-diffusion processes enable consistent estimation of population-average preferences from heterogeneous anonymous binary choices.
-
A Meta Reinforcement Learning Approach to Goals-Based Wealth Management
MetaRL pre-trained on GBWM problems delivers near-optimal dynamic strategies in 0.01s achieving 97.8% of DP optimal utility and handles larger problems where DP fails.
-
CleanBase: Detecting Malicious Documents in RAG Knowledge Databases
CleanBase identifies malicious documents in RAG databases by detecting cliques in a semantic similarity graph constructed using embedding models and a statistical threshold.
-
RaTA-Tool: Retrieval-based Tool Selection with Multimodal Large Language Models
RaTA-Tool retrieves suitable external tools for multimodal queries by matching generated task descriptions against tool metadata, supported by a new Hugging Face-derived dataset and DPO optimization.
-
ADAPTive Input Training for Many-to-One Pre-Training on Time-Series Classification
ADAPT is a new pre-training paradigm that aligns physical properties of time-series data to allow simultaneous training on 162 diverse classification datasets, achieving new state-of-the-art performance.
-
Towards an AI co-scientist
A multi-agent AI system generates novel biomedical hypotheses that show promising experimental validation in drug repurposing for leukemia, new targets for liver fibrosis, and a bacterial gene transfer mechanism.
-
Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks
Grounded SAM integrates Grounding DINO and SAM to support text-prompted open-world detection and segmentation, achieving 48.7 mean AP on SegInW zero-shot with the base detector and huge segmenter.
-
MiniLLM: On-Policy Distillation of Large Language Models
MiniLLM distills large language models into smaller ones via reverse KL divergence and on-policy optimization, yielding higher-quality responses with lower exposure bias than standard KD baselines.
-
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Properly filtered web data from CommonCrawl alone trains LLMs that significantly outperform models trained on The Pile, with 600 billion tokens and 1.3B/7.5B parameter models released.
-
Gorilla: Large Language Model Connected with Massive APIs
Gorilla is a fine-tuned LLM that surpasses GPT-4 in accurate API call generation and uses retrieval to handle documentation updates.
-
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Multiagent debate among LLMs improves mathematical reasoning, strategic reasoning, and factual accuracy while reducing hallucinations.
-
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
CAMEL proposes a role-playing framework with inception prompting that enables autonomous multi-agent cooperation among LLMs and generates conversational data for studying their behaviors.
-
BloombergGPT: A Large Language Model for Finance
BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.
-
PaLM-E: An Embodied Multimodal Language Model
PaLM-E is a single 562B-parameter multimodal model that performs embodied reasoning tasks like robotic manipulation planning and visual question answering by interleaving vision, state, and text inputs with positive t...
-
Multimodal Chain-of-Thought Reasoning in Language Models
Multimodal-CoT achieves state-of-the-art on ScienceQA by using a two-stage process that incorporates vision into chain-of-thought rationale generation for models under 1 billion parameters.
-
Improving alignment of dialogue agents via targeted human judgements
Sparrow uses targeted rule-based human feedback and evidence provision to outperform baselines in preference while violating rules only 8% of the time under adversarial probing.
-
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
RLHF-aligned language models show increasing resistance to red teaming with scale up to 52B parameters, unlike prompted or rejection-sampled models, supported by a released dataset of 38,961 attacks.
-
Language Models (Mostly) Know What They Know
Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.
-
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Scaling an autoregressive Transformer to 20B parameters for text-to-image generation using image token sequences achieves new SOTA zero-shot FID of 7.23 and fine-tuned FID of 3.22 on MS-COCO.
-
Emergent Abilities of Large Language Models
Emergent abilities are capabilities present in large language models but absent in smaller ones and cannot be predicted by extrapolating smaller model performance.
-
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
RLHF alignment training on language models boosts NLP performance, supports skill specialization, enables weekly online updates with fresh human data, and shows a linear relation between RL reward and sqrt(KL divergen...
-
PaLM: Scaling Language Modeling with Pathways
PaLM 540B demonstrates continued scaling benefits by setting new few-shot SOTA results on hundreds of benchmarks and outperforming humans on BIG-bench.
-
Fast NF4 Dequantization Kernels for Large Language Model Inference
A lightweight shared-memory technique for NF4 dequantization kernels yields 2.0-2.2x kernel speedup and 1.54x end-to-end gains on models up to 70B parameters while using only 64 bytes of shared memory per block.
-
PaLM 2 Technical Report
PaLM 2 reports state-of-the-art results on language, reasoning, and multilingual tasks with improved efficiency over PaLM.
-
StarCoder: may the source be with you!
StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.
-
Constitutional AI: Harmlessness from AI Feedback
Pith review generated a malformed one-line summary.
-
Galactica: A Large Language Model for Science
Galactica, a science-specialized LLM, reports higher scores than GPT-3, Chinchilla, and PaLM on LaTeX knowledge, mathematical reasoning, and medical QA benchmarks while outperforming general models on BIG-bench.
-
Useful for Exploration, Risky for Precision: Evaluating AI Tools in Academic Research
AI Q&A tools give useful overviews but fail at precise information extraction and source tracing, while literature review tools aid exploration yet lack reproducibility and transparency, making them unsuitable for sys...
-
Useful for Exploration, Risky for Precision: Evaluating AI Tools in Academic Research
AI tools deliver useful overviews for research exploration but prove unreliable for precise information extraction and systematic reviews due to low explainability, reproducibility, and transparency.
-
On The Application of Linear Attention in Multimodal Transformers
Linear attention delivers significant computational savings in multimodal transformers and follows the same scaling laws as softmax attention on ViT models trained on LAION-400M with ImageNet-21K zero-shot validation.
-
Large Language Models: A Survey
The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.
-
A Survey of Large Language Models
This survey reviews the background, key techniques, and evaluation methods for large language models, emphasizing emergent abilities that appear at large scales.
Reference graph
Works this paper leans on
-
[1]
Ryan Kiros, Yukun Zhu, Ruslan R Salakhutdinov, Richard Zemel, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. Skip-thought vectors. In Advances in Neural Information Processing Systems, pages 3294–3302, 2015
work page 2015
-
[2]
Semi-supervised sequence learning
Andrew M Dai and Quoc V Le. Semi-supervised sequence learning. In Advances in Neural Information Processing Systems, 2015
work page 2015
-
[3]
Deep contextualized word representations
Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettle- moyer. Deep contextualized word representations. In NAACL, 2018
work page 2018
-
[5]
Improving language understanding by generative pre-training
Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. Improving language understanding by generative pre-training. https://blog.openai.com/language-unsupervised, 2018
work page 2018
-
[6]
BERT: Pre-training of deep bidirectional transformers for language understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, 2019
work page 2019
-
[7]
XLNet: Generalized autoregressive pretraining for language understanding
Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, and Quoc V Le. XLNet: Generalized autoregressive pretraining for language understanding. In NeurIPS, 2019
work page 2019
-
[8]
Albert: A lite bert for self-supervised learning of language representations
Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. Albert: A lite bert for self-supervised learning of language representations. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=H1eA7AEtvS
work page 2020
-
[9]
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. RoBERTa: A robustly optimized BERT pretraining approach.arXiv preprint arXiv:1907.11692, 2019
work page internal anchor Pith review Pith/arXiv arXiv 1907
-
[10]
Kevin Clark, Minh-Thang Luong, Quoc V . Le, and Christopher D. Manning. ELECTRA: Pre-training text encoders as discriminators rather than generators. In ICLR, 2020
work page 2020
-
[11]
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 2020
work page 2020
-
[12]
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-V oss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwi...
work page 2020
-
[13]
Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models. arXiv preprint arXiv:2001.08361, 2020
work page internal anchor Pith review Pith/arXiv arXiv 2001
-
[14]
Neural responding machine for short-text conversation
Lifeng Shang, Zhengdong Lu, and Hang Li. Neural responding machine for short-text conversation. In ACL, 2015. 19
work page 2015
-
[15]
A neural network approach to context-sensitive generation of conversational responses
Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Margaret Mitchell, Jian-Yun Nie, Jianfeng Gao, and Bill Dolan. A neural network approach to context-sensitive generation of conversational responses. arXiv preprint arXiv:1506.06714, 2015
-
[16]
Oriol Vinyals and Quoc V . Le. A neural conversational model. In ICML Workshop, 2015
work page 2015
-
[17]
arXiv preprint arXiv:2001.09977 , year=
Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, and Quoc V . Le. Towards a human-like open-domain chatbot. arXiv preprint arXiv:2001.09977, 2020
-
[18]
Smith, Y-Lan Boureau, and Jason Weston
Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, and Jason Weston. Recipes for building an open-domain chatbot. arXiv preprint arXiv:2004.13637, 2020
-
[19]
Recurrent neural network based language model
Tomas Mikolov, Martin Karafiát, Lukas Burget, Jan Cernock`y, and Sanjeev Khudanpur. Recurrent neural network based language model. In INTERSPEECH, 2010
work page 2010
-
[20]
Generating text with recurrent neural networks
Ilya Sutskever, James Martens, and Geoffrey E Hinton. Generating text with recurrent neural networks. In ICML, 2011
work page 2011
-
[21]
Exploring the Limits of Language Modeling
Rafal Józefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, and Yonghui Wu. Exploring the limits of language modeling. arXiv preprint arXiv:1602.02410, 2016
work page Pith review arXiv 2016
-
[22]
Universal language model fine-tuning for text classification
Jeremy Howard and Sebastian Ruder. Universal language model fine-tuning for text classification. InACL, 2018
work page 2018
-
[23]
Unsupervised representation learning with deep convolutional generative adversarial networks
Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. In ICLR, 2016
work page 2016
-
[24]
Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, ...
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[25]
Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander H. Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W. Black, Alexander I. Rudnicky, Jason Williams, Joelle Pineau, Mikhail S. Burtsev, and Jason Weston. The second conversational intelligence challenge (convai2). The NeurIPS ’18 Com...
work page 2020
-
[26]
Personalizing dialogue agents: I have a dog, do you have pets too? ACL, 2018
Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, and Jason Weston. Personalizing dialogue agents: I have a dog, do you have pets too? ACL, 2018
work page 2018
-
[27]
A Diversity-Promoting Objective Function for Neural Conversation Models
Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. A diversity-promoting objective function for neural conversation models. arXiv preprint arXiv:1510.03055, 2015
work page Pith review arXiv 2015
-
[28]
Generative deep neural networks for dialogue: A short review
Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, and Joelle Pineau. Generative deep neural networks for dialogue: A short review. arXiv preprint arXiv:1611.06216, 2016
-
[29]
Transfertransfo: A transfer learning approach for neural network based conversational agents
Thomas Wolf, Victor Sanh, Julien Chaumond, and Clement Delangue. Transfertransfo: A transfer learning approach for neural network based conversational agents. In NeurIPS Workshop on Conversational AI, 2019
work page 2019
-
[30]
Dialogpt: Large-scale generative pre-training for conversational response generation
Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, and Bill Dolan. Dialogpt: Large-scale generative pre-training for conversational response generation. arXiv preprint arXiv:1911.00536, 2019
-
[31]
Retrieval augmentation reduces hallucination in conversation
Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, and Jason Weston. Retrieval augmentation reduces hallucination in conversation. arXiv preprint arXiv:2104.07567, 2021
-
[32]
Adam Roberts, Colin Raffel, and Noam Shazeer. How much knowledge can you pack into the parameters of a language model? In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, pages 5418–5426, November 2020
work page 2020
-
[33]
Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten Bosma, Zongwei Zhou, Tao Wang, Yu Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathy Meier-Hellstern, Toju Duke, Lucas Dixon, 20 Kun Zhang, Quoc V Le, Yonghui Wu, Zhifeng Chen, a...
work page 2021
-
[34]
arXiv preprint arXiv:1911.00172 , year=
Urvashi Khandelwal, Omer Levy, Dan Jurafsk, Luke Zettlemoyer, and Mike Lewis. Generalization through memorization: Nearest neighbor language models. arXiv preprint arXiv:1911.00172, 2019
-
[35]
Retrieval-augmented generation for knowledge-intensive nlp tasks
Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive nlp tasks. NeurIPS, 2020
work page 2020
-
[36]
doi:10.48550/arXiv.2002.08909 , abstract =
Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, and Ming-Wei Chang. Realm: Retrieval-augmented language model pre-training. arXiv preprint arXiv:2002.08909, 2020
-
[37]
Leveraging passage retrieval with generative models for open domain question answering
Gautier Izacard and Edouard Grave. Leveraging passage retrieval with generative models for open domain question answering. arXiv preprint arXiv:2007.01282, 2021
-
[38]
Retrieving and reading: A comprehensive survey on open-domain question answering
Fengbin Zhu, Wenqiang Lei, Chao Wang, Jianming Zheng, Soujanya Poria, and Tat-Seng Chua. Retrieving and reading: A comprehensive survey on open-domain question answering. arXiv preprint arXiv:2101.00774, 2021
-
[39]
arXiv preprint arXiv:2004.04906 , year=
Vladimir Karpukhin, Barlas O˘guz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen tau Yih. Dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2004.04906, 2020
-
[40]
A modern perspective on query likelihood with deep generative retrieval models
Oleg Lesota, Navid Rekabsaz, Daniel Cohen, Klaus Antonius Grasserbauer, Carsten Eickhoff, and Markus Schedl. A modern perspective on query likelihood with deep generative retrieval models. arXiv preprint arXiv:2106.13618, 2021
-
[41]
Improving language models by retrieving from trillions of tokens
Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Ori...
-
[42]
Tickettalk: Toward human-level performance with end-to-end, transaction-based dialog systems
Bill Byrne, Karthik Krishnamoorthi, Saravanan Ganesh, and Mihir Sanjay Kale. Tickettalk: Toward human-level performance with end-to-end, transaction-based dialog systems. arXiv preprint arXiv:2012.12458, 2020
-
[43]
Reason first, then respond: Modular generation for knowledge-infused dialogue
Leonard Adolphs, Kurt Shuster, Jack Urbanek, Arthur Szlam, and Jason Weston. Reason first, then respond: Modular generation for knowledge-infused dialogue. arXiv preprint arXiv:2111.05204, 2021
-
[44]
WebGPT: Browser-assisted question-answering with human feedback
Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, and John Schulman. Webgpt: Browser-assisted question-answering with human feedback. arXiv preprint arX...
-
[45]
Internet-augmented dialogue generation
Mojtaba Komeili, Kurt Shuster, and Jason Weston. Internet-augmented dialogue generation. arXiv preprint arXiv:2107.07566, 2021
-
[46]
Usr: An unsupervised and reference free evaluation metric for dialog generation
Shikib Mehri and Maxine Eskenazi. Usr: An unsupervised and reference free evaluation metric for dialog generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 681–707, 2020
-
[47]
BLEU: a method for automatic evaluation of machine translation
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. BLEU: a method for automatic evaluation of machine translation. In ACL, 2002
-
[48]
Chia-Wei Liu, Ryan Lowe, Iulian Serban, Mike Noseworthy, Laurent Charlin, and Joelle Pineau. How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016
-
[49]
What makes a good conversation? how controllable attributes affect human judgments
Abigail See, Stephen Roller, Douwe Kiela, and Jason Weston. What makes a good conversation? how controllable attributes affect human judgments. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, 2019
-
[50]
Acute-eval: Improved dialogue evaluation with optimized questions and multi-turn comparisons
Margaret Li, Jason Weston, and Stephen Roller. Acute-eval: Improved dialogue evaluation with optimized questions and multi-turn comparisons. In NeurIPS workshop on Conversational AI, 2019
-
[51]
Treating dialogue quality evaluation as an anomaly detection problem
Rostislav Nedelchev, Jens Lehmann, and Ricardo Usbeck. Treating dialogue quality evaluation as an anomaly detection problem. In Proceedings of the 12th Conference on Language Resources and Evaluation, pages 508–512, 2020
-
[52]
On evaluating and comparing conversational agents
Anu Venkatesh, Chandra Khatri, Ashwin Ram, Fenfei Guo, Raefer Gabriel, Ashish Nagar, Rohit Prasad, Ming Cheng, Behnam Hedayatnia, Angeliki Metallinou, Rahul Goel, Shaohua Yang, and Anirudh Raju. On evaluating and comparing conversational agents. NeurIPS, 2017
-
[53]
Anticipating safety issues in e2e conversational ai: Framework and tooling
Emily Dinan, Gavin Abercrombie, A. Stevie Bergman, Shannon Spruit, Dirk Hovy, Y-Lan Boureau, and Verena Rieser. Anticipating safety issues in e2e conversational ai: Framework and tooling. arXiv preprint arXiv:2107.03451, 2021
-
[54]
Ethical and social risks of harm from Language Models
Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, and Iason Gabriel. Ethical and ...
-
[55]
Samuel Rota Bulò, Lorenzo Porzi, and Peter Kontschieder. Dropout distillation. In ICLR, 2016
-
[56]
The radicalization risks of GPT-3 and advanced neural language models
Kris McGuffie and Alex Newhouse. The radicalization risks of GPT-3 and advanced neural language models. arXiv preprint arXiv:2009.06807, 2020
-
[57]
Persistent anti-muslim bias in large language models
Abubakar Abid, Maheen Farooqi, and James Zou. Persistent anti-muslim bias in large language models. arXiv preprint arXiv:2101.05783, 2021
-
[58]
Man is to computer programmer as woman is to homemaker? debiasing word embeddings
Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. Man is to computer programmer as woman is to homemaker? debiasing word embeddings. In Advances in Neural Information Processing Systems, 2016
-
[59]
Christine Basta, Marta R. Costa-jussà, and Noe Casas. Evaluating the underlying gender bias in contextualized word embeddings. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, August 2019
-
[60]
Measuring bias in contextualized word representations
Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W Black, and Yulia Tsvetkov. Measuring bias in contextualized word representations. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, August 2019
-
[61]
Hurtful words: Quantifying biases in clinical contextual word embeddings
Haoran Zhang, Amy X. Lu, Mohamed Abdalla, Matthew McDermott, and Marzyeh Ghassemi. Hurtful words: Quantifying biases in clinical contextual word embeddings. In Proceedings of the ACM Conference on Health, Inference, and Learning, 2020
-
[62]
The woman worked as a babysitter: On biases in language generation
Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng. The woman worked as a babysitter: On biases in language generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
-
[63]
Gender bias in contextualized word embeddings
Jieyu Zhao, Tianlu Wang, Mark Yatskar, Ryan Cotterell, Vicente Ordonez, and Kai-Wei Chang. Gender bias in contextualized word embeddings. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), June 2019
-
[64]
Wei Guo and Aylin Caliskan. Detecting emergent intersectional biases: Contextualized word embeddings contain a distribution of human-like biases. arXiv preprint arXiv:2006.03955, 2020
-
[65]
Perturbation sensitivity analysis to detect unintended model biases
Vinodkumar Prabhakaran, Ben Hutchinson, and Margaret Mitchell. Perturbation sensitivity analysis to detect unintended model biases. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2019
-
[66]
On measuring social biases in sentence encoders
Chandler May, Alex Wang, Shikha Bordia, Samuel R. Bowman, and Rachel Rudinger. On measuring social biases in sentence encoders. arXiv preprint arXiv:1903.10561, 2019
-
[67]
Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, and Noah A. Smith. The risk of racial bias in hate speech detection. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019
-
[68]
Shikha Bordia and Samuel R. Bowman. Identifying and reducing gender bias in word-level language models. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, 2019
-
[69]
Emily M Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021
-
[70]
Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, and Yejin Choi. Social bias frames: Reasoning about social and power implications of language. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
-
[71]
Social biases in NLP models as barriers for persons with disabilities
Ben Hutchinson, Vinodkumar Prabhakaran, Emily Denton, Kellie Webster, Yu Zhong, and Stephen Denuyl. Social biases in NLP models as barriers for persons with disabilities. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
-
[72]
Large language models associate muslims with violence
Abubakar Abid, Maheen Farooqi, and James Zou. Large language models associate muslims with violence. Nature Machine Intelligence, 2021
-
[73]
Extracting training data from large language models
Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel. Extracting training data from large language models. arXiv preprint arXiv:2012.07805, 2020
-
[74]
Sahaj Garg, Vincent Perot, Nicole Limtiaco, Ankur Taly, Ed H. Chi, and Alex Beutel. Counterfactual fairness in text classification through robustness. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 2019. ISBN 9781450363242
-
[75]
Reducing sentiment bias in language models via counterfactual evaluation
Po-Sen Huang, Huan Zhang, Ray Jiang, Robert Stanforth, Johannes Welbl, Jack Rae, Vishal Maini, Dani Yogatama, and Pushmeet Kohli. Reducing sentiment bias in language models via counterfactual evaluation. In EMNLP (Findings), 2020
-
[76]
A scalable approach to reducing gender bias in google translate
Melvin Johnson. A scalable approach to reducing gender bias in google translate. https://ai.googleblog.com/2020/04/a-scalable-approach-to-reducing-gender.html, 2020
-
[77]
Reducing gender bias in word-level language models with a gender-equalizing loss function
Yusu Qian, Urwa Muaz, Ben Zhang, and Jae Won Hyun. Reducing gender bias in word-level language models with a gender-equalizing loss function. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, July 2019
-
[78]
Towards debiasing sentence representations
Paul Pu Liang, Irene Mengze Li, Emily Zheng, Yao Chong Lim, Ruslan Salakhutdinov, and Louis-Philippe Morency. Towards debiasing sentence representations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, July 2020
-
[79]
Recipes for safety in open-domain chatbots
Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, and Emily Dinan. Recipes for safety in open-domain chatbots. arXiv preprint arXiv:2010.07079, 2020
-
[80]
Alisa Liu, Maarten Sap, Ximing Lu, Swabha Swayamdipta, Chandra Bhagavatula, Noah A. Smith, and Yejin Choi. On-the-fly controlled text generation with experts and anti-experts. arXiv preprint arXiv:2105.03023, 2021
-
[81]
Bot-adversarial dialogue for safe conversational agents
Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, and Emily Dinan. Bot-adversarial dialogue for safe conversational agents. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021