arxiv: 2201.08239 · v3 · submitted 2022-01-20 · 💻 cs.CL · cs.AI

LaMDA: Language Models for Dialog Applications

Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, Yaguang Li, Hongrae Lee, Huaixiu Steven Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen, Yuanzhong Xu, Zhifeng Chen, Adam Roberts, Maarten Bosma, Vincent Zhao, Yanqi Zhou, Chung-Ching Chang, Igor Krivokon, Will Rusch, Marc Pickett, Pranesh Srinivasan, Laichee Man, Kathleen Meier-Hellstern, Meredith Ringel Morris, Tulsee Doshi, Renelito Delos Santos, Toju Duke, Johnny Soraker, Ben Zevenbergen, Vinodkumar Prabhakaran, Mark Diaz, Ben Hutchinson, Kristen Olson, Alejandra Molina, Erin Hoffman-John, Josh Lee, Lora Aroyo, Ravi Rajakumar, Alena Butryna, Matthew Lamm, Viktoriya Kuzmina, Joe Fenton, Aaron Cohen, Rachel Bernstein, Ray Kurzweil, Blaise Aguera-Arcas, Claire Cui, Marian Croak, Ed Chi, Quoc Le

Pith reviewed 2026-05-12 03:09 UTC · model grok-4.3

classification 💻 cs.CL cs.AI

keywords: language models, dialog systems, safety, factual grounding, fine-tuning, external knowledge, transformer models

The pith

Fine-tuning LaMDA models on annotated human values plus access to external tools markedly raises safety and factual grounding in dialog responses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

LaMDA consists of large transformer language models pre-trained on public dialog and web text for conversational use. Scaling the models alone lifts overall response quality but leaves safety and factuality largely unchanged. The work demonstrates that fine-tuning on a modest set of crowdworker annotations for values such as avoiding harm and bias, combined with an interface allowing the model to query information retrieval, translation, and calculation tools, produces measurable gains on both challenges. These gains are quantified with a safety classifier that filters candidate replies and a groundedness metric that checks whether answers rest on verifiable sources rather than plausible invention. The approach is further tested in education and recommendation settings for helpfulness and role consistency.

Core claim

LaMDA models achieve stronger safety by routing candidate replies through a classifier trained on annotated examples of human values and achieve stronger factual grounding by consulting external knowledge sources during generation rather than relying solely on internal parameters.
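
The filtering step in the core claim can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the names `pick_reply`, `toy_safety`, and `toy_quality`, the 0.8 threshold, and the idea of ranking the surviving candidates by a separate quality score are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    text: str
    safety: float   # classifier's estimate that the reply is safe, in [0, 1]
    quality: float  # any secondary score used to rank the survivors


def pick_reply(
    candidates: List[str],
    safety_score: Callable[[str], float],
    quality_score: Callable[[str], float],
    safety_threshold: float = 0.8,  # illustrative value, not taken from the paper
) -> Optional[Candidate]:
    """Drop candidates below the safety threshold, then return the best survivor."""
    scored = [Candidate(c, safety_score(c), quality_score(c)) for c in candidates]
    safe = [c for c in scored if c.safety >= safety_threshold]
    if not safe:
        return None  # the caller decides on a fallback reply
    return max(safe, key=lambda c: c.quality)


# Toy stand-ins so the sketch runs end to end; a real system would call models.
def toy_safety(text: str) -> float:
    return 0.1 if "harm" in text.lower() else 0.95


def toy_quality(text: str) -> float:
    return float(len(text.split()))


replies = [
    "Here is how to harm someone.",
    "I can't help with that, but here is a safer alternative to consider.",
    "No.",
]
best = pick_reply(replies, toy_safety, toy_quality)
```

Separating the hard safety filter from the ranking score means a high-quality but unsafe candidate can never win, which is the property the filtering design relies on.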

What carries the argument

A safety classifier fine-tuned on crowd-annotated dialog data that filters responses for alignment with selected human values, paired with an external-tool interface that lets the model call information retrieval, translation, or calculation systems to ground its outputs.
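
One way to picture the external-tool interface is a loop in which the model either requests a tool call or commits to a final answer. The sketch below is hypothetical throughout: the `policy` callable stands in for the language model, the tool names `search`, `translate`, and `calculator` and their stub bodies are invented for illustration, and the paper's actual interface may differ.

```python
import ast
import operator
from typing import Callable, Dict, List, Tuple

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def calculator(expr: str) -> str:
    """Evaluate +, -, *, / over numeric literals without using eval()."""
    def ev(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError(f"unsupported expression: {expr!r}")
    return str(ev(ast.parse(expr, mode="eval").body))


# Stub retrieval and translation tools backed by tiny lookup tables.
_DOCS = {"capital of france": "Paris is the capital of France."}
_DICTIONARY = {"bonjour": "hello"}

TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "search": lambda q: _DOCS.get(q.lower(), "no result"),
    "translate": lambda w: _DICTIONARY.get(w.lower(), "unknown word"),
}

Step = Tuple[str, str]  # (tool name or "final", payload)


def run_turn(
    policy: Callable[[str, List[str]], Step],
    user_msg: str,
    tools: Dict[str, Callable[[str], str]],
    max_calls: int = 4,  # cap on tool calls per turn, an arbitrary choice
) -> Tuple[str, List[str]]:
    """Alternate between policy decisions and tool calls until a final answer."""
    evidence: List[str] = []
    for _ in range(max_calls):
        action, payload = policy(user_msg, evidence)
        if action == "final":
            return payload, evidence
        tool = tools.get(action)
        result = tool(payload) if tool else f"unknown tool {action!r}"
        evidence.append(f"{action}({payload}) -> {result}")
    return "I could not ground an answer.", evidence


def toy_policy(user_msg: str, evidence: List[str]) -> Step:
    # First ask the calculator, then answer using what it returned.
    if not evidence:
        return ("calculator", "12 * 7")
    return ("final", "12 * 7 = " + evidence[-1].split(" -> ")[1])


answer, trace = run_turn(toy_policy, "What is 12 times 7?", TOOLS)
```

Keeping the evidence trace alongside the answer is what makes a groundedness check possible afterwards: a rater or metric can compare the reply against the tool outputs it was supposedly built from.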

If this is right

Dialog systems can filter outputs for consistency with chosen values before they reach users.
Models can produce answers that cite or derive from retrieved sources instead of generating from memory alone.
Targeted fine-tuning and tool access can outperform further scaling for safety and factuality.
Education and recommendation applications show gains in helpfulness and consistency when these methods are applied.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same fine-tuning and tool-use pattern could be applied to other open-ended generation tasks where alignment and verifiability matter.
Expanding the set of external tools might allow the model to handle additional reasoning steps not covered by the current three.
Safety metrics built on a limited illustrative value set leave room for later expansion or crowdsourced refinement.

Load-bearing premise

The selected human values for annotation and the three chosen external tools are sufficient to cover safety and factuality needs across open-ended real-world conversations.

What would settle it

Run the model on prompts involving values outside the annotated set or facts absent from the retrieval, translation, and calculator tools; if the rate of unsafe or ungrounded replies stays as high as in the base model, the claimed improvements do not hold.
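
The test described above amounts to comparing violation rates on out-of-scope prompts. A minimal sketch follows, assuming each reply has already been labeled unsafe or ungrounded (`True`) by human raters; the function names and the `min_drop` margin are hypothetical choices, not anything specified by the paper.

```python
from typing import Sequence


def violation_rate(flags: Sequence[bool]) -> float:
    """Fraction of replies flagged as unsafe or ungrounded."""
    if not flags:
        raise ValueError("need at least one labeled reply")
    return sum(flags) / len(flags)


def improvement_holds(
    base_flags: Sequence[bool],
    tuned_flags: Sequence[bool],
    min_drop: float = 0.05,  # smallest rate reduction counted as a real gain
) -> bool:
    """True if the fine-tuned model's violation rate is lower by at least min_drop."""
    return violation_rate(base_flags) - violation_rate(tuned_flags) >= min_drop


# Toy labels on the same ten out-of-scope prompts (made-up data).
base = [True] * 6 + [False] * 4
tuned = [True] * 5 + [False] * 5
holds = improvement_holds(base, tuned)
```

A fixed margin is a crude decision rule; with real data one would also want an uncertainty estimate on the rate difference before drawing a conclusion.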

Original abstract

We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

LaMDA shows that fine-tuning on safety annotations plus tool access improves dialog safety and grounding, but only on a narrow illustrative set of values and a basic trio of tools.

The main point here is that LaMDA shows how to get better safety and factual grounding in large dialog models by fine-tuning on annotated data and giving the model access to external tools, though the improvements are shown only for a limited set of values and tools. The new part is the LaMDA family itself, pre-trained on dialog data, along with the safety classifier trained on crowdworker annotations and the groundedness metric that checks if responses are based on tool outputs rather than just plausible text. They also test the model in education and recommendation scenarios. This is useful because it moves beyond pure scaling and gives concrete mechanisms for two problems that matter for real applications.

The soft spots are in the scope. The safety work uses an illustrative set of human values, and the tools are limited to information retrieval, a translator, and a calculator. As the stress test points out, nothing ensures these cover the full range of safety issues or knowledge requirements in open dialogs. The abstract gives no quantitative results or baselines, which makes it hard to judge the actual size of the gains or their reliability. If the full paper has solid numbers and more details, that would help.

This is for researchers focused on making language models work in interactive settings, especially around safety and reliability. Readers who want practical ideas for tool integration or annotation-based fine-tuning will find it worthwhile. It deserves a serious referee because it introduces a new model line with targeted fixes for known issues in dialog systems. I would recommend sending it to peer review.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces LaMDA, a family of Transformer-based language models up to 137B parameters pre-trained on 1.56T words of public dialog data and web text. It claims that scaling improves overall dialog quality but yields limited gains on safety and factual grounding. The authors demonstrate that fine-tuning a classifier on a small set of crowdworker-annotated data to filter unsafe responses, combined with enabling the model to consult external tools (information retrieval, translator, calculator), produces improvements on a safety metric derived from an illustrative set of human values and a groundedness metric. The work also includes qualitative explorations of LaMDA in education and content-recommendation domains.

Significance. If the reported gains hold under broader testing, the paper supplies a practical, scalable recipe for mitigating two persistent limitations of large dialog models. The explicit integration of external knowledge sources rather than sole reliance on parametric memory is a clear methodological contribution that later systems have adopted. The scale of the pre-training corpus and the separation of safety fine-tuning from tool-augmented decoding are additional strengths that provide a concrete baseline for subsequent research.

major comments (2)

[Safety and factual grounding sections] The safety metric is defined over an illustrative set of human values and the groundedness metric depends on the fixed trio of external tools. Because both the training signal and the evaluation metric are constructed from the same limited annotation set and tool interfaces, the measured improvements may be artifacts of the chosen scope rather than robust advances on the broader challenges of safety and factual grounding. An out-of-distribution test set or independently sourced value specification is needed to substantiate the central claim.
[Results sections] The manuscript asserts 'significant improvements' yet supplies no error bars, confidence intervals, or statistical significance tests for the safety and groundedness scores. Without these quantities it is impossible to judge whether the observed deltas exceed what could be obtained by alternative fine-tuning regimes or are reliable across random seeds.

minor comments (3)

[Abstract] The phrase 'less improvements' is grammatically imprecise and should be replaced by 'smaller improvements' or 'limited improvements'.
[Methods and figures] Figure captions and tool-integration diagrams: the description of how tool calls are interleaved with generation is terse; a short pseudocode snippet or expanded caption would improve reproducibility.
[Related work] Several contemporaneous papers on tool-augmented language models and safety fine-tuning are not cited; adding them would better situate the contribution.

Simulated Author's Rebuttal

2 responses · 1 unresolved

Thank you for the constructive feedback on our LaMDA manuscript. We respond to each major comment below, providing clarifications and indicating where revisions can be made to address the concerns.

Referee: [Safety and factual grounding sections] The safety metric is defined over an illustrative set of human values and the groundedness metric depends on the fixed trio of external tools. Because both the training signal and the evaluation metric are constructed from the same limited annotation set and tool interfaces, the measured improvements may be artifacts of the chosen scope rather than robust advances on the broader challenges of safety and factual grounding. An out-of-distribution test set or independently sourced value specification is needed to substantiate the central claim.

Authors: The manuscript explicitly describes the safety values as 'illustrative' and the tools as representative examples of external knowledge sources. The improvements demonstrated are specific to this setup, showing that the fine-tuning and tool-use approach can enhance performance on these metrics. We agree that the claims are scoped to the chosen annotations and tools, and we can revise the text to emphasize the illustrative nature and discuss how the framework generalizes to other value sets or tools. However, conducting new out-of-distribution evaluations would require additional crowdworker annotations and experiments not included in the current work.

revision: partial

Referee: [Results sections] The manuscript asserts 'significant improvements' yet supplies no error bars, confidence intervals, or statistical significance tests for the safety and groundedness scores. Without these quantities it is impossible to judge whether the observed deltas exceed what could be obtained by alternative fine-tuning regimes or are reliable across random seeds.

Authors: We acknowledge the absence of statistical measures in the reported results. The evaluations were performed using fixed test sets derived from the annotations, and the improvements are presented as direct comparisons. In a revision, we can include error bars estimated via bootstrap resampling or multiple evaluation runs where applicable, and clarify the evaluation methodology to allow assessment of reliability. This will strengthen the presentation without altering the core findings.

revision: yes
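
The bootstrap error bars mentioned in the response could be computed along these lines. This is a generic percentile-bootstrap sketch, assuming per-response binary labels (1 = judged safe or grounded) on a fixed test set; the function name, the number of resamples, and the toy labels are illustrative and not drawn from the paper.

```python
import random
from typing import List, Sequence, Tuple


def bootstrap_ci(
    labels: Sequence[int],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (point estimate, lower bound, upper bound) for the mean of labels."""
    if not labels:
        raise ValueError("need at least one label")
    rng = random.Random(seed)  # seeded so the interval is reproducible
    n = len(labels)
    means: List[float] = []
    for _ in range(n_resamples):
        # Resample the test set with replacement and record the resampled mean.
        sample = rng.choices(labels, k=n)
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(labels) / n, lower, upper


# Toy labels: 80 of 100 responses judged acceptable (made-up data).
point, low, high = bootstrap_ci([1] * 80 + [0] * 20)
```

Resampling responses only captures evaluation-set noise; variation across training seeds, which the referee also raises, would need separate fine-tuning runs.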

standing simulated objections not resolved

We cannot provide out-of-distribution test sets or independently sourced value specifications, as this would necessitate new data collection efforts beyond the scope of the presented experiments.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper's improvements in safety and factual grounding are demonstrated via fine-tuning on externally sourced crowdworker annotations for an illustrative set of human values and via consultation of independent external tools (IR system, translator, calculator). The safety metric and groundedness metric are defined against these separate annotations and known sources rather than quantities derived from the model's own outputs or fitted parameters. No load-bearing step reduces by construction to self-defined inputs, fitted subsets renamed as predictions, or self-citation chains; the experimental results remain falsifiable against the external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract contains no explicit mathematical derivations, free parameters, or invented entities; the work is framed as empirical engineering on top of standard transformer pre-training.

Forward citations

Cited by 44 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
cs.CL 2022-01 accept novelty 9.0

Chain-of-thought prompting, by including intermediate reasoning steps in few-shot examples, elicits strong reasoning abilities in large language models on arithmetic, commonsense, and symbolic tasks.
AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents
cs.CR 2024-06 unverdicted novelty 8.0

AgentDojo introduces an extensible evaluation framework populated with realistic agent tasks and security test cases to measure prompt injection robustness in tool-using LLM agents.
MusicLM: Generating Music From Text
cs.SD 2023-01 conditional novelty 8.0

MusicLM produces coherent multi-minute 24 kHz music from text prompts using hierarchical sequence-to-sequence modeling and outperforms prior systems in quality and text adherence.
Combining On-Policy Optimization and Distillation for Long-Context Reasoning in Large Language Models
cs.CL 2026-05 unverdicted novelty 7.0

dGRPO merges outcome-based policy optimization with dense teacher guidance from on-policy distillation, yielding more stable long-context reasoning on the new LongBlocks synthetic dataset.
IRIS: Interpolative Rényi Iterative Self-play for Large Language Model Fine-Tuning
cs.LG 2026-04 unverdicted novelty 7.0

IRIS unifies self-play fine-tuning under an interpolative Rényi objective with adaptive alpha scheduling and reports better benchmark scores than baselines while surpassing full supervised fine-tuning with only 13% of...
QLoRA: Efficient Finetuning of Quantized LLMs
cs.LG 2023-05 conditional novelty 7.0

QLoRA finetunes 4-bit quantized LLMs via LoRA adapters to match full-precision performance while using far less memory, enabling 65B-scale training on single GPUs and producing Guanaco models near ChatGPT level.
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
cs.CL 2022-11 unverdicted novelty 7.0

PoT prompting improves numerical reasoning by having language models write programs executed by a computer instead of performing calculations in natural language chains of thought, with an average 12% gain over CoT.
A Generalist Agent
cs.AI 2022-05 accept novelty 7.0

Gato is a multi-modal, multi-task, multi-embodiment generalist policy using one transformer network to handle text, vision, games, and robotics tasks.
OPT: Open Pre-trained Transformer Language Models
cs.CL 2022-05 unverdicted novelty 7.0

OPT releases open decoder-only transformers up to 175B parameters that match GPT-3 performance at one-seventh the carbon cost, along with code and training logs.
Flamingo: a Visual Language Model for Few-Shot Learning
cs.CV 2022-04 unverdicted novelty 7.0

Flamingo models reach new state-of-the-art few-shot results on image and video tasks by bridging frozen vision and language models with cross-attention layers trained on interleaved web-scale data.
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
cs.RO 2022-04 accept novelty 7.0

SayCan combines an LLM's high-level semantic knowledge with robot skill value functions to select only feasible actions, enabling completion of abstract natural-language instructions on a real mobile manipulator.
Guaranteed Jailbreaking Defense via Disrupt-and-Rectify Smoothing
cs.CR 2026-05 unverdicted novelty 6.0

DR-Smoothing introduces a disrupt-then-rectify prompt processing scheme into smoothing defenses, delivering tight theoretical bounds on success probability against both token- and prompt-level jailbreaks.
Response Time Enhances Alignment with Heterogeneous Preferences
cs.LG 2026-05 unverdicted novelty 6.0

Response times modeled as drift-diffusion processes enable consistent estimation of population-average preferences from heterogeneous anonymous binary choices.
A Meta Reinforcement Learning Approach to Goals-Based Wealth Management
cs.LG 2026-05 unverdicted novelty 6.0

MetaRL pre-trained on GBWM problems delivers near-optimal dynamic strategies in 0.01s achieving 97.8% of DP optimal utility and handles larger problems where DP fails.
CleanBase: Detecting Malicious Documents in RAG Knowledge Databases
cs.CR 2026-05 unverdicted novelty 6.0

CleanBase identifies malicious documents in RAG databases by detecting cliques in a semantic similarity graph constructed using embedding models and a statistical threshold.
RaTA-Tool: Retrieval-based Tool Selection with Multimodal Large Language Models
cs.CV 2026-04 unverdicted novelty 6.0

RaTA-Tool retrieves suitable external tools for multimodal queries by matching generated task descriptions against tool metadata, supported by a new Hugging Face-derived dataset and DPO optimization.
ADAPTive Input Training for Many-to-One Pre-Training on Time-Series Classification
cs.LG 2026-04 unverdicted novelty 6.0

ADAPT is a new pre-training paradigm that aligns physical properties of time-series data to allow simultaneous training on 162 diverse classification datasets, achieving new state-of-the-art performance.
Towards an AI co-scientist
cs.AI 2025-02 unverdicted novelty 6.0

A multi-agent AI system generates novel biomedical hypotheses that show promising experimental validation in drug repurposing for leukemia, new targets for liver fibrosis, and a bacterial gene transfer mechanism.
Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks
cs.CV 2024-01 unverdicted novelty 6.0

Grounded SAM integrates Grounding DINO and SAM to support text-prompted open-world detection and segmentation, achieving 48.7 mean AP on SegInW zero-shot with the base detector and huge segmenter.
MiniLLM: On-Policy Distillation of Large Language Models
cs.CL 2023-06 conditional novelty 6.0

MiniLLM distills large language models into smaller ones via reverse KL divergence and on-policy optimization, yielding higher-quality responses with lower exposure bias than standard KD baselines.
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
cs.CL 2023-06 unverdicted novelty 6.0

Properly filtered web data from CommonCrawl alone trains LLMs that significantly outperform models trained on The Pile, with 600 billion tokens and 1.3B/7.5B parameter models released.
Gorilla: Large Language Model Connected with Massive APIs
cs.CL 2023-05 conditional novelty 6.0

Gorilla is a fine-tuned LLM that surpasses GPT-4 in accurate API call generation and uses retrieval to handle documentation updates.
Improving Factuality and Reasoning in Language Models through Multiagent Debate
cs.CL 2023-05 unverdicted novelty 6.0

Multiagent debate among LLMs improves mathematical reasoning, strategic reasoning, and factual accuracy while reducing hallucinations.
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
cs.AI 2023-03 conditional novelty 6.0

CAMEL proposes a role-playing framework with inception prompting that enables autonomous multi-agent cooperation among LLMs and generates conversational data for studying their behaviors.
BloombergGPT: A Large Language Model for Finance
cs.LG 2023-03 conditional novelty 6.0

BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.
PaLM-E: An Embodied Multimodal Language Model
cs.LG 2023-03 conditional novelty 6.0

PaLM-E is a single 562B-parameter multimodal model that performs embodied reasoning tasks like robotic manipulation planning and visual question answering by interleaving vision, state, and text inputs with positive t...
Multimodal Chain-of-Thought Reasoning in Language Models
cs.CL 2023-02 accept novelty 6.0

Multimodal-CoT achieves state-of-the-art on ScienceQA by using a two-stage process that incorporates vision into chain-of-thought rationale generation for models under 1 billion parameters.
Improving alignment of dialogue agents via targeted human judgements
cs.LG 2022-09 unverdicted novelty 6.0

Sparrow uses targeted rule-based human feedback and evidence provision to outperform baselines in preference while violating rules only 8% of the time under adversarial probing.
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
cs.CL 2022-08 accept novelty 6.0

RLHF-aligned language models show increasing resistance to red teaming with scale up to 52B parameters, unlike prompted or rejection-sampled models, supported by a released dataset of 38,961 attacks.
Language Models (Mostly) Know What They Know
cs.CL 2022-07 unverdicted novelty 6.0

Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
cs.CV 2022-06 unverdicted novelty 6.0

Scaling an autoregressive Transformer to 20B parameters for text-to-image generation using image token sequences achieves new SOTA zero-shot FID of 7.23 and fine-tuned FID of 3.22 on MS-COCO.
Emergent Abilities of Large Language Models
cs.CL 2022-06 unverdicted novelty 6.0

Emergent abilities are capabilities present in large language models but absent in smaller ones and cannot be predicted by extrapolating smaller model performance.
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
cs.CL 2022-04 unverdicted novelty 6.0

RLHF alignment training on language models boosts NLP performance, supports skill specialization, enables weekly online updates with fresh human data, and shows a linear relation between RL reward and sqrt(KL divergen...
PaLM: Scaling Language Modeling with Pathways
cs.CL 2022-04 accept novelty 6.0

PaLM 540B demonstrates continued scaling benefits by setting new few-shot SOTA results on hundreds of benchmarks and outperforming humans on BIG-bench.
Fast NF4 Dequantization Kernels for Large Language Model Inference
cs.LG 2026-04 unverdicted novelty 5.0

A lightweight shared-memory technique for NF4 dequantization kernels yields 2.0-2.2x kernel speedup and 1.54x end-to-end gains on models up to 70B parameters while using only 64 bytes of shared memory per block.
PaLM 2 Technical Report
cs.CL 2023-05 unverdicted novelty 5.0

PaLM 2 reports state-of-the-art results on language, reasoning, and multilingual tasks with improved efficiency over PaLM.
StarCoder: may the source be with you!
cs.CL 2023-05 accept novelty 5.0

StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.
Constitutional AI: Harmlessness from AI Feedback
cs.CL 2022-12 unverdicted novelty 5.0

Galactica: A Large Language Model for Science
cs.CL 2022-11 unverdicted novelty 5.0

Galactica, a science-specialized LLM, reports higher scores than GPT-3, Chinchilla, and PaLM on LaTeX knowledge, mathematical reasoning, and medical QA benchmarks while outperforming general models on BIG-bench.
Useful for Exploration, Risky for Precision: Evaluating AI Tools in Academic Research
cs.AI 2026-05 unverdicted novelty 4.0

AI Q&A tools give useful overviews but fail at precise information extraction and source tracing, while literature review tools aid exploration yet lack reproducibility and transparency, making them unsuitable for sys...
Useful for Exploration, Risky for Precision: Evaluating AI Tools in Academic Research
cs.AI 2026-05 unverdicted novelty 4.0

AI tools deliver useful overviews for research exploration but prove unreliable for precise information extraction and systematic reviews due to low explainability, reproducibility, and transparency.
On The Application of Linear Attention in Multimodal Transformers
cs.CV 2026-04 unverdicted novelty 4.0

Linear attention delivers significant computational savings in multimodal transformers and follows the same scaling laws as softmax attention on ViT models trained on LAION-400M with ImageNet-21K zero-shot validation.
Large Language Models: A Survey
cs.CL 2024-02 accept novelty 3.0

The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.
A Survey of Large Language Models
cs.CL 2023-03 accept novelty 3.0

This survey reviews the background, key techniques, and evaluation methods for large language models, emphasizing emergent abilities that appear at large scales.

Reference graph

Works this paper leans on

120 extracted references · 120 canonical work pages · cited by 43 Pith papers · 8 internal anchors

[1]

Skip-thought vectors

Ryan Kiros, Yukun Zhu, Ruslan R Salakhutdinov, Richard Zemel, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. Skip-thought vectors. In Advances in Neural Information Processing Systems, pages 3294–3302, 2015

work page 2015
[2]

Semi-supervised sequence learning

Andrew M Dai and Quoc V Le. Semi-supervised sequence learning. In Advances in Neural Information Processing Systems, 2015

work page 2015
[3]

Deep contextualized word representations

Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettle- moyer. Deep contextualized word representations. In NAACL, 2018

work page 2018
[5]

Improving language understanding by generative pre-training

Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. Improving language understanding by generative pre-training. https://blog.openai.com/language-unsupervised, 2018

work page 2018
[6]

BERT: Pre-training of deep bidirectional transformers for language understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, 2019

work page 2019
[7]

XLNet: Generalized autoregressive pretraining for language understanding

Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, and Quoc V Le. XLNet: Generalized autoregressive pretraining for language understanding. In NeurIPS, 2019

work page 2019
[8]

Albert: A lite bert for self-supervised learning of language representations

Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. Albert: A lite bert for self-supervised learning of language representations. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=H1eA7AEtvS

work page 2020
[9]

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. RoBERTa: A robustly optimized BERT pretraining approach.arXiv preprint arXiv:1907.11692, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1907
[10]

Le, and Christopher D

Kevin Clark, Minh-Thang Luong, Quoc V . Le, and Christopher D. Manning. ELECTRA: Pre-training text encoders as discriminators rather than generators. In ICLR, 2020

work page 2020
[11]

Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. Exploring the limits of transfer learning with a uniﬁed text-to-text transformer. Journal of Machine Learning Research, 2020

work page 2020
[12]

Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-V oss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwi...

work page 2020
[13]

Scaling Laws for Neural Language Models

Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models. arXiv preprint arXiv:2001.08361, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2001
[14]

Neural responding machine for short-text conversation

Lifeng Shang, Zhengdong Lu, and Hang Li. Neural responding machine for short-text conversation. In ACL, 2015. 19

work page 2015
[15]

A neural network approach to context-sensitive generation of conversational responses

Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Margaret Mitchell, Jian-Yun Nie, Jianfeng Gao, and Bill Dolan. A neural network approach to context-sensitive generation of conversational responses. arXiv preprint arXiv:1506.06714, 2015

work page arXiv 2015
[16]

Oriol Vinyals and Quoc V . Le. A neural conversational model. In ICML Workshop, 2015

work page 2015
[17]

arXiv preprint arXiv:2001.09977 , year=

Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, and Quoc V . Le. Towards a human-like open-domain chatbot. arXiv preprint arXiv:2001.09977, 2020

[18]

Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, and Jason Weston. Recipes for building an open-domain chatbot. arXiv preprint arXiv:2004.13637, 2020

[19]

Recurrent neural network based language model

Tomas Mikolov, Martin Karafiát, Lukas Burget, Jan Černocký, and Sanjeev Khudanpur. Recurrent neural network based language model. In INTERSPEECH, 2010

[20]

Generating text with recurrent neural networks

Ilya Sutskever, James Martens, and Geoffrey E Hinton. Generating text with recurrent neural networks. In ICML, 2011

[21]

Exploring the Limits of Language Modeling

Rafal Józefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, and Yonghui Wu. Exploring the limits of language modeling. arXiv preprint arXiv:1602.02410, 2016

[22]

Universal language model fine-tuning for text classification

Jeremy Howard and Sebastian Ruder. Universal language model fine-tuning for text classification. In ACL, 2018

[23]

Unsupervised representation learning with deep convolutional generative adversarial networks

Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. In ICLR, 2016

[24]

Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, ...

[25]

Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander H. Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W. Black, Alexander I. Rudnicky, Jason Williams, Joelle Pineau, Mikhail S. Burtsev, and Jason Weston. The second conversational intelligence challenge (convai2). The NeurIPS ’18 Com...

[26]

Personalizing dialogue agents: I have a dog, do you have pets too?

Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, and Jason Weston. Personalizing dialogue agents: I have a dog, do you have pets too? ACL, 2018

[27]

A Diversity-Promoting Objective Function for Neural Conversation Models

Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. A diversity-promoting objective function for neural conversation models. arXiv preprint arXiv:1510.03055, 2015

[28]

Generative deep neural networks for dialogue: A short review

Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, and Joelle Pineau. Generative deep neural networks for dialogue: A short review. arXiv preprint arXiv:1611.06216, 2016

[29]

Transfertransfo: A transfer learning approach for neural network based conversational agents

Thomas Wolf, Victor Sanh, Julien Chaumond, and Clement Delangue. Transfertransfo: A transfer learning approach for neural network based conversational agents. In NeurIPS Workshop on Conversational AI, 2019

[30]

Dialogpt: Large-scale generative pre-training for conversational response generation

Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, and Bill Dolan. Dialogpt: Large-scale generative pre-training for conversational response generation. arXiv preprint arXiv:1911.00536, 2019

[31]

Retrieval augmentation reduces hallucination in conversation

Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, and Jason Weston. Retrieval augmentation reduces hallucination in conversation. arXiv preprint arXiv:2104.07567, 2021

[32]

Adam Roberts, Colin Raffel, and Noam Shazeer. How much knowledge can you pack into the parameters of a language model? In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, pages 5418–5426, November 2020

[33]

Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten Bosma, Zongwei Zhou, Tao Wang, Yu Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathy Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc V Le, Yonghui Wu, Zhifeng Chen, a...

[34]

Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, and Mike Lewis. Generalization through memorization: Nearest neighbor language models. arXiv preprint arXiv:1911.00172, 2019

[35]

Retrieval-augmented generation for knowledge-intensive nlp tasks

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive nlp tasks. NeurIPS, 2020

[36]

Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, and Ming-Wei Chang. Realm: Retrieval-augmented language model pre-training. arXiv preprint arXiv:2002.08909, 2020

[37]

Leveraging passage retrieval with generative models for open domain question answering

Gautier Izacard and Edouard Grave. Leveraging passage retrieval with generative models for open domain question answering. arXiv preprint arXiv:2007.01282, 2021

[38]

Retrieving and reading: A comprehensive survey on open-domain question answering

Fengbin Zhu, Wenqiang Lei, Chao Wang, Jianming Zheng, Soujanya Poria, and Tat-Seng Chua. Retrieving and reading: A comprehensive survey on open-domain question answering. arXiv preprint arXiv:2101.00774, 2021

[39]

Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. Dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2004.04906, 2020

[40]

A modern perspective on query likelihood with deep generative retrieval models

Oleg Lesota, Navid Rekabsaz, Daniel Cohen, Klaus Antonius Grasserbauer, Carsten Eickhoff, and Markus Schedl. A modern perspective on query likelihood with deep generative retrieval models. arXiv preprint arXiv:2106.13618, 2021

[41]

Improving language models by retrieving from trillions of tokens

Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Ori...

[42]

Tickettalk: Toward human-level performance with end-to-end, transaction-based dialog systems

Bill Byrne, Karthik Krishnamoorthi, Saravanan Ganesh, and Mihir Sanjay Kale. Tickettalk: Toward human-level performance with end-to-end, transaction-based dialog systems. arXiv preprint arXiv:2012.12458, 2020

[43]

Reason first, then respond: Modular generation for knowledge-infused dialogue

Leonard Adolphs, Kurt Shuster, Jack Urbanek, Arthur Szlam, and Jason Weston. Reason first, then respond: Modular generation for knowledge-infused dialogue. arXiv preprint arXiv:2111.05204, 2021

[44]

WebGPT: Browser-assisted question-answering with human feedback

Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, and John Schulman. Webgpt: Browser-assisted question-answering with human feedback. arXiv preprint arX...

[45]

Internet-augmented dialogue generation

Mojtaba Komeili, Kurt Shuster, and Jason Weston. Internet-augmented dialogue generation. arXiv preprint arXiv:2107.07566, 2021

[46]

Usr: An unsupervised and reference free evaluation metric for dialog generation

Shikib Mehri and Maxine Eskenazi. Usr: An unsupervised and reference free evaluation metric for dialog generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 681–707, 2020

[47]

BLEU: a method for automatic evaluation of machine translation

Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. BLEU: a method for automatic evaluation of machine translation. In ACL, 2002

[48]

How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation

Chia-Wei Liu, Ryan Lowe, Iulian Serban, Mike Noseworthy, Laurent Charlin, and Joelle Pineau. How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

[49]

What makes a good conversation? how controllable attributes affect human judgments

Abigail See, Stephen Roller, Douwe Kiela, and Jason Weston. What makes a good conversation? how controllable attributes affect human judgments. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, 2019

[50]

Acute-eval: Improved dialogue evaluation with optimized questions and multi-turn comparisons

Margaret Li, Jason Weston, and Stephen Roller. Acute-eval: Improved dialogue evaluation with optimized questions and multi-turn comparisons. In NeurIPS workshop on Conversational AI, 2019

[51]

Treating dialogue quality evaluation as an anomaly detection problem

Rostislav Nedelchev, Jens Lehmann, and Ricardo Usbeck. Treating dialogue quality evaluation as an anomaly detection problem. In Proceedings of the 12th Conference on Language Resources and Evaluation, pages 508–512, 2020

[52]

On evaluating and comparing conversational agents

Anu Venkatesh, Chandra Khatri, Ashwin Ram, Fenfei Guo, Raefer Gabriel, Ashish Nagar, Rohit Prasad, Ming Cheng, Behnam Hedayatnia, Angeliki Metallinou, Rahul Goel, Shaohua Yang, and Anirudh Raju. On evaluating and comparing conversational agents. NeurIPS, 2017

[53]

Emily Dinan, Gavin Abercrombie, A. Stevie Bergman, Shannon Spruit, Dirk Hovy, Y-Lan Boureau, and Verena Rieser. Anticipating safety issues in e2e conversational ai: Framework and tooling. arXiv preprint arXiv:2107.03451, 2021

[54]

Ethical and social risks of harm from Language Models

Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, and Iason Gabriel. Ethical and ...

[55]

Dropout distillation

Samuel Rota Bulò, Lorenzo Porzi, and Peter Kontschieder. Dropout distillation. In ICLR, 2016

[56]

The radicalization risks of GPT-3 and advanced neural language models

Kris McGuffie and Alex Newhouse. The radicalization risks of GPT-3 and advanced neural language models. arXiv preprint arXiv:2009.06807, 2020

[57]

Persistent anti-muslim bias in large language models

Abubakar Abid, Maheen Farooqi, and James Zou. Persistent anti-muslim bias in large language models. arXiv preprint arXiv:2101.05783, 2021

[58]

Man is to computer programmer as woman is to homemaker? debiasing word embeddings

Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. Man is to computer programmer as woman is to homemaker? debiasing word embeddings. In Advances in Neural Information Processing Systems, 2016

[59]

Christine Basta, Marta R. Costa-jussà, and Noe Casas. Evaluating the underlying gender bias in contextualized word embeddings. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, August 2019

[60]

Measuring bias in contextualized word representations

Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W Black, and Yulia Tsvetkov. Measuring bias in contextualized word representations. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, August 2019

[61]

Haoran Zhang, Amy X. Lu, Mohamed Abdalla, Matthew McDermott, and Marzyeh Ghassemi. Hurtful words: Quantifying biases in clinical contextual word embeddings. In Proceedings of the ACM Conference on Health, Inference, and Learning, 2020

[62]

The woman worked as a babysitter: On biases in language generation

Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng. The woman worked as a babysitter: On biases in language generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019

[63]

Gender bias in contextualized word embeddings

Jieyu Zhao, Tianlu Wang, Mark Yatskar, Ryan Cotterell, Vicente Ordonez, and Kai-Wei Chang. Gender bias in contextualized word embeddings. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), June 2019

[64]

Detecting emergent intersectional biases: Contextualized word embeddings contain a distribution of human-like biases

Wei Guo and Aylin Caliskan. Detecting emergent intersectional biases: Contextualized word embeddings contain a distribution of human-like biases. arXiv preprint arXiv:2006.03955, 2020

[65]

Perturbation sensitivity analysis to detect unintended model biases

Vinodkumar Prabhakaran, Ben Hutchinson, and Margaret Mitchell. Perturbation sensitivity analysis to detect unintended model biases. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2019

[66]

Chandler May, Alex Wang, Shikha Bordia, Samuel R. Bowman, and Rachel Rudinger. On measuring social biases in sentence encoders. arXiv preprint arXiv:1903.10561, 2019

[67]

Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, and Noah A. Smith. The risk of racial bias in hate speech detection. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019

[68]

Shikha Bordia and Samuel R. Bowman. Identifying and reducing gender bias in word-level language models. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, 2019

[69]

On the dangers of stochastic parrots: Can language models be too big?

Emily M Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021

[70]

Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, and Yejin Choi. Social bias frames: Reasoning about social and power implications of language. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

[71]

Social biases in NLP models as barriers for persons with disabilities

Ben Hutchinson, Vinodkumar Prabhakaran, Emily Denton, Kellie Webster, Yu Zhong, and Stephen Denuyl. Social biases in NLP models as barriers for persons with disabilities. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

[72]

Large language models associate muslims with violence

Abubakar Abid, Maheen Farooqi, and James Zou. Large language models associate muslims with violence. Nature Machine Intelligence, 2021

[73]

Extracting training data from large language models

Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel. Extracting training data from large language models. arXiv preprint arXiv:2012.07805, 2020

[74]

Sahaj Garg, Vincent Perot, Nicole Limtiaco, Ankur Taly, Ed H. Chi, and Alex Beutel. Counterfactual fairness in text classification through robustness. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 2019. ISBN 9781450363242

[75]

Reducing sentiment bias in language models via counterfactual evaluation

Po-Sen Huang, Huan Zhang, Ray Jiang, Robert Stanforth, Johannes Welbl, Jack Rae, Vishal Maini, Dani Yogatama, and Pushmeet Kohli. Reducing sentiment bias in language models via counterfactual evaluation. In EMNLP (Findings), 2020

[76]

A scalable approach to reducing gender bias in google translate

Melvin Johnson. A scalable approach to reducing gender bias in google translate. https://ai.googleblog.com/2020/04/a-scalable-approach-to-reducing-gender.html, 2020

[77]

Reducing gender bias in word-level language models with a gender-equalizing loss function

Yusu Qian, Urwa Muaz, Ben Zhang, and Jae Won Hyun. Reducing gender bias in word-level language models with a gender-equalizing loss function. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, July 2019

[78]

Towards debiasing sentence representations

Paul Pu Liang, Irene Mengze Li, Emily Zheng, Yao Chong Lim, Ruslan Salakhutdinov, and Louis-Philippe Morency. Towards debiasing sentence representations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, July 2020

[79]

Recipes for safety in open-domain chatbots

Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, and Emily Dinan. Recipes for safety in open-domain chatbots. arXiv preprint arXiv:2010.07079, 2020

[80]

Dexperts: Decoding-time controlled text generation with experts and anti-experts

Alisa Liu, Maarten Sap, Ximing Lu, Swabha Swayamdipta, Chandra Bhagavatula, Noah A. Smith, and Yejin Choi. On-the-fly controlled text generation with experts and anti-experts. arXiv preprint arXiv:2105.03023, 2021

[81]

Bot-adversarial dialogue for safe conversational agents

Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, and Emily Dinan. Bot-adversarial dialogue for safe conversational agents. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021


Showing first 80 references.