GLTR: Statistical Detection and Visualization of Generated Text

[GSR19] Sebastian Gehrmann, Hendrik Strobelt, Alexander M · 2019 · cs.CL · arXiv 1906.04043

9 Pith papers cite this work. Polarity classification is still indexing.

abstract

The rapid improvement of language models has raised the specter of abuse of text generation systems. This progress motivates the development of simple methods for detecting generated text that can be used by and explained to non-experts. We develop GLTR, a tool to support humans in detecting whether a text was generated by a model. GLTR applies a suite of baseline statistical methods that can detect generation artifacts across common sampling schemes. In a human-subjects study, we show that the annotation scheme provided by GLTR improves the human detection rate of fake text from 54% to 72% without any prior training. GLTR is open-source and publicly deployed, and has already been widely used to detect generated outputs.
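The abstract does not name the individual statistical methods. As a rough illustration only, the sketch below assumes per-token statistics of the kind such a detector might use: the probability of the observed token, its rank in the model's predicted distribution, and the entropy of that distribution. The function names, the toy distribution, and the 10/100/1000 bucket edges are all hypothetical choices made for this example and are not taken from this page.

```python
import math
from collections import Counter


def token_stats(dist, observed):
    """Return (probability, rank, entropy) for one observed token.

    `dist` is an assumed next-token distribution {token: probability}
    produced by some language model; here it is a hand-written toy.
    """
    p = dist[observed]
    # Rank 1 means the model considered this token the most likely one.
    rank = 1 + sum(1 for q in dist.values() if q > p)
    # Shannon entropy in bits of the whole predicted distribution.
    entropy = -sum(q * math.log2(q) for q in dist.values() if q > 0)
    return p, rank, entropy


def rank_bucket(rank, edges=(10, 100, 1000)):
    """Map a rank to a coarse bucket; the edges are illustrative."""
    for edge in edges:
        if rank <= edge:
            return f"top-{edge}"
    return "rest"


def bucket_fractions(ranks):
    """Fraction of tokens that fall into each rank bucket."""
    counts = Counter(rank_bucket(r) for r in ranks)
    total = len(ranks)
    return {name: n / total for name, n in counts.items()}


# Toy next-token distribution (hypothetical, sums to 1).
toy_dist = {"the": 0.5, "a": 0.25, "cat": 0.125, "dog": 0.125}
p, rank, entropy = token_stats(toy_dist, "cat")

# Hypothetical per-token ranks for a short passage.
fractions = bucket_fractions([1, 5, 50, 2000])
```

Under these assumptions, a passage whose tokens sit almost entirely in the lowest-rank bucket would look more like sampled model output than one with many high-rank tokens. This is a heuristic reading, not a calibrated classifier.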

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Language Models are Few-Shot Learners

cs.CL · 2020-05-28 · accept · novelty 8.0

GPT-3 shows that scaling an autoregressive language model to 175 billion parameters enables strong few-shot performance across diverse NLP tasks via in-context prompting without fine-tuning.

SV-Detect: AI-generated Text Detection with Steering Vectors

cs.CL · 2026-06-05 · unverdicted · novelty 7.0

Steering vectors from frozen LM layers enable a lightweight classifier to detect machine-generated text robustly across domains, source models, and editing attacks.

SoK: Exposing the Generation and Detection Gaps in LLM-Generated Phishing

cs.CR · 2025-08-29 · unverdicted · novelty 7.0

This SoK paper introduces a nine-stage taxonomy for LLM guardrail breaches in phishing, characterizes evasion and manipulation tactics, and identifies a dynamic-offense versus static-defense asymmetry.

ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability

cs.CL · 2025-02-17 · unverdicted · novelty 7.0

ExaGPT uses span-level similarity retrieval from human and LLM datastores to detect machine-generated text while supplying the matching spans as human-interpretable evidence, achieving up to 37-point accuracy gains over prior interpretable detectors at 1% FPR.

GigaCheck: Detecting LLM-generated Content via Object-Centric Span Localization

cs.CL · 2024-10-31 · unverdicted · novelty 6.0

GigaCheck detects LLM-generated text at both document and span levels by combining fine-tuned language-model embeddings with a DETR-like architecture that treats generated intervals as detectable objects.

Can AI-Generated Text be Reliably Detected?

cs.CL · 2023-03-17 · unverdicted · novelty 6.0

Recursive paraphrasing attacks substantially lower detection rates for multiple AI text detectors with only minor quality loss, while a theoretical analysis ties best-case AUROC to total variation distance between human and AI distributions.

Multi-Level Contextual Token Relation Modeling for Machine-Generated Text Detection

cs.CL · 2026-05-15 · unverdicted · novelty 5.0

A multi-level framework that models local and global relations among token detection scores to improve machine-generated text detection with low overhead.

Detecting LLM-Assisted Academic Dishonesty using Keystroke Dynamics

cs.HC · 2025-11-16 · unverdicted · novelty 5.0

Keystroke dynamics models outperform text-only detectors for spotting LLM-assisted academic dishonesty in practical scenarios, though performance drops under adversarial conditions.

Findings of the Counter Turing Test: AI-Generated Text Detection

cs.CL · 2026-05-20 · unverdicted · novelty 2.0 · 2 refs

Shared task findings show near-perfect binary detection of AI-generated text but greater difficulty in attributing outputs to particular language models.

citing papers explorer

Showing 9 of 9 citing papers.

Language Models are Few-Shot Learners

cs.CL · 2020-05-28 · accept · none · ref 18

SV-Detect: AI-generated Text Detection with Steering Vectors

cs.CL · 2026-06-05 · unverdicted · none · ref 25 · internal anchor

SoK: Exposing the Generation and Detection Gaps in LLM-Generated Phishing

cs.CR · 2025-08-29 · unverdicted · none · ref 37 · internal anchor

ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability

cs.CL · 2025-02-17 · unverdicted · none · ref 9 · internal anchor

GigaCheck: Detecting LLM-generated Content via Object-Centric Span Localization

cs.CL · 2024-10-31 · unverdicted · none · ref 18 · internal anchor

Can AI-Generated Text be Reliably Detected?

cs.CL · 2023-03-17 · unverdicted · none · ref 89 · internal anchor

Multi-Level Contextual Token Relation Modeling for Machine-Generated Text Detection

cs.CL · 2026-05-15 · unverdicted · none · ref 14 · internal anchor

Detecting LLM-Assisted Academic Dishonesty using Keystroke Dynamics

cs.HC · 2025-11-16 · unverdicted · none · ref 13 · internal anchor

Findings of the Counter Turing Test: AI-Generated Text Detection

cs.CL · 2026-05-20 · unverdicted · none · ref 8 · 2 links · internal anchor
