arXiv: 2603.00177 · v2 · submitted 2026-02-26 · 💻 cs.CR · cs.HC · cs.LG

Detecting Cognitive Signatures in Typing Behavior for Non-Intrusive Authorship Verification

David Condrey

Pith reviewed 2026-05-15 18:38 UTC · model grok-4.3

classification 💻 cs.CR · cs.HC · cs.LG

keywords keystroke dynamics · authorship verification · cognitive load · typing behavior · privacy preservation · AI text detection · non-intrusive authentication


The pith

Cognitive signatures in typing behavior are estimated, analytically and under stated assumptions, to distinguish genuine text composition from transcription with 85 to 95 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines a measure called Cognitive Load Correlation (CLC) that captures timing patterns in keystrokes during writing. These patterns reflect the mental effort of planning and revising text, which differs when someone is genuinely composing versus typing out existing text such as AI output. By collecting only timing data without content, the approach verifies authorship while protecting privacy. This matters because AI-generated text is hard to detect from the output alone, but the human process of composing text leaves detectable traces in typing rhythm. The method is shown analytically to work under its assumptions and is argued to resist simple forgery attempts.

Core claim

We define the Cognitive Load Correlation (CLC) from keystroke timing data and show that it separates genuine composition, which involves cognitive stages of planning and revision, from mechanical transcription of pre-existing text. Analytical evaluation grounded in large keystroke datasets estimates discrimination accuracy between 85 and 95 percent under stated assumptions, achieved by operating only on timing metadata to minimize privacy risks. The signatures resist forgery because they are linked to the semantic content being produced.

What carries the argument

The Cognitive Load Correlation (CLC), a measure of correlation in keystroke timing that reflects cognitive load during composition stages.
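
The text available here does not give the paper's formula for CLC, so the following is only a minimal sketch of what a timing-only correlation of this kind could look like. It assumes a hypothetical stand-in: the Pearson correlation between the log duration of each long pause and the number of keystrokes in the burst that follows it. The function names, the 500 ms pause threshold, and the choice of pause and burst features are all illustrative assumptions, not the author's definition.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when it is undefined."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def pause_burst_correlation(timestamps_ms, pause_threshold_ms=500.0):
    """Hypothetical CLC stand-in (assumption, not the paper's formula).

    Correlates the log duration of each long pause with the number of
    keystrokes typed before the next long pause. Only timestamps are used;
    no key identities or text content are needed.
    """
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    pauses, bursts = [], []
    current_pause, burst_len = None, 0
    for gap in intervals:
        if gap >= pause_threshold_ms:
            # A new long pause closes the previous pause/burst pair.
            if current_pause is not None:
                pauses.append(math.log(current_pause))
                bursts.append(burst_len)
            current_pause, burst_len = gap, 0
        elif current_pause is not None:
            burst_len += 1
    if current_pause is not None:
        pauses.append(math.log(current_pause))
        bursts.append(burst_len)
    return pearson(pauses, bursts)
```

Under this stand-in, a session with no long pauses at all (steady transcription) yields 0.0, while a session in which longer pauses precede longer bursts yields a value near 1. Whether real composition and transcription separate this cleanly is exactly what the paper's evaluation would need to show.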

Load-bearing premise

The assumptions behind the accuracy estimate hold, and cognitive signatures remain sufficiently entangled with semantic content to prevent successful timing forgery.

What would settle it

Collect typing data where participants transcribe text while deliberately varying their timing to mimic genuine composition patterns without actual cognitive planning, and check whether CLC still reliably distinguishes the two cases.

Figures

Figures reproduced from arXiv: 2603.00177 by David Condrey.

**Figure 2.** System architecture showing the transformation of …

**Figure 3.** Modeled privacy-utility tradeoff. Accuracy curve is …

Original abstract

The proliferation of AI-generated text has intensified the need for reliable authorship verification, yet current output-based methods are increasingly unreliable. We observe that the ordinary typing interface captures rich cognitive signatures, measurable patterns in keystroke timing that reflect the planning, translating, and revising stages of genuine composition. Drawing on large-scale keystroke datasets comprising over 136 million events, we define the Cognitive Load Correlation (CLC) and show it distinguishes genuine composition from mechanical transcription. We present a non-intrusive verification framework that operates within existing writing interfaces, collecting only timing metadata to preserve privacy. Our analytical evaluation estimates 85 to 95 percent discrimination accuracy under stated assumptions, while limiting biometric leakage via evidence quantization. We analyze the adversarial robustness of cognitive signatures, showing they resist timing-forgery attacks that defeat motor-level authentication because the cognitive channel is entangled with semantic content. We conclude that reframing authorship verification as a human-computer interaction problem provides a privacy-preserving alternative to invasive surveillance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper defines a Cognitive Load Correlation metric from keystroke timings to flag transcribed AI-generated text without reading content, but the 85-95% accuracy estimate rests on analytical steps and assumptions that are not shown.


The main thing to know is that this work shifts authorship verification away from text content toward typing behavior, defining a Cognitive Load Correlation (CLC) on a 136-million-event keystroke dataset to separate genuine composition from mechanical transcription of AI output. It frames the problem as an HCI issue and keeps the method non-intrusive by collecting only timing metadata plus some quantization to limit leakage.

That privacy angle is a clear positive, and the claim that cognitive signatures resist simple timing forgeries because they tie to semantic planning is a logical extension beyond standard keystroke dynamics. The large dataset size shows real data effort, and the adversarial robustness discussion at least identifies the right attack surface.

The soft spot is the central accuracy estimate. It comes from analytical evaluation under stated assumptions, yet the provided text gives no derivation, no data splits, no error bars, and no indication of how the dataset actually grounds or tests those assumptions. Without those steps visible, the 85-95% figure is hard to evaluate and carries the circularity risk the stress-test flagged. If the assumptions turn out loose in real typing sessions, the discrimination claim weakens.

This is aimed at people working on behavioral biometrics, privacy-preserving detection, or AI-content tools who want alternatives to content scanners. A reader already familiar with keystroke work could extract the framing and the CLC idea even if the numbers need more support. I would send it to peer review so referees can check the missing derivation and validation details; the idea is timely enough that the gaps are worth fixing rather than desk-rejecting outright.

Referee Report

2 major / 0 minor

Summary. The paper proposes detecting cognitive signatures in keystroke timing data to enable non-intrusive authorship verification. It defines the Cognitive Load Correlation (CLC) metric from a dataset of over 136 million keystroke events and claims this distinguishes genuine composition from mechanical transcription, with an analytical evaluation estimating 85-95% discrimination accuracy under stated assumptions. The framework limits biometric leakage via evidence quantization and argues robustness to timing-forgery attacks due to entanglement between cognitive signatures and semantic content.

Significance. If the analytical accuracy estimates hold and the assumptions regarding semantic entanglement prove valid, the approach could provide a meaningful privacy-preserving alternative to output-based or invasive biometric authorship verification methods, particularly as AI-generated text becomes more prevalent.

major comments (2)

Abstract: The central claim of 85-95% discrimination accuracy is described as resulting from 'analytical evaluation under stated assumptions,' but no derivation steps, quantification of semantic entanglement, data splits, error bars, or explicit use of the 136M-event dataset to ground or test the assumptions are provided, leaving the load-bearing accuracy figure unsupported.
Abstract: CLC is defined within the paper and directly employed to generate the accuracy estimate; without independent benchmarks, unfitted derivations, or cross-validation shown, this introduces a circularity risk where the metric may be tuned to the same data it evaluates.
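
To make concrete what an "analytical evaluation under stated assumptions" can amount to, here is a hedged sketch under one textbook assumption that is not taken from the paper: CLC scores for composition and transcription are modeled as two equally likely Gaussians with equal variance, separated by d' standard deviations. Under that model the best achievable accuracy is Φ(d'/2). The function names are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def accuracy_from_dprime(d_prime):
    """Best accuracy for two equal-variance, equally likely Gaussian classes.

    Assumption for illustration only; the paper's own model is not shown
    in the text reviewed here.
    """
    return STANDARD_NORMAL.cdf(d_prime / 2.0)


def dprime_for_accuracy(accuracy):
    """Class separation (in standard deviations) needed for a given accuracy."""
    return 2.0 * STANDARD_NORMAL.inv_cdf(accuracy)


if __name__ == "__main__":
    for target in (0.85, 0.95):
        print(f"accuracy {target:.2f} needs d' of about "
              f"{dprime_for_accuracy(target):.2f}")
```

Under this toy model, the 85 to 95 percent range corresponds to a class separation of roughly 2.1 to 3.3 standard deviations. The sketch shows why the referee's point matters: the accuracy figure follows directly from the assumed separation, so the estimate is only as good as the evidence for that separation.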

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which have prompted us to strengthen the clarity and evidentiary support in our presentation. We respond to each major comment below and have revised the manuscript to address the identified gaps.


Referee: Abstract: The central claim of 85-95% discrimination accuracy is described as resulting from 'analytical evaluation under stated assumptions,' but no derivation steps, quantification of semantic entanglement, data splits, error bars, or explicit use of the 136M-event dataset to ground or test the assumptions are provided, leaving the load-bearing accuracy figure unsupported.

Authors: We agree that the abstract was overly concise and did not sufficiently reference the supporting analysis. The full manuscript contains the analytical derivation in Section 4, but to make this immediately accessible we have expanded the abstract to outline the key derivation steps, the quantification of semantic entanglement via correlation measures, the 70/30 data splits, bootstrap-derived error bars, and the explicit grounding of assumptions in the 136 million keystroke events. We have also added a short methods paragraph in the introduction that cross-references these elements.

revision: yes

Referee: Abstract: CLC is defined within the paper and directly employed to generate the accuracy estimate; without independent benchmarks, unfitted derivations, or cross-validation shown, this introduces a circularity risk where the metric may be tuned to the same data it evaluates.

Authors: We acknowledge that the original presentation left open the possibility of perceived circularity. The CLC definition is derived from established cognitive-load models in the HCI and psychology literature and is not fitted to the evaluation dataset. The accuracy estimate itself is obtained analytically from modeled timing distributions under the stated assumptions rather than through data-driven optimization. To eliminate ambiguity we have added a dedicated subsection (4.3) that separates the theoretical definition, the unfitted analytical estimation procedure, and the empirical validation step; we also report cross-validation results on held-out data and an independent benchmark on a secondary keystroke corpus.

revision: yes
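
The simulated rebuttal mentions 70/30 data splits and bootstrap-derived error bars without showing either. As a rough sketch of what such a procedure usually looks like, the code below splits sessions 70/30 and computes a percentile bootstrap interval for accuracy on the held-out part. The function names, the 2000 resamples, and the 95 percent level are assumptions for illustration; nothing here reproduces the authors' actual pipeline.

```python
import random


def split_70_30(items, seed=0):
    """Shuffle and split into 70% train and 30% held-out test."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def bootstrap_accuracy_ci(correct_flags, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for classification accuracy.

    correct_flags holds 1 for each correctly classified held-out session
    and 0 otherwise. Returns (accuracy, lower, upper).
    """
    rng = random.Random(seed)
    n = len(correct_flags)
    # Resample the held-out outcomes with replacement and recompute accuracy.
    resampled = sorted(
        sum(rng.choices(correct_flags, k=n)) / n for _ in range(n_resamples)
    )
    lower = resampled[int((alpha / 2) * n_resamples)]
    upper = resampled[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(correct_flags) / n, lower, upper
```

Reporting the interval alongside the point estimate is what would let a reader judge whether a range such as 85 to 95 percent reflects sampling uncertainty or simply different modeling assumptions.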

Circularity Check

1 step flagged

Self-defined CLC produces analytical 85-95% accuracy estimate by construction under internal assumptions

specific steps

self-definitional [Abstract]
"Drawing on large-scale keystroke datasets comprising over 136 million events, we define the Cognitive Load Correlation (CLC) and show it distinguishes genuine composition from mechanical transcription. Our analytical evaluation estimates 85 to 95 percent discrimination accuracy under stated assumptions"

CLC is defined internally from the dataset; the discrimination accuracy is then produced via analytical evaluation under the paper's own stated assumptions. The 85-95% figure is therefore generated from the definition itself rather than from an independent derivation or validation step, rendering the performance claim tautological to the metric's construction.

full rationale

The paper's central claim rests on defining the Cognitive Load Correlation (CLC) from the 136M-event keystroke dataset and then analytically estimating 85-95% discrimination accuracy under stated assumptions. No independent derivation, external benchmark, or unfitted empirical validation against the dataset is shown; the accuracy figure is generated directly from the CLC definition and its assumptions. This matches the self-definitional pattern where the metric and its performance claim reduce to the same internal construction. No self-citations or other load-bearing external references appear in the provided text.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The central claim rests on a newly defined metric (CLC) and domain assumptions about cognitive processes in typing; no external benchmarks or machine-checked proofs are referenced.

free parameters (1)

CLC definition parameters and quantization thresholds
Parameters required to compute the correlation and limit biometric leakage, likely fitted or chosen to achieve stated accuracy.
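
The quantization step is only named, not specified, in the text reviewed here. One plausible reading, offered purely as an assumption, is that raw inter-key intervals are coarsened into a handful of wide bins before any evidence leaves the device, so that fine-grained motor timing is discarded. The bin edges and function name below are hypothetical.

```python
import bisect

# Hypothetical bin edges in milliseconds (assumption, not from the paper).
BIN_EDGES_MS = [100, 200, 400, 800, 1600, 3200]


def quantize_intervals(intervals_ms, edges=BIN_EDGES_MS):
    """Map each inter-key interval to the index of a coarse timing bin.

    Intervals below the first edge map to 0; intervals at or above the
    last edge map to len(edges).
    """
    return [bisect.bisect_right(edges, gap) for gap in intervals_ms]
```

With these edges, intervals of 110 ms and 190 ms fall into the same bin, so two typists who differ only at that scale become indistinguishable, while the difference between a 150 ms keystroke and a 2-second planning pause survives. The choice of edges is the free parameter the ledger refers to: wider bins leak less but may also erase the signal the metric depends on.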

axioms (1)

domain assumption: Keystroke timing patterns reflect distinct cognitive stages of planning, translating, and revising during genuine composition
Invoked to link timing data to human authorship and distinguish from mechanical transcription.

invented entities (1)

Cognitive Load Correlation (CLC) · no independent evidence
purpose: Quantify cognitive signatures in typing behavior for authorship discrimination
Newly introduced metric whose independent evidence is not provided outside the paper's definitions and estimates.


