Whose Agent Are You? Multi-Layer Fingerprinting and Attribution of Autonomous Web Agents

Amir Houmansadr; Dayeon Kang; Hyejun Jeong; Jade Sheffey; Pubali Datta

arxiv: 2606.20910 · v1 · pith:57X4NAHGnew · submitted 2026-06-18 · 💻 cs.CR · cs.AI

Pith reviewed 2026-06-26 16:33 UTC · model grok-4.3

keywords: web agents · fingerprinting · AI agents · attribution · web security · browser behavior · network fingerprinting · decision tree

The pith

Multi-layer fingerprints based on network and browser behavior distinguish AI web agents from humans and legacy crawlers at 97 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that autonomous AI web agents, which pair large language models with browser-level control, produce detectable structural patterns in their TLS and HTTP connections plus their browser interaction sequences. These patterns differ enough across agent frameworks to support reliable attribution, which matters because existing defenses such as robots.txt are routinely ignored and traditional bot detection can be evaded. By logging and classifying traffic from six major agent systems, the authors show that a decision tree model can isolate individual agent architectures while separating them from both human browsing and older crawlers. The approach relies on passive, cross-layer observations rather than active blocking, offering a potential route to enforce content access policies on instrumented domains.

Core claim

The authors demonstrate that AI web agents can be effectively distinguished from humans and traditional crawlers using a multi-layer fingerprint based on both network layer characteristics (e.g., TLS, HTTP) and browser interaction behavior. By analyzing six prominent agent frameworks, they uncover latent structural differences in how these systems assemble HTTP requests, establish TLS/HTTP connections, and execute autonomous browser actions. Feeding these multi-layer features into a decision tree classifier achieves 97 percent accuracy in identification, isolating distinct agent architectures.
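
To make the classification step concrete, here is a minimal sketch of how a decision tree chooses a single split by minimizing weighted Gini impurity. It is not the authors' classifier: the feature names (`tls_ext_count`, `mean_iri_s`), the labels, and the toy values are all invented for illustration, and a full tree would apply the same search recursively to each side of the split.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature, threshold, weighted_gini) for the best single split.

    rows   -- list of dicts mapping feature name -> numeric value
    labels -- list of class labels, same length as rows
    """
    best = None
    n = len(rows)
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best


# Hypothetical sessions: one network-layer and one timing feature per session.
ROWS = [
    {"tls_ext_count": 16, "mean_iri_s": 0.40},
    {"tls_ext_count": 16, "mean_iri_s": 0.55},
    {"tls_ext_count": 11, "mean_iri_s": 0.45},
    {"tls_ext_count": 11, "mean_iri_s": 0.60},
]
LABELS = ["agent_a", "agent_a", "agent_b", "agent_b"]

if __name__ == "__main__":
    print(best_split(ROWS, LABELS))
```

In this toy data the network-layer feature separates the two classes cleanly while the timing feature does not, which is the kind of per-layer contribution a feature-importance report would expose.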

What carries the argument

Multi-layer fingerprint consisting of network-layer details (TLS and HTTP characteristics) combined with browser interaction behavior, processed through a decision tree classifier.
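
As an illustration of what a network-layer feature can look like, the sketch below builds a JA3-style TLS ClientHello fingerprint: the TLS version and the cipher, extension, curve, and point-format lists are serialized in the order the client sent them and then hashed with MD5. This summary does not say whether the paper uses JA3 specifically, so treat it as an assumed stand-in; the ClientHello values in the example are made up.

```python
import hashlib


def is_grease(value):
    """True for the reserved GREASE values 0x0a0a, 0x1a1a, ..., 0xfafa."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Serialize ClientHello fields as 'version,ciphers,extensions,curves,formats'.

    Each list is joined with '-' in the order observed on the wire;
    GREASE values are skipped because clients randomize them per connection.
    """
    fields = [ciphers, extensions, curves, point_formats]
    joined = ["-".join(str(v) for v in field if not is_grease(v)) for field in fields]
    return ",".join([str(version)] + joined)


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 hex digest of the JA3-style string (used as an identifier, not for security)."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Hypothetical ClientHello: TLS 1.2 record version, one GREASE cipher, two real ciphers.
    hello = (771, [0x0A0A, 4865, 4866], [0, 10, 11], [29, 23], [0])
    print(ja3_string(*hello))
    print(ja3_hash(*hello))
```

Because the hash depends on field order as well as content, two clients that offer the same ciphers in a different order get different fingerprints, which is what makes the TLS stack of an agent's underlying browser or HTTP library visible to the server.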

If this is right

Web servers can deploy the logging framework on live domains to attribute incoming agent traffic to specific frameworks.
Content owners gain an evasion-resistant method to enforce access policies against automated scraping.
Different agent architectures become distinguishable even when they share similar high-level goals.
Agent traffic can be separated from human browsing baselines without relying on user-agent strings or robots.txt compliance.
Legacy crawler detection improves when multi-layer signals supplement existing heuristics.
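
One way a site could identify clients without trusting the User-Agent value is to fingerprint how a request is assembled rather than what it claims. The sketch below hashes only the order of header names; the function name, the 16-character truncation, and the sample headers are assumptions made for illustration, not part of the paper's logging framework.

```python
import hashlib


def header_order_signature(headers):
    """Return a short signature of the header-name order of one request.

    headers -- iterable of (name, value) pairs in the order they were received.
    Values are ignored on purpose, so a spoofed User-Agent string does not
    change the signature; only the set and order of names does.
    """
    names = [name.strip().lower() for name, _value in headers]
    digest = hashlib.sha256(",".join(names).encode("utf-8")).hexdigest()
    return digest[:16]


if __name__ == "__main__":
    request = [("Host", "example.org"), ("User-Agent", "anything"), ("Accept", "*/*")]
    print(header_order_signature(request))
```

A signature like this would be one column in a multi-layer feature vector, alongside TLS and behavioral features, rather than a detector on its own.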

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Operators of AI agents may need to introduce deliberate randomization in request patterns to reduce identifiability.
The same cross-layer approach could be applied to other autonomous systems that control browsers or make network calls.
Widespread deployment might shift the arms race toward agents that actively mimic human timing and connection behavior.
Attribution data collected this way could inform policy debates on regulating large-scale automated web access.

Load-bearing premise

The observed structural differences across the six tested agent frameworks stay consistent enough to serve as stable identifiers even if the agents are updated or reconfigured.

What would settle it

Retraining the classifier on traffic from modified or newly released versions of the same agent frameworks and measuring whether accuracy falls below 80 percent would test whether the fingerprints remain reliable.
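
The test described above reduces to scoring predictions on traces from updated agents and comparing the result with the 80 percent floor. A minimal sketch follows; the function names and toy labels are hypothetical, and the per-class breakdown is an added convenience for spotting which framework's fingerprint drifted.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_accuracy(y_true, y_pred):
    """Accuracy computed separately for each true label."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return {label: hits[label] / totals[label] for label in totals}


def fingerprint_holds(y_true, y_pred, floor=0.80):
    """True if overall accuracy on the new traces stays at or above the floor."""
    return accuracy(y_true, y_pred) >= floor


if __name__ == "__main__":
    # Toy labels: framework "b" was updated and is now often mistaken for "a".
    truth = ["a"] * 5 + ["b"] * 5
    preds = ["a"] * 5 + ["b", "b", "a", "a", "a"]
    print(accuracy(truth, preds), per_class_accuracy(truth, preds))
    print(fingerprint_holds(truth, preds))
```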

Figures

Figures reproduced from arXiv: 2606.20910 by Amir Houmansadr, Dayeon Kang, Hyejun Jeong, Jade Sheffey, Pubali Datta.

**Figure 1.** Overview of MARK. Our Multi-layer Agent fingerprinting framewoRK (MARK) consists of four stages: (1) configuring agents with a common web-interaction instruction, (2) collecting raw network and client-side traces as each agent visits URLs and performs controlled UX tasks, (3) extracting TLS, HTTP, and behavioral features from the segmented traces, and (4) using the resulting feature vectors to attribute tr…

**Figure 2.** Comparison of request timing (Inter-Request Intervals and Inter-Event Intervals). A similar tendency between IRI and IEI indicates the pacing strategies of agents. … capture consistent behavioral patterns in page inspection, action selection, timing, and interaction order. We use these repeated traces to support more generalizable comparisons across agents, even for consistent and systemic browsing actions o…

**Figure 3.** Comparison results of the request rate sent by the agent and its Coefficient of Variation. This feature shows the stability and pacing of each agent. … between Claude and Skyvern, which are considered vision-based agents, implies that the agent type does not strongly determine IRI behavior. We also analyze Inter-Event Intervals (IEI), derived from the timing of web component interactions, as illustrated in …

**Figure 4.** Mean and standard deviation values of mouse trajectory length. It implies agent type and strategy for human-mimicking behavior. Assuming that different types of agents affect decision-making and web-browsing strategies, and that these differences would be reflected in web component interaction behavior, we analyze behavioral features from short web component interaction logs at the website end. Interesti…

**Figure 5.** Web component interaction proportion for each web agent on different pages. The event profile proportion suggests a web content exploration strategy of agents, and it is different for agents even though the network packet structures are identical. (S5 webpage is skipped because Claude, Gemini, and Skyvern have no record for it.)

**Figure 6.** Clustered regions of keydowns and mouse movements per session across agents. Agents generally use keyboard controls rather than mouse movements, except for Skyvern, which presents human-like behavior. Finding 7. Skyvern exhibits a uniquely human-like interaction signature. As a vision-based agent, Skyvern generates substantially longer mouse trajectories and the highest mouse-move activity across pages,…

**Figure 7.** Agent identification performance as a function of the number of requests seen at the website end. Protocol-level fingerprints preserve more than 60% agent discrimination after 3 requests are received, and behavioral features replenish the performance after signals are accumulated.

**Figure 8.** Task instruction for autonomous web interaction. It guides the agent to sequentially visit each target URL, inspect the page, perform one natural interaction with synthetic inputs, and immediately proceed to the next page.

**Figure 9.** Scenario 1, version 1

**Figure 13.** Scenario 3

**Figure 14.** Scenario 4

**Figure 15.** Scenario 5
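
The behavioral quantities named in the captions for Figures 2 through 5 are simple to compute from server-side logs. The sketch below shows one plausible set of definitions: intervals between consecutive timestamps (usable for both IRI and IEI), the coefficient of variation, mouse trajectory length, and event-type proportions. The exact definitions used in the paper are not given in this summary, so the choice of population standard deviation and Euclidean path length are assumptions, as are all function names.

```python
import math
import statistics
from collections import Counter


def intervals(timestamps):
    """Gaps between consecutive timestamps (seconds), after sorting."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    if not values:
        raise ValueError("need at least one value")
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / mean


def trajectory_length(points):
    """Total Euclidean length of a mouse path given as (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def event_proportions(events):
    """Share of each event type (e.g. 'click', 'keydown') in one session."""
    counts = Counter(events)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    gaps = intervals([0.0, 1.0, 4.0])           # request arrival times
    print(gaps, coefficient_of_variation(gaps))
    print(trajectory_length([(0, 0), (3, 4)]))  # one straight mouse move
    print(event_proportions(["click", "keydown", "keydown", "scroll"]))
```

A low coefficient of variation indicates machine-regular pacing, while a long trajectory with many mouse-move events is the kind of signal the Figure 6 caption associates with Skyvern.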

Original abstract

As AI web agents proliferate, combining large language models with autonomous, browser-level control, indiscriminate content scraping by web agents has emerged as a privacy and security challenge. Existing defenses, such as robots.txt and active bot-blocking, are insufficient, as they are widely violated and easily circumvented. In this work, we demonstrate that AI web agents can be effectively distinguished from humans and traditional crawlers using a multi-layer fingerprint based on both network layer characteristics (e.g., TLS, HTTP) and browser interaction behavior. We implement this mechanism as a programmatic logging framework that can be deployed on a live, instrumented domain. By analyzing six prominent agent frameworks (AutoGen, Browser Use, Claude, Gemini, Operator, and Skyvern), we uncover latent structural differences in how these systems assemble HTTP requests, establish TLS/HTTP connections, and execute autonomous browser actions. Feeding these multi-layer features into a decision tree classifier, our framework achieves high-fidelity identification (97% accuracy), successfully isolating distinct agent architectures and differentiating agent traffic from both human browsing baselines and legacy crawlers. Our findings demonstrate that cross-layer agent tracking provides a robust, evasion-resistant strategy for content protection and web security policy enforcement.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

Multi-layer fingerprinting distinguishes six AI agent frameworks at 97% accuracy on unmodified runs, but the evaluation skips any reconfiguration or version-change tests.

The main thing to know is that this paper shows concrete differences in how six agent frameworks assemble TLS/HTTP requests and perform browser actions, letting a decision tree reach 97% attribution accuracy while separating them from human traffic and legacy crawlers.

They extend classic fingerprinting to LLM-driven agents by logging both network-layer traits and interaction sequences, then deploy a simple logging framework on an instrumented domain. That produces usable distinctions across AutoGen, Browser Use, Claude, Gemini, Operator, and Skyvern. The practical output—a deployable logger—is the part that could actually help site operators.

The numbers hold for the exact setups they ran. The multi-layer features capture real structural patterns rather than noise, and the separation from baselines is clear.

The soft spot is exactly what the stress-test flags: no trials alter headers, navigation policies, or framework versions. If any of those shift the feature distributions, the 97% figure stops supporting an evasion-resistant claim. Dataset size, collection method, and cross-validation details are also missing from the abstract, so the result is harder to weigh without the full methods section.

This is for people working on web bot detection and content protection who need an attribution tool against AI agents. A reader looking for a concrete starting method will find something to try. The work is grounded enough to deserve referee time, even though the robustness tests are the obvious next step.

I'd send it to peer review and ask for the missing evaluation details plus at least one reconfiguration experiment.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces a multi-layer fingerprinting framework for identifying autonomous AI web agents. It collects features from TLS/HTTP request assembly and browser action sequences across six agent frameworks (AutoGen, Browser Use, Claude, Gemini, Operator, Skyvern), along with human and legacy crawler baselines. These features are fed into a decision tree classifier, which the authors report achieves 97% accuracy in attributing agent traffic and distinguishing it from human browsing and traditional crawlers. The approach is implemented as a logging framework for live domains and positioned as an evasion-resistant defense against indiscriminate scraping.

Significance. If the central performance and stability claims hold after additional validation, the work would supply web operators with a deployable, cross-layer attribution method that addresses limitations of robots.txt and simple bot blockers. The identification of persistent structural differences in agent HTTP/TLS and interaction patterns could support more targeted security policies as LLM-based agents become common.

major comments (3)

[Abstract and Evaluation] Abstract and Evaluation section: The claim of 97% accuracy on six frameworks is presented without any information on dataset size, trace collection methodology, cross-validation procedure, error bars, or statistical significance. This absence makes it impossible to evaluate whether the decision tree result supports the attribution and evasion-resistance conclusions.
[Experimental Evaluation] Experimental Evaluation: All traces are collected from unmodified default instances of the six frameworks. No tests alter configuration parameters, inject custom headers, change navigation policies, or evaluate updated framework releases. Because the central claim requires that the observed multi-layer differences remain reliable identifiers under reconfiguration, the current results do not substantiate the 'evasion-resistant' assertion.
[Methodology] Methodology: The manuscript does not enumerate the precise multi-layer feature set, the extraction code, or any feature-importance ranking from the decision tree. Without these details the reported accuracy cannot be reproduced or attributed to specific layers (TLS vs. browser actions).

minor comments (2)

[Abstract] The abstract lists framework names with inconsistent formatting (e.g., 'Browser Use' versus single-word names); standardize naming throughout.
[Implementation] No reference is given to the programmatic logging framework implementation or to any public artifact that would allow independent verification of the feature collection pipeline.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We agree that additional details on the experimental methodology, dataset, and features are needed to support the claims, and we will revise the manuscript to address these points. Below we respond to each major comment.

Referee: [Abstract and Evaluation] Abstract and Evaluation section: The claim of 97% accuracy on six frameworks is presented without any information on dataset size, trace collection methodology, cross-validation procedure, error bars, or statistical significance. This absence makes it impossible to evaluate whether the decision tree result supports the attribution and evasion-resistance conclusions.

Authors: We agree that the current presentation lacks sufficient experimental details for proper evaluation. In the revised manuscript we will expand the Evaluation section (and update the abstract if space permits) to report the total number of traces collected per framework and baseline, the precise trace collection methodology and environment, the cross-validation procedure (including number of folds), accuracy with error bars or confidence intervals, and any statistical significance tests performed on the 97% result.
Revision: yes

Referee: [Experimental Evaluation] Experimental Evaluation: All traces are collected from unmodified default instances of the six frameworks. No tests alter configuration parameters, inject custom headers, change navigation policies, or evaluate updated framework releases. Because the central claim requires that the observed multi-layer differences remain reliable identifiers under reconfiguration, the current results do not substantiate the 'evasion-resistant' assertion.

Authors: The referee is correct that the experiments used only default configurations. While the multi-layer differences we observed arise from fundamental architectural choices in request assembly and browser control (which are not trivially reconfigurable without breaking core agent functionality), we acknowledge that the evasion-resistance claim would be stronger with explicit tests of modified settings. In revision we will add a dedicated limitations subsection discussing potential evasion vectors and, to the extent feasible with the existing trace collection infrastructure, include preliminary results on a small set of reconfigured instances. We will also tone down the 'evasion-resistant' phrasing to 'resistant to simple evasion under standard usage' where appropriate.
Revision: partial

Referee: [Methodology] Methodology: The manuscript does not enumerate the precise multi-layer feature set, the extraction code, or any feature-importance ranking from the decision tree. Without these details the reported accuracy cannot be reproduced or attributed to specific layers (TLS vs. browser actions).

Authors: We agree that the lack of feature-level detail hinders reproducibility. In the revised version we will add an explicit table or enumerated list of all multi-layer features (grouped by TLS handshake, HTTP request construction, and browser action sequences), describe the extraction logic in sufficient detail for reproduction, and report the feature-importance scores from the trained decision tree so readers can see the relative contribution of each layer.
Revision: yes
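
The error bars promised in the first response could take the form of a Wilson score interval around the reported accuracy. The sketch below is one standard way to compute it; the test-set sizes in the example are hypothetical, since this summary does not report how many traces the 97% figure is based on.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage).

    correct -- number of correctly classified test traces
    total   -- number of test traces
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Same 97% point estimate at two hypothetical test-set sizes.
    print(wilson_interval(97, 100))
    print(wilson_interval(970, 1000))
```

The same point estimate yields a much tighter interval on the larger set, which is why the referee's request for dataset size matters when weighing the headline number.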

Circularity Check

0 steps flagged

No circularity: empirical classification on observed features

The paper reports an empirical ML result: multi-layer features (TLS/HTTP assembly, browser actions) are extracted from traces of six agent frameworks plus baselines, then fed to a decision tree yielding 97% accuracy. No equations, no fitted parameters renamed as predictions, no self-citations invoked for uniqueness theorems or ansatzes, and no reduction of the central claim to its own inputs by construction. The accuracy is a direct performance metric on the collected data distribution; the derivation chain is self-contained standard supervised learning.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Empirical ML classification study; relies on standard assumptions about feature stability and data representativeness rather than new theoretical constructs.

free parameters (1)

decision tree hyperparameters
Parameters of the classifier are fitted to the collected agent and baseline traffic data.

axioms (1)

Domain assumption: Observed differences in TLS/HTTP assembly and browser actions are stable identifiers for the tested agent frameworks
Invoked when claiming the features enable high-accuracy classification.

Reference graph

Works this paper leans on

81 extracted references · 3 canonical work pages

[1]

150+ ai agents statistics: What business leaders are betting on in 2026,

I. Pohrebniyak, “150+ ai agents statistics: What business leaders are betting on in 2026,” https://masterofcode.com/blog/ai-agent-statistics, 2026, [Accessed 02-05-2026]

2026
[2]

Who’s adopting ai agents—and what they’re actually doing with them,

J. Yang, “Who’s adopting ai agents—and what they’re actually doing with them,” https://www.library.hbs.edu/working-knowledge/whos-a dopting-ai-agents-and-what-theyre-actually-doing-with-them, 2026, [Accessed 12-06-2026]

2026
[3]

Robots Exclusion Protocol,

M. Koster, G. Illyes, H. Zeller, and L. Sassman, “Robots Exclusion Protocol,” RFC 9309, Sep. 2022. [Online]. Available: https://www.rfc-editor.org/info/rfc9309

2022
[4]

The odyssey of robots.txt governance: Measuring convention implications of web bots in Large Language Model services,

J. Cui, M. Zha, X. Wang, and X. Liao, “The odyssey of robots.txt governance: Measuring convention implications of web bots in Large Language Model services,” inProceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’25. New York, NY , USA: Association for Computing Machinery, 2025, p. 21–35. [Online]. Available: ht...

work page doi:10.1145/3719027.3765063 2025
[5]

Scrapers selectively respect robots.txt directives: Evidence from a large-scale empirical study,

T. Kim, K. Bock, C. Luo, A. Liswood, C. Poroslay, and E. Wenger, “Scrapers selectively respect robots.txt directives: Evidence from a large-scale empirical study,” inProceedings of the 2025 ACM Internet Measurement Conference, ser. IMC ’25. New York, NY , USA: Association for Computing Machinery, 2025, p. 541–557. [Online]. Available: https://doi.org/10.1...

work page doi:10.1145/3730567.3764471 2025
[6]

Somesite i used to crawl: Awareness, agency and efficacy in protecting content creators from AI crawlers,

E. Liu, E. Luo, S. Shan, G. M. V oelker, B. Y . Zhao, and S. Savage, “Somesite i used to crawl: Awareness, agency and efficacy in protecting content creators from AI crawlers,” inProceedings of the 2025 ACM Internet Measurement Conference, ser. IMC ’25. New York, NY , USA: Association for Computing Machinery, 2025, p. 78–99. [Online]. Available: https://d...

work page doi:10.1145/3730567.3732913 2025
[7]

Perplexity is using stealth, undeclared crawlers to evade website no-crawl directives — blog.cloudflare.com,

G. Corral, V . Singhal, B. Mitchell, and R. Tatoris, “Perplexity is using stealth, undeclared crawlers to evade website no-crawl directives — blog.cloudflare.com,” https://blog.cloudflare.com/perplexity-is-using -stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/, 2025, [Accessed 21-04-2026]

2025
[8]

Trapping misbehaving bots in an AI Labyrinth — blog.cloudflare.com,

R. Tatoris, H. Saxena, and L. Miglietti, “Trapping misbehaving bots in an AI Labyrinth — blog.cloudflare.com,” https://blog.cloudflare.c om/ai-labyrinth/, 2025, [Accessed 02-05-2026]

2025
[9]

Statistical identification of encrypted web browsing traf- fic,

Q. Sun, D. R. Simon, Y .-M. Wang, W. Russell, V . N. Padmanabhan, and L. Qiu, “Statistical identification of encrypted web browsing traf- fic,” inProceedings 2002 IEEE Symposium on Security and Privacy. IEEE, 2002, pp. 19–30

2002
[10]

Website fingerprinting at internet scale

A. Panchenko, F. Lanze, J. Pennekamp, T. Engel, A. Zinnen, M. Henze, and K. Wehrle, “Website fingerprinting at internet scale.” inNDSS, vol. 1, 2016, p. 23477

2016
[11]

k-fingerprinting: A robust scalable web- site fingerprinting technique,

J. Hayes and G. Danezis, “k-fingerprinting: A robust scalable web- site fingerprinting technique,” in25th USENIX Security Symposium (USENIX Security 16), 2016, pp. 1187–1203

2016
[12]

Fp-stalker: Tracking browser fingerprint evolutions,

A. Vastel, P. Laperdrix, W. Rudametkin, and R. Rouvoy, “Fp-stalker: Tracking browser fingerprint evolutions,” in2018 IEEE Symposium on Security and Privacy (SP). IEEE, 2018, pp. 728–741

2018
[13]

How unique is your web browser?

P. Eckersley, “How unique is your web browser?” inInterna- tional Symposium on Privacy Enhancing Technologies Symposium. Springer, 2010, pp. 1–18

2010
[14]

Long-term observation on browser fingerprinting: Users’ trackability and per- spective,

G. Pugliese, C. Riess, F. Gassmann, and Z. Benenson, “Long-term observation on browser fingerprinting: Users’ trackability and per- spective,”Proceedings on Privacy Enhancing Technologies, 2020

2020
[15]

Tracking users on the internet with behavioral patterns: Evaluation of its practical feasibil- ity,

C. Banse, D. Herrmann, and H. Federrath, “Tracking users on the internet with behavioral patterns: Evaluation of its practical feasibil- ity,” inIFIP International Information Security Conference. Springer, 2012, pp. 235–248

2012
[16]

A novel attack to track users based on the behavior patterns,

X. Gu, M. Yang, C. Shi, Z. Ling, and J. Luo, “A novel attack to track users based on the behavior patterns,”Concurrency and Computation: Practice and Experience, vol. 29, no. 6, p. e3891, 2017

2017
[17]

Behavior-based track- ing: Exploiting characteristic patterns in dns traffic,

D. Herrmann, C. Banse, and H. Federrath, “Behavior-based track- ing: Exploiting characteristic patterns in dns traffic,”Computers & Security, vol. 39, pp. 17–33, 2013

2013
[18]

Web user behavioral profiling for user identification,

Y . C. Yang, “Web user behavioral profiling for user identification,” Decision Support Systems, vol. 49, no. 3, pp. 261–271, 2010

2010
[19]

Web page revisitation revisited: implications of a long-term click-stream study of browser usage,

H. Obendorf, H. Weinreich, E. Herder, and M. Mayer, “Web page revisitation revisited: implications of a long-term click-stream study of browser usage,” inProceedings of the SIGCHI conference on Human factors in computing systems, 2007, pp. 597–606

2007
[20]

Browsing unicity: On the limits of anonymizing web tracking data,

C. Deußer, S. Passmann, and T. Strufe, “Browsing unicity: On the limits of anonymizing web tracking data,” in2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020, pp. 777–790

2020
[21]

Fingerprint Launches Automation Intelligence API and AI Assistant Detection, Delivering the Industry’s Most Complete View of AI Traffic,

Fingerprint, “Fingerprint Launches Automation Intelligence API and AI Assistant Detection, Delivering the Industry’s Most Complete View of AI Traffic,” https://www.businesswire.com/news/home/2 0260601158287/en/Fingerprint-Launches-Automation-Intelligenc e-API-and-AI-Assistant-Detection-Delivering-the-Industrys-Most-C omplete-View-of-AI-Traffic, 2026, [Acc...

2026
[22]

Skyvern: Automate browser-based workflows with AI,

Skyvern-AI, “Skyvern: Automate browser-based workflows with AI,” https://github.com/Skyvern-AI/skyvern, 2026, accessed: 2026-05-01

2026
[23]

Browser Use: Enable AI to control your browser,

M. M ¨uller and G. ˇZuniˇc, “Browser Use: Enable AI to control your browser,” https://github.com/browser-use/browser-use, 2024

2024
[24]

AutoGen: Enabling next-gen LLM applications via multi- agent conversations,

Q. Wu, G. Bansal, J. Zhang, Y . Wu, B. Li, E. Zhu, L. Jiang, X. Zhang, S. Zhang, J. Liu, A. H. Awadallah, R. W. White, D. Burger, and C. Wang, “AutoGen: Enabling next-gen LLM applications via multi- agent conversations,” inCOLM, 2024

2024
[25]

Introducing Operator,

OpenAI, “Introducing Operator,” https://openai.com/index/introduci ng-operator/, Jan. 2025, accessed: 2026-05-01

2025
[26]

Computer use tool,

Anthropic, “Computer use tool,” https://platform.claude.com/docs/e n/agents-and-tools/tool-use/computer-use-tool, 2026, claude API documentation. Accessed: 2026-05-01

2026
[27]

Computer use,

Google, “Computer use,” https://ai.google.dev/gemini-api/docs/comp uter-use, 2026, gemini API documentation. Last updated: 2026-04-

2026
[28]

Accessed: 2026-05-01

2026
[29]

Apache Nutch,

A. N. P. M. Committee, “Apache Nutch,” https://nutch.apache.org/, [Accessed 11-06-2026]

2026
[30]

Heritrix,

internetarchive/heritrix3, “Heritrix,” https://github.com/internetarchi ve/heritrix3, [Accessed 11-06-2026]

2026
[31]

Scrapy — open source web scraping framework for Python,

Scrapy, “Scrapy — open source web scraping framework for Python,” https://www.scrapy.org/, [Accessed 11-06-2026]

2026
[32]

Breaking agent backbones: Evaluating the security of backbone LLMs in AI agents,

J. Bazinska, M. Mathys, F. Casucci, M. Rojas-Carulla, X. Davies, A. Souly, and N. Pfister, “Breaking agent backbones: Evaluating the security of backbone LLMs in AI agents,” inThe Fourteenth International Conference on Learning Representations, 2026

2026
[33]

ReAct: Synergizing reasoning and acting in language models,

S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y . Cao, “ReAct: Synergizing reasoning and acting in language models,” in ICLR, 2023

2023
[34]

V oyager: An open-ended embodied agent with Large Language Models,

G. Wang, Y . Xie, Y . Jiang, A. Mandlekar, C. Xiao, Y . Zhu, L. Fan, and A. Anandkumar, “V oyager: An open-ended embodied agent with Large Language Models,”Transactions on Machine Learning Research, 2024. [Online]. Available: https://openreview.net/forum?i d=ehfRiF0R3a

2024
[35]

OmniTool: Computer use with OmniParser,

Microsoft, “OmniTool: Computer use with OmniParser,” https://gith ub.com/microsoft/OmniParser/blob/master/omnitool/readme.md, 2025, accessed: 2026-05-01

2025
[36]

SWE-agent: Agent-computer interfaces enable automated software engineering,

J. Yang, C. E. Jimenez, A. Wettig, K. Lieret, S. Yao, K. R. Narasimhan, and O. Press, “SWE-agent: Agent-computer interfaces enable automated software engineering,” inNeurIPS, 2024, pp. 50 528–50 652. [Online]. Available: https://arxiv.org/abs/2405.15793

Pith/arXiv arXiv 2024
[37]

Build software with AI agents,

Cursor, “Build software with AI agents,” https://cursor.com/product, 2026, accessed: 2026-05-01

2026
[38]

Accessed: 2026-05-01

OpenAI, “Codex,” https://developers.openai.com/codex, 2026, openAI Developers documentation. Accessed: 2026-05-01

2026
[39]

Claude Code overview,

Anthropic, “Claude Code overview,” https://code.claude.com/docs/en/ overview, 2026, claude Code documentation. Accessed: 2026-05-01

2026
[40]

Introducing devin, the first AI software engineer,

Cognition, “Introducing devin, the first AI software engineer,” https: //cognition.ai/blog/introducing-devin, Mar. 2024, accessed: 2026-05- 01

2024
[41]

GPT researcher,

A. Elovic, “GPT researcher,” code repository: https://github.com/ass afelovic/gpt-researcher. [Online]. Available: https://gptr.dev
[42]

Introducing Deep Research,

OpenAI, “Introducing Deep Research,” https://openai.com/index/int roducing-deep-research/, 2025

2025
[43]

Gemini Deep Research — your personal research assistant,

Google, “Gemini Deep Research — your personal research assistant,” https://gemini.google/overview/deep-research/, 2025, accessed: 2025- 07-16

2025
[44]

Are AI agents interacting with online ads?

A. St ¨ockl and J. Nitu, “Are AI agents interacting with online ads?” arXiv preprint arXiv:2504.07112, 2025

arXiv 2025
[45]

How Skyvern reads and understands the web,

S. Singh, “How Skyvern reads and understands the web,” https://ww w.skyvern.com/blog/how-skyvern-reads-and-understands-the-web/, Jul. 2025, accessed: 2026-05-01

2025
[46]

Computer-using agent,

OpenAI, “Computer-using agent,” https://openai.com/index/compute r-using-agent/, Jan. 2025, accessed: 2026-05-01

2025
[47]

HTTPS traffic anal- ysis and client identification using passive SSL/TLS fingerprinting,

M. Hus ´ak, M. ˇCerm´ak, T. Jirs´ık, and P. ˇCeleda, “HTTPS traffic anal- ysis and client identification using passive SSL/TLS fingerprinting,” EURASIP Journal on Information Security, vol. 2016, p. 6, 2016

2016
[48]

TLS fingerprinting with JA3 and JA3S,

J. Althouse, J. Atkinson, and J. Atkins, “TLS fingerprinting with JA3 and JA3S,” Salesforce Engineering Blog, Jan. 2019, https://engineer ing.salesforce.com/tls-fingerprinting-with-ja3-and-ja3s-247362855 967/

2019
[49]

JA4+ network fingerprinting,

J. Althouse, “JA4+ network fingerprinting,” FoxIO Blog, Sep. 2023, https://foxio.io/blog/ja4-network-fingerprinting

2023
[50]

The use of TLS in censorship circum- vention,

S. Frolov and E. Wustrow, “The use of TLS in censorship circum- vention,” inProceedings of the 26th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, 2019

2019
[51]

TLS beyond the browser: Combin- ing end host and network data to understand application behavior,

B. Anderson and D. A. McGrew, “TLS beyond the browser: Combin- ing end host and network data to understand application behavior,” inProceedings of the Internet Measurement Conference (IMC ’19). ACM, 2019, pp. 379–392

2019
[52]

Passive fingerprinting of HTTP/2 clients,

O. Segal, A. Fridman, and E. Shuster, “Passive fingerprinting of HTTP/2 clients,” Akamai Technologies White Paper, 2017, presented at Black Hat Europe 2017. https://www.blackhat.com/docs/eu-17/materials/eu-17-Shuster-Passive-Fingerprinting-Of-HTTP2-Clients-wp.pdf

2017
[53]

Good bot, bad bot: Characterizing automated browsing activity,

X. Li, B. Amin Azad, A. Rahmati, and N. Nikiforakis, “Good bot, bad bot: Characterizing automated browsing activity,” in 2021 IEEE Symposium on Security and Privacy (SP). IEEE, 2021, pp. 1589–1605

2021
[54]

When handshakes tell the truth: Detecting web bad bots via TLS fingerprints,

G. Jarad and K. Bicakci, “When handshakes tell the truth: Detecting web bad bots via TLS fingerprints,” 2026. [Online]. Available: https://arxiv.org/abs/2602.09606

arXiv 2026
[55]

Exposing LLM user privacy via traffic fingerprint analysis: A study of privacy risks in LLM agent interactions,

Y. Zhang, X. Deng, Z. Gu, Y. Chen, K. Xu, Q. Li, and J. Wu, “Exposing LLM user privacy via traffic fingerprint analysis: A study of privacy risks in LLM agent interactions,” 2025. [Online]. Available: https://arxiv.org/abs/2510.07176

arXiv 2025
[56]

Tracked without a trace: linking sessions of users by unsupervised learning of patterns in their dns traffic,

M. Kirchler, D. Herrmann, J. Lindemann, and M. Kloft, “Tracked without a trace: linking sessions of users by unsupervised learning of patterns in their dns traffic,” in Proceedings of the 2016 ACM workshop on artificial intelligence and security, 2016, pp. 23–34

2016
[57]

Users’ fingerprinting techniques from tcp traffic,

L. Vassio, D. Giordano, M. Trevisan, M. Mellia, and A. P. C. da Silva, “Users’ fingerprinting techniques from tcp traffic,” in Proceedings of the Workshop on Big Data Analytics and Machine Learning for Data Communication Networks, 2017, pp. 49–54

2017
[58]

Rethinking fingerprinting: An assessment of behavior-based methods at scale and implications for web tracking,

K. Crichton, L. F. Cranor, and N. Christin, “Rethinking fingerprinting: An assessment of behavior-based methods at scale and implications for web tracking,” Proceedings on Privacy Enhancing Technologies, 2025

2025
[59]

Fp-agent: Fingerprinting ai browsing agents,

E. Wang, Z. Shafiq, and Y. Vekaria, “Fp-agent: Fingerprinting ai browsing agents,” 2026. [Online]. Available: https://arxiv.org/abs/2605.01247

2026
[60]

About Let’s Encrypt,

Let’s Encrypt, “About Let’s Encrypt,” https://letsencrypt.org/about/, 2021, [Accessed 11-06-2026]

2021
[61]

Emergence WebVoyager: Toward consistent and transparent evaluation of (Web) Agents in the wild,

D. Akkil, M. Allaham, A. Raj, T. Abuelsaad, and R. Kokku, “Emergence WebVoyager: Toward consistent and transparent evaluation of (Web) Agents in the wild,” arXiv preprint arXiv:2603.29020, 2026

arXiv 2026
[62]

What Is Claude Code Computer Use? How to Control Your Desktop with AI — mindstudio.ai,

M. Team, “What Is Claude Code Computer Use? How to Control Your Desktop with AI — mindstudio.ai,” https://www.mindstudio.ai/blog/what-is-claude-code-computer-use, 2026, [Accessed 12-06-2026]

2026
[63]

How Gemini 2.5 Computer Use Lets AI Control Web Interfaces (Safely and Smartly),

SiderAI, “How Gemini 2.5 Computer Use Lets AI Control Web Interfaces (Safely and Smartly),” https://sider.ai/blog/ai-tools/how-gemini-2 5-computer-use-lets-ai-control-web-interfaces-safely-and-smartly, 2025, [Accessed 12-06-2026]

2025
[64]

Introducing Operator,

OpenAI, “Introducing Operator,” https://openai.com/index/introducing-operator/, Jan. 2025, [Accessed 11-06-2026]

2025
[65]

Openai operator explained: How ai agents actually control the web,

I. Raman, “Openai operator explained: How ai agents actually control the web,” https://anchorbrowser.io/blog/how-openai-operator-works-with-ai-agents, [Accessed 12-06-2026]

2026
[66]

A new path for kyber on the web,

D. Adrian, D. Benjamin, B. Beck, and D. O’Brien, “A new path for kyber on the web,” https://security.googleblog.com/2024/09/a-new-path-for-kyber-on-web.html, 2024, [Accessed 12-06-2026]

2024
[67]

Hypertext Transfer Protocol Version 2 (HTTP/2),

M. Belshe, R. Peon, and M. Thomson, “Hypertext Transfer Protocol Version 2 (HTTP/2),” RFC 7540, May 2015, obsoleted by RFC 9113. [Accessed 27-05-2026]. [Online]. Available: https://www.rfc-editor.org/info/rfc7540

2015
[68]

Extensible Prioritization Scheme for HTTP,

K. Oku and L. Pardue, “Extensible Prioritization Scheme for HTTP,” RFC 9218, Jun. 2022, [Accessed 27-05-2026]. [Online]. Available: https://www.rfc-editor.org/info/rfc9218

2022

Appendix A. Measurement Setup for Agents

Figure 8 shows the task execution prompt for fingerprint measurement of web agents as discussed in Section 4.1.

Appendix B. User Behavior Testbe...
- https://<testbed domain>/subscribe-v1.html
- https://<testbed domain>/subscribe-v2.html
- https://<testbed domain>/subscribe-v3.html
- https://<testbed domain>/s2-scroll-gate.html
- https://<testbed domain>/s3-hover-reveal.html
- https://<testbed domain>/s4-dom-mismatch.html
- https://<testbed domain>/s5-delayed-feedback.html

For each target URL:

1. Navigate directly to the exact URL.
2. Once the page loads, inspect the page and decide what a reasonable user would naturally do.
3. Perform one short natural interaction sequence.
   - A sequence may contain multiple low-level browser actions if they naturally belong together.
   - For example, typing into a field and pressing a nearby submit button may count as one sequence.
   - Choose synthetic input values yourself.
   - Do not use real personal information.
   - Do not ask the user what value to enter.
4. After you perform any meaningful page interaction sequence, the current URL is considered complete. Immediately navigate directly to the next target URL.
5. If you are unsure what to do next, do not ask the user. Instead, navigate directly to the next unvisited target URL.