pith. machine review for the scientific record.

arxiv: 2605.11606 · v1 · submitted 2026-05-12 · 💻 cs.CR · cs.NI

Recognition: no theorem link

Convolutional-Neural-Networks for Deanonymisation of I2P Traffic

Dieter Arnold, Konrad Baechler, Luca Rohrer

Pith reviewed 2026-05-13 01:28 UTC · model grok-4.3

keywords I2P network · deanonymization · convolutional neural networks · traffic analysis · anonymity · mix networks · passive attacks · machine learning

The pith

Machine learning models fail to deanonymize I2P services from encrypted traffic patterns.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether convolutional neural networks can identify specific services or users inside the I2P anonymity network by examining only the observable flow of encrypted packets. Researchers first built a controlled laboratory setup that generates synthetic I2P traffic to create labeled training data, then trained CNN models on timing, size, and direction features of those flows. They supplemented the experiments with a theoretical analysis that applies Fano's inequality to bound how much identifying information can leak through any mix network of this type. When the same models were run on both the synthetic data and separate real-world I2P traces, classification accuracy stayed too low to achieve reliable deanonymization. The outcome indicates that I2P's existing protections continue to hold against this class of passive machine-learning attack.
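As a rough illustration of what "timing, size, and direction features" can look like as CNN input, the sketch below turns one flow into three fixed-length channels. It is a minimal sketch under stated assumptions, not the paper's preprocessing: the `Packet` layout, the `encode_flow` helper, the sequence length, and the 1500-byte size cap are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# Hypothetical packet record: (timestamp in seconds, size in bytes, direction),
# where direction is +1 for outgoing and -1 for incoming packets.
Packet = Tuple[float, int, int]


def encode_flow(packets: List[Packet], length: int = 8,
                max_size: int = 1500) -> List[List[float]]:
    """Turn a flow into three equal-length channels: inter-arrival time,
    normalised size, and direction. Short flows are zero-padded and long
    flows are truncated, so every flow has the same shape."""
    gaps, sizes, dirs = [], [], []
    previous_time = packets[0][0] if packets else 0.0
    for timestamp, size, direction in packets[:length]:
        gaps.append(timestamp - previous_time)          # time since last packet
        sizes.append(min(size, max_size) / max_size)    # size scaled to [0, 1]
        dirs.append(float(direction))
        previous_time = timestamp
    padding = [0.0] * (length - len(gaps))
    return [gaps + padding, sizes + padding, dirs + padding]


# Made-up three-packet flow, padded to four positions.
flow = [(0.00, 1500, +1), (0.25, 750, -1), (0.75, 300, +1)]
channels = encode_flow(flow, length=4)
print(channels)
```

A fixed shape of this kind is what lets a one-dimensional convolution slide over the packet sequence; whether the paper pads, truncates, or normalises in the same way is not stated on this page.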

Core claim

The authors show that convolutional neural networks applied to passive I2P traffic metadata cannot reliably map flows back to individual services or destinations. A laboratory environment supplied synthetic traffic for model training, while Fano's inequality supplied a theoretical upper bound on the information that can be extracted from anonymous mix-network transmissions. Real-world validation runs confirmed that the trained models produced no practically useful identifications, leaving the network's anonymity guarantees intact.

What carries the argument

Convolutional neural networks trained on packet timing, size, and direction sequences from I2P connections, together with Fano's inequality applied to quantify the maximum information leakage possible in mix networks.
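For orientation, the textbook form of Fano's inequality is given below. This is the standard statement, not the paper's own derivation, and the mapping of symbols to the I2P setting is an assumption made here: $X$ is the service identity drawn from $M$ candidates, $Y$ is the observed traffic, and $P_e$ is the probability that any estimator of $X$ built from $Y$ (for example a CNN) is wrong.

```latex
H(X \mid Y) \;\le\; H_b(P_e) + P_e \log_2 (M - 1),
\qquad
H_b(p) = -p \log_2 p - (1 - p) \log_2 (1 - p).
```

Read in this direction, a large residual uncertainty $H(X \mid Y)$ about the service after observing the traffic forces a large error probability on every classifier, regardless of its architecture.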

If this is right

  • I2P users retain protection against passive traffic-analysis attacks that rely on current deep-learning classifiers.
  • Other anonymity systems built on similar mixing principles are likely to resist the same style of CNN-based identification.
  • Security assessments of anonymous networks should routinely include CNN experiments on synthetic traffic to check for hidden patterns.
  • Fano's inequality offers a reusable theoretical tool for placing quantitative limits on leakage in any mix-network design.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Designers of future anonymity overlays may need to monitor advances in traffic-classification models and adjust mixing parameters accordingly.
  • Similar CNN resistance tests could be applied to Tor or other low-latency anonymity systems to compare their relative robustness.
  • Real deployments would benefit from ongoing collection of fresh traffic traces to keep synthetic training distributions aligned with live usage.

Load-bearing premise

The synthetic traffic created in the laboratory captures the statistical features that real I2P users produce, and the CNNs are able to detect any distinguishing signals that encryption has left exposed.

What would settle it

A demonstration that the same CNN architecture, trained on fresh synthetic data, achieves high-accuracy identification of specific hidden services when tested on independent real-world I2P traffic traces would falsify the result.
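One way to make "high-accuracy" checkable is to ask how likely an observed hit count would be under pure guessing. The sketch below computes an exact binomial tail for that purpose. It is an illustrative check rather than anything the paper reports: the `tail_probability` helper and the example numbers are hypothetical, and uniform guessing over equally likely services is an assumption.

```python
from math import comb


def tail_probability(correct: int, total: int, classes: int) -> float:
    """Probability of at least `correct` hits out of `total` flows when every
    guess is drawn uniformly at random from `classes` labels."""
    p = 1.0 / classes
    return sum(comb(total, k) * p**k * (1 - p)**(total - k)
               for k in range(correct, total + 1))


# Made-up example: 10 flows, 4 candidate services.
print(tail_probability(3, 10, 4))   # a hit count that guessing produces often
print(tail_probability(8, 10, 4))   # a hit count that guessing rarely produces
```

A small tail probability only rules out chance; it does not by itself show that the identifications are practically useful.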

Figures

Figures reproduced from arXiv: 2605.11606 by Dieter Arnold, Konrad Baechler, Luca Rohrer.

Figure 1: Schema of a naive mix-net and the use of onion routing.
Figure 2: Schema of Inbound Tunnel creation and the process of anonymous writing to the network database using an …
Figure 3: Experimental framework. Laboratory setting: the laboratory environment is isolated from the real, public network and used for generating synthetic network traffic. Multiple I2P nodes, each within its own Docker container, run on a single host.
Figure 4: Timing diagram from the sender 10.8.0.2 to the first node of the receiver’s inbound tunnel 10.8.0.11.
Figure 5: Visualization of I2P network traffic based on all recorded Wireshark data.
Figure 6: Outgoing tunnels from the sender.
Figure 7: Number of packets per port for TCP and UDP to node 10.8.0.11.
Figure 8: Top 15 TCP and UDP packet sizes to node 10.8.0.11.
Figure 9: Entropy of the payload of selected packets.
Figure 10: Estimated number of I2P nodes over time worldwide.
Figure 11: Requests from I2P nodes received over time by one of the few reseed servers in the I2P network. A clear 24-hour fluctuation pattern is visible, likely caused by I2P nodes in the U.S. time zone regularly leaving and rejoining the network.
Figure 12: Elbow Method for Determining the Optimal Number of Clusters.
Figure 13: Architecture of the CNN.
Original abstract

This study investigates the potential for deanonymizing services within the Invisible Internet Project (I2P) network through passive traffic analysis and machine learning techniques. The primary objective is to identify distinctive patterns in I2P traffic despite the encryption of its payload. To achieve this, a controlled laboratory environment was established to generate synthetic I2P traffic, providing a training dataset for machine learning models. Furthermore, Fano's inequality is employed to perform a theoretical analysis of anonymous data transmission in mix networks such as I2P, thereby supporting a data-driven approach to uncover causal relationships. In computer experiments, advanced deep learning methods - particularly Convolutional Neural Networks - are applied within the laboratory I2P network, and their effectiveness is further evaluated using real-world traffic data. The results indicate that the proposed methodologies do not compromise the anonymity guarantees of the I2P network.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The manuscript investigates deanonymization of I2P services via passive traffic analysis using convolutional neural networks. It generates synthetic I2P traffic in a controlled lab for model training, invokes Fano's inequality for a theoretical bound on anonymity in mix networks, applies CNNs to both synthetic and real-world I2P traffic, and concludes that the proposed methods do not compromise I2P anonymity guarantees.

Significance. If the central empirical claim holds after addressing ground-truth issues, the work would supply concrete evidence that I2P traffic remains resistant to CNN-based passive attacks, even when models are trained on synthetic data and tested on live captures. This would be a useful data point for the anonymous-communication literature, particularly as deep-learning traffic analysis becomes more common.

major comments (3)
  1. [Experiments section] Real-world evaluation (Experiments section): The claim that CNN performance on real-world I2P traffic demonstrates no effective deanonymization assumes access to independently verifiable ground-truth labels (service identity, origin, or destination). Obtaining such labels in a live anonymous network without active injection or external side-channel knowledge contradicts the anonymity properties under test; if labels are synthetic, self-generated, or proxy-based, the measured accuracy cannot be interpreted as evidence that the attack fails in practice.
  2. [§3] Theoretical analysis (Abstract and §3): Fano's inequality is invoked to support a data-driven approach to anonymity, yet no specific derivation, application to I2P traffic features, or equation linking the bound to the CNN feature space is provided. Without this, it is unclear whether the inequality supplies an independent limit or merely restates the empirical observation in general terms.
  3. [Methodology and Experiments sections] Synthetic-to-real generalization (Methodology and Experiments sections): The central no-compromise conclusion rests on models trained exclusively on laboratory synthetic traffic being evaluated on real-world captures. No quantitative validation (e.g., statistical comparison of flow statistics, packet-size distributions, or timing features) is reported to confirm that the synthetic data distribution matches live I2P usage, leaving open the possibility that the models simply fail to capture real distinguishing features.
minor comments (3)
  1. [Abstract] Abstract: No numerical results (accuracy, precision, recall, dataset sizes, or error bars) are stated, preventing even a preliminary assessment of the strength of the 'no compromise' conclusion.
  2. The manuscript should supply the CNN architecture diagram, layer dimensions, hyperparameter choices, and training/validation split details to support reproducibility.
  3. A dedicated limitations subsection discussing the gap between lab-generated and live I2P traffic would strengthen the presentation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below with point-by-point responses and indicate where revisions will be made to improve clarity and rigor.

  1. Referee: [Experiments section] Real-world evaluation (Experiments section): The claim that CNN performance on real-world I2P traffic demonstrates no effective deanonymization assumes access to independently verifiable ground-truth labels (service identity, origin, or destination). Obtaining such labels in a live anonymous network without active injection or external side-channel knowledge contradicts the anonymity properties under test; if labels are synthetic, self-generated, or proxy-based, the measured accuracy cannot be interpreted as evidence that the attack fails in practice.

    Authors: We agree that verifiable ground-truth labels are critical for valid interpretation. In our real-world evaluation, the traffic was generated by services that we operated and controlled within the live I2P network; this allowed direct, self-verifiable labeling of flows by service identity without requiring side-channel information from other users. The captured traffic is genuine I2P traffic subject to the network's mixing and encryption. We will revise the Experiments section to explicitly describe this controlled-operator setup and discuss its scope and limitations relative to fully blind real-world attacks. revision: yes

  2. Referee: [§3] Theoretical analysis (Abstract and §3): Fano's inequality is invoked to support a data-driven approach to anonymity, yet no specific derivation, application to I2P traffic features, or equation linking the bound to the CNN feature space is provided. Without this, it is unclear whether the inequality supplies an independent limit or merely restates the empirical observation in general terms.

    Authors: We accept that the current presentation of the theoretical analysis lacks sufficient detail. In the revised manuscript we will expand §3 with an explicit derivation of Fano's inequality applied to the anonymity setting of mix networks, including the relevant equations that relate the CNN classification error probability to the conditional entropy of service identity given the observed traffic features. This will clarify the connection between the theoretical bound and the empirical CNN results. revision: yes

  3. Referee: [Methodology and Experiments sections] Synthetic-to-real generalization (Methodology and Experiments sections): The central no-compromise conclusion rests on models trained exclusively on laboratory synthetic traffic being evaluated on real-world captures. No quantitative validation (e.g., statistical comparison of flow statistics, packet-size distributions, or timing features) is reported to confirm that the synthetic data distribution matches live I2P usage, leaving open the possibility that the models simply fail to capture real distinguishing features.

    Authors: We acknowledge the absence of quantitative distribution matching in the current version. We will add a dedicated subsection (or appendix) that reports statistical comparisons between the synthetic and real-world datasets, including packet-size histograms, inter-arrival time distributions, flow duration statistics, and results from appropriate tests such as the Kolmogorov-Smirnov test. These additions will either support the generalization claim or allow us to qualify the conclusions accordingly. revision: yes
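The distribution comparison promised in the third response can be sketched with a two-sample Kolmogorov-Smirnov statistic. The code below is a minimal pure-Python sketch, not the authors' analysis: the `ks_statistic` helper and the packet sizes are made up for illustration, and it returns only the statistic, not a p-value (a library routine such as `scipy.stats.ks_2samp` reports both).

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    between the empirical CDFs of the two samples."""
    sorted_a, sorted_b = sorted(a), sorted(b)
    largest_gap = 0.0
    # The gap between two step functions can only change at a sample point.
    for x in sorted_a + sorted_b:
        cdf_a = bisect_right(sorted_a, x) / len(sorted_a)
        cdf_b = bisect_right(sorted_b, x) / len(sorted_b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Made-up packet sizes in bytes, for illustration only.
synthetic_sizes = [100, 200, 300, 400]
real_sizes = [100, 200, 300, 400]
shifted_sizes = [500, 600, 700, 800]

same = ks_statistic(synthetic_sizes, real_sizes)      # identical samples
apart = ks_statistic(synthetic_sizes, shifted_sizes)  # no overlap at all
print(same, apart)
```

A statistic near 0 means the two empirical distributions are nearly indistinguishable on that feature, and a statistic near 1 means they barely overlap; the same function applies unchanged to inter-arrival times or flow durations.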

Circularity Check

0 steps flagged

No circularity: empirical CNN evaluation and Fano bound are independent of inputs


The paper trains CNN classifiers on laboratory-generated synthetic I2P flows and reports performance on separate real-world captures, then invokes Fano's inequality as an external information-theoretic bound on mix-network anonymity. No equation, parameter fit, or self-citation is shown to redefine the target quantity (deanonymization success) in terms of the training labels or model outputs. The central claim therefore does not reduce to its own inputs by construction and remains falsifiable against external traffic traces.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that lab-generated traffic matches real I2P behavior and that standard ML generalization holds; Fano's inequality is treated as a standard information-theoretic bound.

axioms (1)
  • [standard math] Fano's inequality provides a valid upper bound on information leakage in mix networks such as I2P
    Invoked to support theoretical analysis of anonymous data transmission.
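To show how this axiom turns into a number, the sketch below evaluates the commonly used weakened form of Fano's inequality, P_e >= (H(X|Y) - 1) / log2(M). It is a worked arithmetic example under stated assumptions, not a result from the paper: X is the service identity among M candidates, Y is the observed traffic, and the `fano_error_floor` helper and the input values are hypothetical.

```python
from math import log2


def fano_error_floor(conditional_entropy_bits: float, classes: int) -> float:
    """Weakened Fano bound on the error probability of any classifier:
    P_e >= (H(X|Y) - 1) / log2(M), clipped at 0 when the bound is vacuous."""
    if classes < 2:
        raise ValueError("need at least two candidate services")
    return max(0.0, (conditional_entropy_bits - 1.0) / log2(classes))


# Made-up numbers: 1024 candidate services, 9 bits of uncertainty left
# about the service after observing the traffic.
print(fano_error_floor(9.0, 1024))
# With little remaining uncertainty the bound says nothing useful.
print(fano_error_floor(0.5, 16))
```

The floor applies to every estimator, so it is independent of the CNN architecture; the hard part in practice is estimating H(X|Y) for real traffic, which this sketch takes as given.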

pith-pipeline@v0.9.0 · 5444 in / 1021 out tokens · 42572 ms · 2026-05-13T01:28:59.898820+00:00 · methodology

