pith. machine review for the scientific record.

arxiv: 2605.14435 · v1 · submitted 2026-05-14 · 💻 cs.NI · cs.CR

Recognition: 2 theorem links · Lean Theorem

Geographic Patterns in I2P Peer Selection: An Empirical Network Topology Analysis

Pith reviewed 2026-05-15 01:39 UTC · model grok-4.3

classification 💻 cs.NI cs.CR
keywords I2P · peer selection · geographic homophily · network topology · assortativity · anonymity networks · routing topology · permutation testing

The pith

I2P peer selection produces random geographic mixing with no significant country clustering.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The study tests whether geographic location shapes I2P's decentralized routing by measuring homophily in a dataset of 327 routers and 254 connections. Assortativity analysis yields r = 0.017 (p = 0.222), and same-country links appear at 11.1 percent, statistically indistinguishable from the 10.91 percent rate expected under random selection constrained by the /16 subnet rule. Community detection identifies 110 modular groups that align only moderately with geography. The results indicate that I2P's aggregate peer choices create heterogeneous global mixing rather than localized clusters.

Core claim

Empirical assortativity and permutation tests on the SWARM-I2P snapshot establish a network-level absence of significant geographic homophily; observed same-country connections match the null expectation generated by I2P's design rule against multiple peers from the same /16 subnet, and detected communities show only moderate geographic coherence.

What carries the argument

Assortativity coefficient with permutation testing on router geographic labels, used to quantify deviation from random mixing under the /16 subnet constraint.
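
The measurement described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it computes Newman's assortativity coefficient for a categorical label (country) and estimates a one-sided p-value by shuffling labels across routers. The function names, the one-sided test, and the label-shuffling null are assumptions; in particular, this null keeps the graph fixed and does not encode the /16 subnet rule.

```python
import random
from collections import Counter

def attribute_assortativity(edges, label):
    """Newman's assortativity for a categorical node label on an undirected graph."""
    m = len(edges)
    # Observed fraction of edges whose two ends share a label.
    same = sum(1 for u, v in edges if label[u] == label[v]) / m
    # Count edge ends per label to get the fraction a_c of ends in each class.
    ends = Counter()
    for u, v in edges:
        ends[label[u]] += 1
        ends[label[v]] += 1
    expected = sum((n / (2 * m)) ** 2 for n in ends.values())
    if expected == 1.0:  # only one label present: coefficient undefined
        return float("nan")
    return (same - expected) / (1 - expected)

def permutation_p_value(edges, label, n_perm=2000, seed=0):
    """One-sided p-value: how often shuffled labels reach the observed r (assumed test design)."""
    rng = random.Random(seed)
    observed = attribute_assortativity(edges, label)
    nodes = list(label)
    values = [label[n] for n in nodes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        shuffled = dict(zip(nodes, values))
        if attribute_assortativity(edges, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): two same-country triangles joined by one cross-country edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_label = {0: "DE", 1: "DE", 2: "DE", 3: "US", 4: "US", 5: "US"}
r, p = permutation_p_value(toy_edges, toy_label, n_perm=500)
```

A coefficient near zero, as the paper reports, means same-country edges occur about as often as the label frequencies alone would predict.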

If this is right

  • I2P's aggregate peer selection produces highly heterogeneous, random geographical mixing.
  • The observed pattern supplies an empirical baseline for analyzing the performance-anonymity tradeoff in I2P.
  • Community structures exist but remain only moderately aligned with country boundaries.
  • No systematic geographic bias appears at the network level.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Random geographic mixing may reduce the effectiveness of location-targeted de-anonymization attacks.
  • Similar measurements could be repeated on Tor or other overlay networks to compare mixing properties.
  • Average path lengths might increase due to longer typical geographic distances between peers.

Load-bearing premise

The 327-router SWARM-I2P snapshot and its 254 connections are representative of the full I2P network, and the permutation model accurately captures the null distribution of random peer selection.

What would settle it

A full-network crawl showing same-country connection frequency significantly above or below the 10.91 percent random baseline would falsify the absence of homophily.
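
One rough way to run that check on a larger crawl is an exact binomial test of the same-country count against the 10.91 percent baseline. The sketch below is an assumption-laden shortcut: it treats edges as independent draws, which real overlay links are not, so it is a sanity check and not a substitute for the paper's permutation test. The function names and the example counts are hypothetical.

```python
from math import exp, lgamma, log

def binom_log_pmf(i, n, p0):
    """Log-probability of exactly i successes in n independent trials with rate p0 (0 < p0 < 1)."""
    return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p0) + (n - i) * log(1 - p0))

def binomial_two_sided_p(k, n, p0):
    """Exact two-sided p-value: total probability of outcomes no more likely than k."""
    log_obs = binom_log_pmf(k, n, p0)
    total = 0.0
    for i in range(n + 1):
        lp = binom_log_pmf(i, n, p0)
        if lp <= log_obs + 1e-9:  # small tolerance for floating-point ties
            total += exp(lp)
    return min(1.0, total)

# Hypothetical crawl: 2000 edges, 300 of them same-country, tested against the 10.91% baseline.
p_value = binomial_two_sided_p(k=300, n=2000, p0=0.1091)
```

Working in log space keeps the binomial coefficients from overflowing when the edge count reaches the thousands.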

Original abstract

The Invisible Internet Project (I2P) routes data via encrypted, decentralized tunnels. Peer selection can significantly affect security and performance. This empirical study examines whether geographic location systematically influences I2P's routing topology. Consistent with I2P's design principles, which include avoiding multiple peers from the same /16 IP subnet to maximize anonymity, we conducted assortativity analysis, community detection, and permutation testing on data from 327 routers and 254 connections (SWARM-I2P). We found a network-level absence of significant geographic homophily. The assortativity coefficient was r = 0.017 (p = 0.222). Same-country connections (11.1%) are statistically near random expectation (10.91%). Community detection found 110 highly modular groups (Q = 0.972) only moderately aligned geographically (NMI = 0.521). We conclude that aggregate peer selection in I2P leads to a highly heterogeneous, random geographical mixing, providing a foundation for understanding the performance-anonymity tradeoff.
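
The NMI figure in the abstract compares two labelings of the same routers: detected community and country. A minimal sketch of that comparison follows, assuming the normalization 2I/(H_A + H_B); the abstract does not say which normalization the authors used, and the function names here are illustrative.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a list of categorical labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def normalized_mutual_information(part_a, part_b):
    """NMI between two labelings of the same items, normalized as 2I / (H_a + H_b)."""
    n = len(part_a)
    joint = Counter(zip(part_a, part_b))
    count_a, count_b = Counter(part_a), Counter(part_b)
    mi = sum((c / n) * log(c * n / (count_a[a] * count_b[b]))
             for (a, b), c in joint.items())
    h_a, h_b = entropy(part_a), entropy(part_b)
    if h_a + h_b == 0:  # both labelings put everything in one group
        return 1.0
    return 2 * mi / (h_a + h_b)

# Toy data (hypothetical): community ids and country codes for four routers.
communities = [0, 0, 1, 1]
countries = ["DE", "DE", "US", "US"]
score = normalized_mutual_information(communities, countries)
```

A score of 1 means the communities and countries coincide exactly, and 0 means knowing one tells you nothing about the other; the reported 0.521 sits between the two.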

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript presents an empirical network topology analysis of the I2P anonymous network using the SWARM-I2P dataset (327 routers, 254 connections). It applies assortativity analysis, community detection, and permutation testing to test for geographic homophily, finding none: the assortativity coefficient is r = 0.017 (p = 0.222), same-country connections are 11.1% versus a random baseline of 10.91%, and detected communities (Q = 0.972) show only moderate geographic alignment (NMI = 0.521). The authors conclude that aggregate peer selection produces heterogeneous, random geographical mixing consistent with I2P's /16 subnet rule and anonymity goals.

Significance. If the empirical findings hold, the work supplies a concrete, data-driven baseline for the absence of geographic homophily in I2P. This directly informs the performance-anonymity tradeoff in decentralized routing and supplies a reproducible snapshot against which future protocol changes or larger-scale measurements can be compared. The use of a permutation test under the observed degree sequence and /16 constraint is a clear methodological strength.
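
For readers unfamiliar with the Q value quoted in the summary, the sketch below computes Newman-Girvan modularity for a given partition of an undirected, unweighted graph. It only scores a partition; it does not perform community detection, and the function name and toy data are illustrative assumptions.

```python
from collections import Counter

def modularity(edges, community):
    """Modularity Q = sum over communities of (internal edge share - expected share)."""
    m = len(edges)
    internal = Counter()  # edges with both ends inside the same community
    degree = Counter()    # total degree carried by each community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Toy data (hypothetical): two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(toy_edges, toy_community)
```

Q approaches 1 when nearly all edges fall inside communities and each community holds only a small share of the total degree.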

minor comments (2)
  1. [§3, Data Collection] The exact crawling window, router-discovery method, and any filtering applied to obtain the final 327 routers / 254 edges should be stated more explicitly so that replication is unambiguous.
  2. [Figure 2, Table 1] Axis labels and captions should explicitly note that the permutation baseline respects the /16 subnet constraint; without this, the visual comparison to random expectation is harder to interpret.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive evaluation of the manuscript, recognition of its methodological contributions, and recommendation for acceptance. We appreciate the acknowledgment that the permutation test under the observed degree sequence and /16 constraint represents a clear strength.

Circularity Check

0 steps flagged

Purely empirical analysis with no circular derivations or self-referential steps

full rationale

The paper performs direct statistical measurements on the collected SWARM-I2P snapshot (327 routers, 254 edges). Assortativity coefficient r = 0.017 (p = 0.222), same-country edge fraction (11.1% observed vs 10.91% expected), modularity Q = 0.972, and NMI = 0.521 are all computed from the observed adjacency matrix using standard formulas. The permutation test rewires edges while respecting the external I2P /16 subnet rule to generate a null distribution; this is a conventional Monte-Carlo baseline, not a fitted parameter or self-defined quantity. No equations, ansatzes, or uniqueness theorems are invoked that reduce the reported results to the input data by construction. The central claim of absent geographic homophily is therefore an independent empirical finding against an externally specified null, with no load-bearing self-citations or renamings of known patterns.
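
A constrained rewiring null of the kind described here might look like the sketch below. It performs degree-preserving double-edge swaps and rejects any swap that would create a self-loop, a duplicate edge, or an edge between two routers in the same /16. Reading the /16 rule as "no edge joins two routers in the same /16" is an assumption made for illustration; the paper's exact null model is not given on this page, and `subnet16`, `constrained_rewire`, and `ip_of` are hypothetical names.

```python
import random

def subnet16(ip):
    """First two octets of a dotted-quad IPv4 address, e.g. '203.0.113.7' -> '203.0'."""
    return ".".join(ip.split(".")[:2])

def constrained_rewire(edges, ip_of, n_swaps, seed=0, max_tries=100000):
    """Degree-preserving edge swaps on a simple undirected graph, under the assumed /16 rule."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:  # pick one of the two possible re-pairings at random
            c, d = d, c
        new1, new2 = (a, d), (c, b)
        if a == d or c == b:  # would create a self-loop
            continue
        if frozenset(new1) in present or frozenset(new2) in present:  # duplicate edge
            continue
        if (subnet16(ip_of[a]) == subnet16(ip_of[d])
                or subnet16(ip_of[c]) == subnet16(ip_of[b])):  # assumed /16 rule
            continue
        present -= {frozenset(edges[i]), frozenset(edges[j])}
        present |= {frozenset(new1), frozenset(new2)}
        edges[i], edges[j] = new1, new2
        done += 1
    return edges
```

Repeating the rewiring many times and recording the same-country edge fraction of each result gives a Monte Carlo null distribution against which the observed fraction can be compared.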

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The analysis rests on the assumption that the collected snapshot and the permutation null model are valid; no free parameters or new entities are introduced.

axioms (1)
  • domain assumption The permutation test correctly models random peer selection given I2P's /16 subnet avoidance rule.
    Used to generate the 10.91% random expectation for same-country links.
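
For orientation, the unconstrained version of this baseline has a simple closed form. The notation below is introduced here for illustration and is not taken from the paper: $e_{cc}$ is the fraction of edges joining two routers in country $c$, and $a_c$ is the fraction of edge ends attached to routers in country $c$.

```latex
\[
\mathbb{E}\bigl[\text{same-country fraction}\bigr] \approx \sum_c a_c^{2},
\qquad
r = \frac{\sum_c e_{cc} - \sum_c a_c^{2}}{1 - \sum_c a_c^{2}}
\]
```

The approximation holds for degree-preserving random mixing with no further constraints. The paper's 10.91% figure comes from a permutation procedure that also respects the /16 rule, so it need not coincide with $\sum_c a_c^{2}$ computed from the same data.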

pith-pipeline@v0.9.0 · 5494 in / 1175 out tokens · 36194 ms · 2026-05-15T01:39:07.773372+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
