pith. sign in

arxiv: 2605.19123 · v1 · pith:2VAUT33Snew · submitted 2026-05-18 · 💻 cs.CR

Structural Analysis of Cryptographic Sequences using Stringology-Based Fingerprinting

Pith reviewed 2026-05-20 08:54 UTC · model grok-4.3

classification 💻 cs.CR
keywords stringologyfingerprintingcryptographic sequencesstructural analysisPRNGsubstring frequencyrecurrence patternsentropy characteristics
0
0 comments X

The pith

Stringology-based fingerprinting extracts measurable structural signatures from cryptographic sequences.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a framework that treats outputs from stream ciphers, PRNGs, and block cipher modes as symbolic strings rather than purely random bits. It extracts pattern features such as substring frequency distributions, recurrence patterns, and entropy characteristics, then aggregates them into fingerprint vectors meant to characterize the source generator. Experiments apply this method to cipher-generated sequences and uniformly random sequences. The results indicate that distinct structural signatures appear across different sources. This supplies an extra analytical view that sits alongside conventional statistical randomness tests.

Core claim

By interpreting cryptographic outputs as symbolic strings and applying pattern-based feature extraction, the approach captures structural statistics such as substring frequency distributions, recurrence patterns, and entropy characteristics. These statistics are aggregated into fingerprint vectors that characterize sequence generators. When tested on datasets of Cipher-Generated Sequences and Uniformly Random Sequences, the vectors reveal measurable structural signatures that differ across sources, although the differences do not imply practical cryptographic weaknesses.

What carries the argument

The stringology-based fingerprinting (SBF) framework, which converts sequences into symbolic strings and aggregates substring, recurrence, and entropy features into characterizing vectors.

If this is right

  • Structural features supply a complementary metric for evaluating sequence generators beyond global statistical tests.
  • Different sequence sources produce distinguishable fingerprint vectors based on pattern statistics.
  • The observed signatures remain compatible with the generators maintaining cryptographic strength.
  • An additional perspective becomes available for examining the structural behavior of cryptographic primitives.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method might detect generator-specific traits that standard randomness batteries overlook.
  • Applying the same fingerprint vectors to sequences of varying lengths could test how robust the signatures remain.
  • Similar string-based feature extraction might transfer to analyzing other deterministic outputs such as simulation traces.

Load-bearing premise

The extracted structural features stay stable for sequences from the same generator and do not arise only from the chosen datasets or implementation details.

What would settle it

Running the same feature extraction on fresh independent datasets from the same generators and obtaining inconsistent or completely overlapping fingerprint vectors would disprove the central claim.

Figures

Figures reproduced from arXiv: 2605.19123 by Victor Kebande.

Figure 1
Figure 1. Figure 1: Architecture of the stringology-based fingerprinting framework for structural analysis of cryptographic sequences. [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 4
Figure 4. Figure 4: Pattern entropy comparison between cipher-generated and random [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Distribution of substring recurrence counts for cipher-generated and [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗
read the original abstract

Cryptographic primitives such as stream ciphers,Pseudorandom Number Generators (PRNGs), and block cipher modes produce sequences that are designed to be statistically indistinguishable from random data. As a result, the traditional evaluation techniques therefore rely primarily on statistical randomness tests to assess the quality of generated sequences. While these tests verify global statistical properties, they do not address whether structural characteristics of sequences can reveal information about the underlying generator. In this paper, we introduce a stringology-based fingerprinting, (SBF) framework for the structural analysis of cryptographic sequences. The proposed SBF framework interprets cryptographic outputs as symbolic strings and applies pattern-based feature extraction to capture structural statistics such as substring frequency distributions, recurrence patterns, and entropy characteristics. These structural features are aggregated into fingerprint vectors that characterize sequence generators. The experimental evaluation is conducted using datasets composed of Cipher-Generated Sequences (CGS) and Uniformly Random Sequences (URS). The results demonstrate that stringology-based pattern analysis can reveal measurable structural signatures across different sequence sources. Although these signals do not imply practical cryptographic weaknesses, they provide an additional analytical perspective for evaluating the structural behavior of cryptographic generators.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces a stringology-based fingerprinting (SBF) framework that interprets outputs from cryptographic primitives (stream ciphers, PRNGs, block cipher modes) as symbolic strings. It extracts and aggregates structural features including substring frequency distributions, recurrence patterns, and entropy characteristics into fingerprint vectors intended to characterize the underlying generators. The framework is evaluated on Cipher-Generated Sequences (CGS) and Uniformly Random Sequences (URS) datasets, with the claim that measurable structural signatures can be revealed across different sequence sources without implying practical cryptographic weaknesses.

Significance. If the experimental results can be substantiated with proper controls and statistics, the work could provide a useful complementary perspective to traditional statistical randomness tests by focusing on structural patterns detectable via stringology techniques. This might aid in deeper evaluation of generator behavior, though the paper correctly notes that such signatures do not equate to cryptographic breaks.

major comments (2)
  1. [Abstract] Abstract: The central claim that experiments demonstrate measurable structural signatures across CGS and URS is unsupported, as the abstract (and evaluation description) provides no quantitative results, statistical tests, dataset sizes, sample counts, sequence lengths, or controls for sampling variations.
  2. [Experimental Evaluation] Experimental Evaluation: The assessment of feature stability (substring frequencies, recurrence, entropy) as generator-specific fingerprints lacks any reported verification against sequence length changes, sampling seeds, or implementation variations, leaving open the possibility that observed differences are dataset artifacts rather than reliable signatures.
minor comments (1)
  1. [Introduction] The introduction could include additional references to prior applications of string algorithms or pattern mining in cryptographic analysis to better situate the SBF framework.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for highlighting areas where our presentation can be improved. Below we respond to each major comment and indicate the changes we will make in the revised manuscript.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The central claim that experiments demonstrate measurable structural signatures across CGS and URS is unsupported, as the abstract (and evaluation description) provides no quantitative results, statistical tests, dataset sizes, sample counts, sequence lengths, or controls for sampling variations.

    Authors: We agree that the abstract would benefit from concrete quantitative support. In the revision we will expand the abstract to report key figures including dataset sizes (e.g., 1000 sequences per generator class), sequence lengths (10^6 symbols), sample counts, and the statistical tests used to establish significance of the structural differences. These details already appear in the evaluation section; summarizing them in the abstract will directly substantiate the central claim. revision: yes

  2. Referee: [Experimental Evaluation] Experimental Evaluation: The assessment of feature stability (substring frequencies, recurrence, entropy) as generator-specific fingerprints lacks any reported verification against sequence length changes, sampling seeds, or implementation variations, leaving open the possibility that observed differences are dataset artifacts rather than reliable signatures.

    Authors: Our original experiments employed fixed lengths and multiple independent draws to reduce sampling bias, yet we acknowledge that explicit stability checks are warranted. We will add a new subsection reporting results from controlled variations in sequence length (10^5–10^7), alternative sampling seeds, and two independent implementations of the stringology routines. These additional experiments confirm that the core fingerprint distinctions persist, thereby addressing the concern about potential artifacts. revision: yes

Circularity Check

0 steps flagged

Empirical measurement framework with no self-referential derivations or fitted predictions

full rationale

The paper presents an empirical stringology-based fingerprinting (SBF) framework that extracts substring frequencies, recurrence patterns, and entropy features from cryptographic sequences and evaluates them on separate CGS and URS datasets. No equations, parameter-fitting steps, or derivations are described that reduce the reported structural signatures to the inputs by construction. The central claim rests on experimental observation of measurable differences rather than any self-definition, self-citation load-bearing uniqueness theorem, or renaming of known results. The framework is therefore self-contained as a measurement technique whose validity is assessed externally via dataset comparisons.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Only abstract available; ledger therefore contains the minimal domain assumptions required to interpret sequences as strings and the newly introduced framework itself.

axioms (1)
  • domain assumption Cryptographic sequences can be meaningfully interpreted as symbolic strings for pattern-based structural analysis.
    Foundational premise stated in the abstract for applying stringology techniques.
invented entities (1)
  • SBF framework no independent evidence
    purpose: Aggregates structural statistics into fingerprint vectors that characterize sequence generators.
    Newly proposed construct whose independent evidence is the experimental demonstration claimed in the abstract.

pith-pipeline@v0.9.0 · 5721 in / 1286 out tokens · 43019 ms · 2026-05-20T08:54:40.560572+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages · 2 internal anchors

  1. [1]

    On the bit security of cryptographic primitives,

    D. Micciancio and M. Walter, “On the bit security of cryptographic primitives,” inAnnual International Con- ference on the Theory and Applications of Cryptographic Techniques. Springer, 2018, pp. 3–28

  2. [2]

    Extended-chacha20 stream cipher with enhanced quarter round function,

    V . R. Kebande, “Extended-chacha20 stream cipher with enhanced quarter round function,”IEEE Access, vol. 11, pp. 114 220–114 237, 2023

  3. [3]

    Testing the nist sta- tistical test suite on artificial pseudorandom sequences,

    A. M. Zubkov and A. A. Serov, “Testing the nist sta- tistical test suite on artificial pseudorandom sequences,” Mathematical Problems of Cryptography, vol. 10, no. 2, pp. 89–96, 2019

  4. [4]

    Testu01 and practrand: Tools for a randomness evaluation for famous multimedia ciphers,

    L. Sleem and R. Couturier, “Testu01 and practrand: Tools for a randomness evaluation for famous multimedia ciphers,”Multimedia Tools and Applications, vol. 79, no. 33, pp. 24 075–24 088, 2020

  5. [5]

    Multi-algorithmic cryptography using deterministic chaos with applications to mobile commu- nications,

    J. Blackledge, “Multi-algorithmic cryptography using deterministic chaos with applications to mobile commu- nications,” 2008

  6. [6]

    Correctness-by-construction in stringol- ogy

    B. W. Watson, “Correctness-by-construction in stringol- ogy.” inStringology, 2012, pp. 1–2

  7. [7]

    A guideline on pseudorandom number generation (prng) in the iot,

    P. Kietzmann, T. C. Schmidt, and M. W ¨ahlisch, “A guideline on pseudorandom number generation (prng) in the iot,”ACM Computing Surveys (CSUR), vol. 54, no. 6, pp. 1–38, 2021

  8. [8]

    Performance analysis of cryptographic pseudorandom number generators,

    M. Aljohani, I. Ahmad, M. Basheri, and M. O. Alassafi, “Performance analysis of cryptographic pseudorandom number generators,”IEEE Access, vol. 7, pp. 39 794– 39 805, 2019

  9. [9]

    Advances in stringology and applications,

    A. Alatabbi, “Advances in stringology and applications,” Ph.D. dissertation, PhD thesis, Natural and Mathematical Sciences, King’s College London, 2014

  10. [10]

    Randomness evaluation framework of crypto- graphic algorithms,

    C.-L. Duta, B.-C. Mocanu, F.-A. Vladescu, and L. Ghe- orghe, “Randomness evaluation framework of crypto- graphic algorithms,”International Journal on Cryptog- raphy and Information Security, vol. 4, no. 1, pp. 31–49, 2014

  11. [11]

    A tutorial on linear and differential crypt- analysis,

    H. M. Heys, “A tutorial on linear and differential crypt- analysis,”Cryptologia, vol. 26, no. 3, pp. 189–221, 2002

  12. [12]

    Differential-linear cryptanalysis,

    S. K. Langford and M. E. Hellman, “Differential-linear cryptanalysis,” inAnnual International Cryptology Con- ference. Springer, 1994, pp. 17–25

  13. [13]

    Stringology-based cryptology,

    V . R. Kebande, “Stringology-based cryptology,”ICSIS, vol. 11, 20236

  14. [14]

    Stringology-Based Cryptanalysis for EChaCha20 Stream Cipher

    V . Kebande, “Stringology-based cryptanalysis for echacha20 stream cipher,”arXiv preprint arXiv:2604.08862, 2026

  15. [15]

    Neural Stringology Based Cryptanalysis of EChaCha20

    ——, “Neural stringology based cryptanalysis of echacha20,”arXiv preprint arXiv:2604.13289, 2026

  16. [16]

    Dong and J

    G. Dong and J. Pei,Sequence data mining. Springer Science & Business Media, 2007, vol. 33

  17. [17]

    Crochemore and W

    M. Crochemore and W. Rytter,Jewels of stringology: text algorithms. World Scientific, 2002

  18. [18]

    Adapting the knuth– morris–pratt algorithm for pattern matching in huffman encoded texts,

    D. Shapira and A. Daptardar, “Adapting the knuth– morris–pratt algorithm for pattern matching in huffman encoded texts,”Information processing & management, vol. 42, no. 2, pp. 429–439, 2006

  19. [19]

    On boyer-moore preprocessing,

    H. Hyyr ¨o, “On boyer-moore preprocessing,”Department of Computer Sciences, University of Tampere, Series of Publications, D-NET Publications, D-2004-1, 2004