pith. sign in

arxiv: 2209.08166 · v1 · submitted 2022-09-16 · 💻 cs.DS

The trace reconstruction problem for spider graphs

Pith reviewed 2026-05-24 11:16 UTC · model grok-4.3

classification 💻 cs.DS
keywords trace reconstructionspider graphsdeletion channelgraph reconstructionsample complexityrandom deletionsalgorithm
0
0 comments X

The pith

An algorithm reconstructs spider graphs from deletion traces using exp(O((n q^d)^{1/3} / d^{1/3} (log n)^{2/3})) traces for any q.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors study how to reconstruct a spider graph—a root with several legs of equal length d—from multiple traces where each non-root node is deleted independently with probability q. This generalizes the well-known string trace reconstruction problem to a setting with multiple paths from the root. They focus on the case where d is at most log base 1/q of n, so that some legs may be completely deleted in a trace. The main contribution is an algorithm that achieves reconstruction with high probability using a number of traces exponential in O of (n q^d to the 1/3) divided by d to the 1/3 times (log n to the 2/3). This matters because it gives a concrete way to recover structured graphs when data arrives as incomplete paths.

Core claim

We study the trace reconstruction problem for spider graphs. Let n be the number of nodes of a spider and d be the length of each leg, and suppose that we are given independent traces of the spider from a deletion channel in which each non-root node is deleted with probability q. This is a natural generalization of the string trace reconstruction problem in theoretical computer science, which corresponds to the special case where the spider has one leg. In the regime where d ≥ log_{1/q}(n), the problem can be reduced to the vanilla string trace reconstruction problem. We thus study the more interesting regime d ≤ log_{1/q}(n), in which entire legs of the spider are deleted with non-negligent

What carries the argument

The equal-leg spider graph under the uniform deletion channel with probability q, together with an algorithm that groups traces by visible leg patterns before applying string methods.

If this is right

  • When d ≥ log_{1/q}(n) the spider problem reduces to standard string trace reconstruction.
  • The sample bound holds for every deletion probability q in (0,1).
  • Reconstruction succeeds with high probability.
  • The complexity scales with the cube root of the expected number of fully surviving legs n q^d.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If leg lengths vary, a preliminary estimation of common length from observed trace lengths could precede the main algorithm.
  • The grouping technique for handling whole-leg deletions might extend to other rooted trees with multiple branches.
  • Synthetic experiments with the exact parameters would directly test whether the exponent in the sample bound is tight.
  • The approach could inform reconstruction tasks in settings like partial path data from networks where a central node connects to several chains.

Load-bearing premise

The spider has legs of identical length d and deletions are independent and identically distributed with probability q for each non-root node.

What would settle it

Generate traces from spiders with unequal leg lengths using the stated number of samples and check whether the reconstruction success rate drops below high probability.

Figures

Figures reproduced from arXiv: 2209.08166 by Alec Sun, William Yue.

Figure 1
Figure 1. Figure 1: A binary-labeled (12, 4) spider, with indexing system drawn in blue to the bottom left of each vertex. 2 Preliminaries 2.1 Rooted spiders In this section, we define the objects to be reconstructed: rooted binary￾labeled spiders X, as well as an indexing system for their nodes. Definition 2.1. Let n and d be positive integers, and for convenience as￾sume that d | n. An (n, d)-spider X consists of a single u… view at source ↗
Figure 2
Figure 2. Figure 2: An example of the deletion channel applied on a (12 [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: All possible padded traces for a certain seed (4 [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
read the original abstract

We study the trace reconstruction problem for spider graphs. Let $n$ be the number of nodes of a spider and $d$ be the length of each leg, and suppose that we are given independent traces of the spider from a deletion channel in which each non-root node is deleted with probability $q$. This is a natural generalization of the string trace reconstruction problem in theoretical computer science, which corresponds to the special case where the spider has one leg. In the regime where $d\ge \log_{1/q}(n)$, the problem can be reduced to the vanilla string trace reconstruction problem. We thus study the more interesting regime $d\le \log_{1/q}(n)$, in which entire legs of the spider are deleted with non-negligible probability. We describe an algorithm that reconstructs spiders with high probability using $\exp\left(\mathcal{O}\left(\frac{(nq^d)^{1/3}}{d^{1/3}}(\log n)^{2/3}\right)\right)$ traces. Our algorithm works for all deletion probabilities $q\in(0,1)$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper studies the trace reconstruction problem on spider graphs (a root with multiple legs of identical length d). Traces are generated by independently deleting each non-root node with probability q. When d ≥ log_{1/q}(n) the problem reduces to the standard string trace reconstruction problem. In the complementary regime d ≤ log_{1/q}(n) the authors give an algorithm that succeeds with high probability using exp(O((n q^d)^{1/3} d^{-1/3} (log n)^{2/3})) traces; the algorithm is stated to work for every q ∈ (0,1).

Significance. If the stated algorithm and analysis are correct, the result supplies the first non-trivial sample bound for trace reconstruction on any non-path graph family. The bound depends on the effective number of surviving legs (n q^d) and recovers the known string bound (up to polynomial factors in d and log n) when d is small. The clean reduction for large d and the explicit handling of whole-leg deletions are technically useful extensions of the string case.

minor comments (2)
  1. [Abstract] The dependence of the sample bound on the number of legs (implicitly n/d) could be stated more explicitly in the abstract and introduction.
  2. A short pseudocode block or high-level diagram of the small-d algorithm (leg-survival detection followed by per-leg reconstruction) would improve readability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive review and for recommending acceptance. We appreciate the assessment that the result provides the first non-trivial sample bound for trace reconstruction on non-path graphs, along with the recognition of the clean reduction for large d and the handling of whole-leg deletions.

Circularity Check

0 steps flagged

No significant circularity; derivation reduces to external string reconstruction results

full rationale

The paper's central algorithm splits into two regimes. For d ≥ log_{1/q}(n) it explicitly reduces the spider problem to the known string trace reconstruction problem (an external result with its own literature). For the complementary regime d ≤ log_{1/q}(n) it supplies an independent procedure that first detects leg survival and then reconstructs each surviving leg; the stated sample bound follows directly from combining these steps with the known string bound, without any fitted parameters, self-definitional equations, or load-bearing self-citations. No step equates a claimed prediction to its own input by construction, and all cited supporting results are independent of the present work.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper introduces no free parameters, invented entities, or non-standard axioms beyond the probabilistic deletion model stated in the abstract.

axioms (2)
  • domain assumption Each non-root node is deleted independently with fixed probability q
    This is the explicit channel model used to generate traces.
  • domain assumption All legs have identical length d
    The spider is defined with uniform leg length.

pith-pipeline@v0.9.0 · 5709 in / 1319 out tokens · 43206 ms · 2026-05-24T11:16:03.584463+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages

  1. [1]

    Global alignment of molecular sequences via ances- tral state reconstruction

    Alexandr Andoni, Constantinos Daskalakis, Avinatan Hassidim, and Sebastien Roch. Global alignment of molecular sequences via ances- tral state reconstruction. Stochastic Processes and their Applications , 122(12):3852–3874, 2012

  2. [2]

    Reconstructing strings from random traces

    Tugkan Batu, Sampath Kannan, Sanjeev Khanna, and Andrew Mc- Gregor. Reconstructing strings from random traces. Symposium on Discrete Algorithms, pages 910–918, 2004

  3. [3]

    Trace reconstruction problems in computational biology

    Vinnu Bhardwaj, Pavel A Pevzner, Cyrus Rashtchian, and Yana Sa- fonova. Trace reconstruction problems in computational biology. IEEE Transactions on Information Theory , 67(6):3295–3314, 2020

  4. [4]

    New upper bounds for trace reconstruction

    Zachary Chase. New upper bounds for trace reconstruction. arXiv preprint arXiv:2009.03296, 2020

  5. [5]

    New lower bounds for trace reconstruction

    Zachary Chase. New lower bounds for trace reconstruction. Annales de l’Institut Henri Poincar´ e, Probabilit´ es et Statistiques, 57(2):627–643, 2021

  6. [6]

    Near-optimal average-case approximate trace reconstruction from few traces

    Xi Chen, Anindya De, Chin Ho Lee, Rocco A Servedio, and Sandip Sinha. Near-optimal average-case approximate trace reconstruction from few traces. In Proceedings of the 2022 Annual ACM-SIAM Sym- posium on Discrete Algorithms (SODA) , pages 779–821. SIAM, 2022

  7. [7]

    Coded trace reconstruction

    Mahdi Cheraghchi, Ryan Gabrys, Olgica Milenkovic, and Joao Ribeiro. Coded trace reconstruction. IEEE Transactions on Information The- ory, 66(10):6084–6103, 2020

  8. [8]

    Reconstructing trees from traces

    Sami Davies, Miklos Z Racz, and Cyrus Rashtchian. Reconstructing trees from traces. In Conference On Learning Theory , pages 961–978. PMLR, 2019

  9. [9]

    Approximate trace reconstruction: Algorithms

    Sami Davies, Mikl´ os Z R´ acz, Benjamin G Schiffer, and Cyrus Rashtchian. Approximate trace reconstruction: Algorithms. In 2021 IEEE International Symposium on Information Theory (ISIT) , pages 2525–2530. IEEE, 2021

  10. [10]

    Optimal mean- based algorithms for trace reconstruction

    Anindya De, Ryan O’Donnell, and Rocco A Servedio. Optimal mean- based algorithms for trace reconstruction. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing , pages 1047–1056, 2017. 15

  11. [11]

    Trace reconstruction with varying deletion probabilities

    Lisa Hartung, Nina Holden, and Yuval Peres. Trace reconstruction with varying deletion probabilities. In 2018 Proceedings of the Fifteenth Workshop on Analytic Algorithmics and Combinatorics (ANALCO) , pages 54–61. SIAM, 2018

  12. [12]

    Lower bounds for trace reconstruction

    Nina Holden and Russell Lyons. Lower bounds for trace reconstruction. The Annals of Applied Probability , 30(2):503–525, 2020

  13. [13]

    Subpoly- nomial trace reconstruction for random strings and arbitrary deletion probability

    Nina Holden, Robin Pemantle, Yuval Peres, and Alex Zhai. Subpoly- nomial trace reconstruction for random strings and arbitrary deletion probability. Mathematical Statistics and Learning , 2(3):275–309, 2020

  14. [14]

    Trace reconstruction with constant deletion probability and related results

    Thomas Holenstein, Michael Mitzenmacher, Rina Panigrahy, and Udi Wieder. Trace reconstruction with constant deletion probability and related results. In Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms , pages 389–398. Citeseer, 2008

  15. [15]

    Capture and translocation characteristics of short branched dna labels in solid-state nanopores

    Philipp Karau and Vincent Tabard-Cossa. Capture and translocation characteristics of short branched dna labels in solid-state nanopores. ACS sensors, 3(7):1308–1315, 2018

  16. [16]

    Trace reconstruction: Generalized and parame- terized

    Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, and Soumyabrata Pal. Trace reconstruction: Generalized and parame- terized. IEEE Transactions on Information Theory , 67(6):3233–3250, 2021

  17. [17]

    Levenshtein

    V. Levenshtein. Reconstruction of objects from a minimum number of distorted patterns. Doklady Mathematics, 55(3):417–420, 1997

  18. [18]

    Trace recon- struction revisited

    Andrew McGregor, Eric Price, and Sofya Vorotnikova. Trace recon- struction revisited. In European Symposium on Algorithms, pages 689–

  19. [19]

    Circular trace reconstruction

    Shyam Narayanan and Michael Ren. Circular trace reconstruction. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Schloss Dagstuhl-Leibniz-Zentrum f¨ ur Informatik, 2021

  20. [20]

    Trace reconstruction with exp(O(n1/3)) samples

    Fedor Nazarov and Yuval Peres. Trace reconstruction with exp(O(n1/3)) samples. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing , pages 1042–1046, 2017

  21. [21]

    Improved string reconstruction over insertion-deletion channels

    Krishnamurthy Viswanathan and Ram Swaminathan. Improved string reconstruction over insertion-deletion channels. In Proceedings of 16 the nineteenth annual ACM-SIAM symposium on Discrete algorithms , pages 399–408, 2008. 17