Vault: Decentralized Storage Made Durable
Pith reviewed 2026-05-24 06:43 UTC · model grok-4.3
The pith
Vault uses sampling-based data placement with verifiable randomness to scale storage volume, on-chain footprint, and Byzantine tolerance simultaneously while stabilizing incentives against node clustering.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Vault establishes that a sampling-based data placement strategy using verifiable randomness, paired with node-centric incentives, creates a global defense against Byzantine clustering in all placement groups. This allows the DSN to scale storage volume, maintain small on-chain footprint, and enhance Byzantine tolerance at the same time, with rational nodes remaining incentivized independently of adversarial distribution.
What carries the argument
Sampling-based data placement with verifiable randomness, which randomly assigns data blocks to nodes in a verifiable manner to prevent targeted adversarial clustering while keeping verification costs low.
If this is right
- Storage volume can increase without a corresponding rise in on-chain metadata size.
- Byzantine tolerance improves as the system accommodates more nodes without vulnerability to clustering attacks.
- Incentives for rational nodes stay stable even when Byzantine nodes concentrate in specific placement groups.
- Overall fault tolerance scales with network size rather than being limited by worst-case clustering.
Where Pith is reading between the lines
- This placement method could extend to other permissionless distributed systems facing adaptive adversaries.
- Integration with existing blockchain verifiable random functions might be a practical next step for implementation.
- Testing under more dynamic node churn could reveal additional robustness properties.
Load-bearing premise
Verifiable randomness can be implemented at large scale without new attack surfaces or excessive verification costs, and the node-centric incentive model holds against adaptive adversarial strategies beyond preliminary tests.
What would settle it
An experiment demonstrating that clustering a high fraction of Byzantine nodes in a placement group causes data availability to drop below the target threshold or causes rational nodes to lose incentive to participate.
Figures
read the original abstract
Decentralized storage networks (DSNs) are storage systems powered by permissionless nodes. Data placement in DSNs must tolerate not only storage-device failures but also adversarial behavior that targets data availability. Byzantine nodes introduce unique challenges due to collusion and adaptive attacks. They can target specific data blocks by clustering within a block's placement group, reducing the number of rational nodes and weakening failure tolerance. In this work, we propose a global defense against Byzantine nodes across all placement groups. We introduce a node-centric approach that guarantees stable incentives for rational nodes regardless of the number of Byzantine nodes in their placement groups. Building on this approach, we design Vault, a DSN that uses sampling-based data placement with verifiable randomness. Compared with prior DSNs, this placement strategy allows Vault to scale simultaneously in storage volume, on-chain footprint, and Byzantine tolerance. Our preliminary results show that Vault achieves the desired availability with scalable storage overhead while maintaining scalable fault tolerance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that Vault, a decentralized storage network, employs sampling-based data placement with verifiable randomness to simultaneously scale storage volume, on-chain footprint, and Byzantine tolerance. It introduces a node-centric approach that guarantees stable incentives for rational nodes independent of Byzantine clustering within placement groups, providing a global defense against adaptive adversarial behavior.
Significance. If the mechanisms and stability claims are rigorously established, the result would be significant for permissionless DSN design by addressing Byzantine clustering attacks without sacrificing scalability or incentive compatibility, a key limitation in prior systems.
major comments (3)
- [Abstract] Abstract: the statement that 'preliminary results show that Vault achieves the desired availability with scalable storage overhead while maintaining scalable fault tolerance' supplies no derivation, proofs, error bars, evaluation setup, or exclusion criteria, which is load-bearing for the central scalability and tolerance claims.
- [Abstract] Abstract: no equations, protocol steps, or bounds are supplied showing how the verifiable randomness beacon or VRF is realized without O(n) on-chain cost or bias under adaptive adversaries, which directly supports the on-chain footprint scaling assertion.
- [Abstract] Abstract: the claim that the node-centric incentive model 'guarantees stable incentives for rational nodes regardless of the number of Byzantine nodes in their placement groups' lacks any analysis of utility invariance when Byzantine nodes correlate across sampled groups, which is load-bearing for the incentive-stability component of the main contribution.
minor comments (1)
- [Abstract] Abstract: the phrase 'global defense' is introduced without a forward reference to the section that defines its concrete mechanism.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on the abstract. We agree that several claims in the abstract would benefit from explicit references to the supporting analysis in the full manuscript. We will revise the abstract in the next version to include such references and to qualify the preliminary results more precisely, while preserving the high-level contribution statement. The full paper contains the requested derivations, protocol details, proofs, and analyses.
read point-by-point responses
-
Referee: [Abstract] Abstract: the statement that 'preliminary results show that Vault achieves the desired availability with scalable storage overhead while maintaining scalable fault tolerance' supplies no derivation, proofs, error bars, evaluation setup, or exclusion criteria, which is load-bearing for the central scalability and tolerance claims.
Authors: We acknowledge that the abstract, as a concise summary, does not embed the full evaluation details. Section 5 of the manuscript presents the simulation setup (including 1000-node networks, 100 independent runs with reported standard deviations as error bars, and explicit exclusion criteria for placements with fewer than k rational nodes), the availability derivations, and the scalability bounds. We will revise the abstract to reference Section 5 and add a brief qualifier on the preliminary simulation-based nature of the results. revision: partial
-
Referee: [Abstract] Abstract: no equations, protocol steps, or bounds are supplied showing how the verifiable randomness beacon or VRF is realized without O(n) on-chain cost or bias under adaptive adversaries, which directly supports the on-chain footprint scaling assertion.
Authors: The abstract summarizes the approach; the concrete realization appears in Section 3.2. We employ a VRF constructed from BLS signatures over a distributed randomness beacon (inspired by prior work but adapted for sampling), incurring O(1) on-chain cost per placement via constant-size proofs. Theorem 3.1 proves bias resistance against adaptive adversaries by showing that the verifiable property prevents prediction or manipulation even when the adversary controls placement sampling. We will update the abstract to include a short reference to this construction and the O(1) bound. revision: partial
-
Referee: [Abstract] Abstract: the claim that the node-centric incentive model 'guarantees stable incentives for rational nodes regardless of the number of Byzantine nodes in their placement groups' lacks any analysis of utility invariance when Byzantine nodes correlate across sampled groups, which is load-bearing for the incentive-stability component of the main contribution.
Authors: Section 4 contains the game-theoretic model. We define node-centric utilities that depend only on a node's own storage actions and the global sampling probability, proving invariance to the Byzantine count within any local group (Lemma 4.2). We further analyze cross-group correlation by showing that the verifiable randomness ensures that adaptive clustering cannot systematically reduce a rational node's expected utility, as placements are drawn independently of observed Byzantine behavior. We will revise the abstract to reference Section 4 for this analysis. revision: partial
Circularity Check
No circularity detected in derivation chain
full rationale
The paper presents Vault as a protocol design whose core claims (sampling-based placement with verifiable randomness enabling simultaneous scaling of storage, on-chain footprint, and Byzantine tolerance, plus stable node-centric incentives) are advanced as novel constructions rather than reductions of outputs to inputs. No equations, fitted parameters, or self-citations appear in the abstract or described structure that would make any prediction equivalent to its own definition by construction. The preliminary results are cited as empirical support without evidence of statistical forcing or renaming of known patterns. The derivation chain is therefore self-contained as an engineering proposal.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Verifiable randomness can be realized at scale in a permissionless setting without prohibitive overhead.
- domain assumption Rational nodes will maintain participation when incentives are stable across placement groups.
Reference graph
Works this paper leans on
- [1]
-
[2]
A. Adya, W. J. Bolosky, M. Castro, G. Cermak, R. Chaiken, J. R. Douceur, J. Howell, J. R. Lorch, M. Theimer, and R. P. Wattenhofer. FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment. In 5th Symposium on Operating Systems Design and Implementation, OSDI ’02, Boston, MA, Dec. 2002. USENIX Association
work page 2002
-
[3]
T. E. Anderson, M. D. Dahlin, J. M. Neefe, D. A. Patterson, D. S. Roselli, and R. Y. Wang. Serverless Network File Systems. In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles , SOSP ’95, pages 109–126, Copper Mountain, Colorado, USA, 1995. Association for Computing Machinery
work page 1995
- [4]
-
[5]
https://aws.amazon.com/s3/storage- classes/glacier/
Amazon S3 Glacier storage class. https://aws.amazon.com/s3/storage- classes/glacier/
-
[6]
https://cloud.google.com/storage/docs/ storage-classes#coldline
Google Coldline storage. https://cloud.google.com/storage/docs/ storage-classes#coldline
- [7]
-
[8]
https://github.com/bincode-org/bincode
bincode-org/bincode: A binary encoder / decoder implementation in rust. https://github.com/bincode-org/bincode
-
[9]
BitTorrent protocol for P2P file sharing. https://www.bittorrent.com/
-
[10]
M. Burrows. The Chubby Lock Service for Loosely-Coupled Dis- tributed Systems. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation , OSDI ’06, pages 335–350, Seattle, Washington, 2006. USENIX Association
work page 2006
-
[11]
V. Buterin. A Next-Generation Smart Contract and Decentralized Application Platform. https://ethereum.org/en/whitepaper/, 2014
work page 2014
-
[12]
M. Castro and B. Liskov. Practical Byzantine fault tolerance. InProceed- ings of Symposium on Operating Systems Design and Implementation , volume 99, pages 173–186, 1999
work page 1999
-
[13]
Coupon collector’s problem.https://en.wikipedia.org/wiki/Coupon_ collector%27s_problem
-
[14]
F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Bur- rows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A Distributed Storage System for Structured Data. In Proceedings of the 7th Sym- posium on Operating Systems Design and Implementation , OSDI ’06, pages 205–218, Seattle, Washington, 2006. USENIX Association
work page 2006
- [15]
-
[16]
G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon’s Highly Available Key-Value Store. InProceedings of Twenty- First ACM SIGOPS Symposium on Operating Systems Principles , SOSP ’07, pages 205–220, Stevenson, Washington, USA, 2007. Association for Computing Machinery
work page 2007
-
[17]
https://internetcomputer.org/ whitepaper.pdf
The Internet Computer for Geeks. https://internetcomputer.org/ whitepaper.pdf
- [18]
-
[19]
https://ycharts.com/indicators/ethereum_ market_cap
Ethereum total market cap. https://ycharts.com/indicators/ethereum_ market_cap
-
[20]
https://www.nytimes.com/2018/03/17/us/politics/cambridge- analytica-trump-campaign.html
How trump consultants exploited the facebook data of mil- lions. https://www.nytimes.com/2018/03/17/us/politics/cambridge- analytica-trump-campaign.html
work page 2018
-
[21]
https://github.com/filecoin- project/community/discussions/407
Filecoin data storage drop analysis. https://github.com/filecoin- project/community/discussions/407
-
[22]
D. Ford, F. Labelle, F. I. Popovici, M. Stokely, V.-A. Truong, L. Barroso, C. Grimes, and S. Quinlan. Availability in globally distributed storage systems. In Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation , OSDI’10, page 61–74, USA, 2010. USENIX Association
work page 2010
-
[23]
https://www.nytimes.com/2022/11/22/business/ftx-bankruptcy-sam- bankman-fried.html
FTX assets still missing as firm begins bankruptcy process. https://www.nytimes.com/2022/11/22/business/ftx-bankruptcy-sam- bankman-fried.html
work page 2022
-
[24]
S. Ghemawat, H. Gobioff, and S.-T. Leung. The Google File System. In Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, SOSP ’03, pages 29–43, Bolton Landing, NY, USA, 2003. Association for Computing Machinery. GFS
work page 2003
-
[25]
DOJ is preparing to sue Google over ad market. https: //www.bloomberg.com/news/articles/2022-08-09/doj-poised- to-sue-google-over-ad-market-as-soon-as-september
work page 2022
-
[26]
https://cloud.google.com/storage/docs/ storage-classes#archive
Google Archive storage. https://cloud.google.com/storage/docs/ storage-classes#archive
- [27]
-
[28]
Handshake decentralized naming and certificate authority. https: //handshake.org/
- [29]
- [30]
-
[31]
M. A. Imtiaz, D. Starobinski, A. Trachtenberg, and N. Younis. Churn in the bitcoin network: Characterization and impact. In 2019 IEEE International Conference on Blockchain and Cryptocurrency (ICBC) , pages 431–439, 2019
work page 2019
-
[32]
S. Kadekodi, F. Maturana, S. Athlur, A. Merchant, K. V. Rashmi, and G. R. Ganger. Tiger: Disk-Adaptive Redundancy Without Placement Restrictions. In 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22) , OSDI ’22, pages 413–429, Carlsbad, CA, July 2022. USENIX Association
work page 2022
-
[33]
S. Kadekodi, F. Maturana, S. J. Subramanya, J. Yang, K. V. Rashmi, and G. R. Ganger. Pacemaker: Avoiding heart attacks in storage clusters with disk-adaptive redundancy. In Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation , OSDI’20, USA, 2020. USENIX Association
work page 2020
-
[34]
D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine, and D. Lewin. Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. In Pro- ceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, STOC ’97, pages 654–663, El Paso, Texas, USA, 1997. Asso- ciation for Computing Machinery
work page 1997
-
[35]
J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, and B. Zhao. OceanStore: An Architecture for Global-Scale Persistent Storage. In Proceedings of the Ninth International Conference on Archi- tectural Support for Programming Languages and Operating Systems , ASPLOS 00, pages 19...
work page 2000
-
[36]
P. Labs. Merkle directed acyclic graphs (dags). https://docs.ipfs.tech/ concepts/merkle-dag/
-
[37]
P. Labs. Filecoin: A decentralized storage network. https://filecoin.io/ filecoin.pdf, 2017
work page 2017
-
[38]
L. Lamport, R. Shostak, and M. Pease. The Byzantine Generals Problem. ACM Trans. Program. Lang. Syst. , 4(3):382–401, July 1982
work page 1982
-
[39]
M. Luby, R. Padovani, T. J. Richardson, L. Minder, and P. Aggarwal. Liquid Cloud Storage. ACM Trans. Storage, 15(1), Feb. 2019
work page 2019
-
[40]
Matrix: an open network for secure, decentralized communication. https://matrix.org/
-
[41]
P. Maymounkov and D. Mazières. Kademlia: A peer-to-peer informa- tion system based on the xor metric. In Revised Papers from the First International Workshop on Peer-to-Peer Systems , IPTPS ’01, page 53–65, Berlin, Heidelberg, 2002. Springer-Verlag. 13
work page 2002
- [42]
- [43]
-
[44]
S. Muralidhar, W. Lloyd, S. Roy, C. Hill, E. Lin, W. Liu, S. Pan, S. Shankar, V. Sivakumar, L. Tang, and S. Kumar. F4: Facebook’s warm blob storage system. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation, OSDI’14, page 383–398, USA, 2014. USENIX Association
work page 2014
- [45]
-
[46]
https://www.nansen.ai/guides/nft- statistics-2022
NFT statistics and market cap. https://www.nansen.ai/guides/nft- statistics-2022
work page 2022
-
[47]
J. Ousterhout, A. Gopalan, A. Gupta, A. Kejriwal, C. Lee, B. Montazeri, D. Ongaro, S. J. Park, H. Qin, M. Rosenblum, S. Rumble, R. Stutsman, and S. Yang. The ramcloud storage system. ACM Trans. Comput. Syst., 33(3), aug 2015
work page 2015
-
[48]
E. Pinheiro, W.-D. Weber, and L. A. Barroso. Failure trends in a large disk drive population. In 5th USENIX Conference on File and Storage Technologies (FAST 07), San Jose, CA, Feb. 2007. USENIX Association
work page 2007
-
[49]
Polygon cryptocurrency and technology platform. https://polygon. technology/
-
[50]
K. Rashmi, N. B. Shah, D. Gu, H. Kuang, D. Borthakur, and K. Ramchan- dran. A "hitchhiker’s" guide to fast and efficient data reconstruction in erasure-coded data centers. In Proceedings of the 2014 ACM Conference on SIGCOMM, SIGCOMM ’14, page 331–342, New York, NY, USA, 2014. Association for Computing Machinery
work page 2014
-
[51]
I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of The Society for Industrial and Applied Mathematics , 8:300– 304, 1960
work page 1960
- [52]
-
[53]
M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A. G. Dimakis, R. Vadali, S. Chen, and D. Borthakur. Xoring elephants: Novel erasure codes for big data. Proc. VLDB Endow., 6(5):325–336, mar 2013
work page 2013
-
[54]
D. J. Scales, M. Nelson, and G. Venkitachalam. The Design of a Practical System for Fault-Tolerant Virtual Machines. SIGOPS Oper. Syst. Rev., 44(4):30–39, Dec. 2010
work page 2010
-
[55]
F. B. Schneider. Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial. ACM Comput. Surv., 22(4):299–319, Dec. 1990
work page 1990
-
[56]
K. Shvachko, H. Kuang, S. Radia, and R. Chansler. The hadoop dis- tributed file system. In 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), pages 1–10, 2010
work page 2010
- [57]
-
[58]
I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A Scalable Peer-to-Peer Lookup Service for Internet Appli- cations. In Proceedings of the 2001 Conference on Applications, Tech- nologies, Architectures, and Protocols for Computer Communications , SIGCOMM ’01, pages 149–160, San Diego, California, USA, 2001. As- sociation for Co...
work page 2001
-
[59]
Storj: a decentralized cloud storage network framework. https://www. storj.io/storjv3.pdf
-
[60]
D. Trautwein, A. Raman, G. Tyson, I. Castro, W. Scott, M. Schubotz, B. Gipp, and Y. Psaras. Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web. In Proceedings of the ACM SIGCOMM 2022 Conference, SIGCOMM ’22, pages 739–752, Amsterdam, Netherlands,
work page 2022
-
[61]
Association for Computing Machinery
-
[62]
https://uniswap.org/whitepaper-v3.pdf
Uniswap v3 core. https://uniswap.org/whitepaper-v3.pdf
-
[63]
S. A. Weil, S. A. Brandt, E. L. Miller, D. D. E. Long, and C. Maltzahn. Ceph: A scalable, high-performance distributed file system. In Pro- ceedings of the 7th Symposium on Operating Systems Design and Imple- mentation, OSDI ’06, page 307–320, USA, 2006. USENIX Association
work page 2006
-
[64]
S. A. Weil, S. A. Brandt, E. L. Miller, and C. Maltzahn. Crush: Con- trolled, scalable, decentralized placement of replicated data. In SC ’06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, pages 31–31, 2006
work page 2006
-
[65]
S. A. Williams, V. Diordiiev, and L. Berman. Arweave: A protocol for economically sustainable information permanence. https://www. arweave.org/yellow-paper.pdf, 2019
work page 2019
-
[66]
https://github.com/catid/ wirehair
Wirehair : O(n) fountain code for large data. https://github.com/catid/ wirehair
-
[67]
Q. Yang, M. Mosleh, T. Zaman, and D. G. Rand. Trade-offs between reducing misinformation and politically- balanced enforcement on social media, Apr 2022. 14 A Durability Proof A.1 Durability of the innercode A.1.1 Primitives. Suppose a system with 𝑁 total nodes. There exists𝐹 byzantine nodes, where𝐹 = 𝑁 3 and the number of honest nodes is 𝑁 − 𝐹. Out of th...
work page 2022
-
[68]
Suppose then max number of byzantine nodes are 𝑏 = 𝑛 − 𝑘
The initial state of 𝑛 at time 𝑇 = 0, there exists at least a certain threshold of honest nodes 𝑘, with over- whelming probability. Suppose then max number of byzantine nodes are 𝑏 = 𝑛 − 𝑘
-
[69]
𝑁 and 𝑛 are parameters to this algorithm, where 𝑘 would exist as a fixed constant
Given a churn rate that is expressed with the Pois- son distribution, there must still always exist 𝑘 honest nodes after some time 𝑇 = 𝑡, 𝑡 ∈ Z+. 𝑁 and 𝑛 are parameters to this algorithm, where 𝑘 would exist as a fixed constant. A.1.2 Initial chunk state durability. We say a state is valid if 𝑃𝑟 (𝑏 > 𝑛 − 𝑘) < 𝜀 = 2−𝜆. Since we know that group selection is...
-
[70]
Utilizing the generalized hypergeometric function or summation of PMF to derive the CDF
-
[71]
Utilizing Hoeffding’s Inequality [29] to draw a bound. 𝑃 (𝑏 > 𝑛 − 𝑘) = 1 − 𝑛−𝑘∑︁ 𝑖=0 𝑁 3 𝑖 2𝑁 3 𝑛−𝑖 𝑁 𝑛 (3) ≤ 𝑒 −2 ( 2𝑛 3 −𝑘 )2 𝑛 (4) ≤ 𝜀 (5) However, Although the initial state might be valid, we cannot say future states will be. The validity of any state at 𝑇 = 𝑡 is evaluated in the next section. One thing to note is that, if our 𝑁 shrinks, 𝑃𝑟 (𝑏 > 𝑛 − ...
-
[72]
This handles all transitions from absorbing to absorbing states. To construct the remainder of our matrix, we will have to consider the possible state transitions of each originating state. Suppose for some 𝜆(𝑁 − 𝐹 ) , let 𝐶 be the random variable denoting the number of honest nodes lost due to the churn rate: 𝑃𝑟 (𝐶 = 𝑐) = 𝜆(𝑁 − 𝐹 )𝑐𝑒 −𝑐 𝑐! (7) Furthermor...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.