DEMUX: Boundary-Aware Multi-Scale Traffic Demixing for Multi-Tab Website Fingerprinting
Pith reviewed 2026-05-10 08:56 UTC · model grok-4.3
The pith
DEMUX demixes interleaved multi-tab traffic by preserving boundaries and associating dispersed fragments across scales.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
No existing method meets all three structural requirements for multi-tab demixing at once. DEMUX satisfies them with a Boundary Preserving Aggregation Module that uses overlapping windows and joint packet-burst features, a Multi-Scale Parallel CNN with parallel branches for heterogeneous patterns, and a two-stage Transformer encoder equipped with Rotary Positional Embedding for cross-window fragment association. The same aggregation module works as a plug-and-play preprocessor. In closed-world experiments with five concurrent tabs, the method reaches a P@5 of 0.943 and MAP@5 of 0.961, exceeding the strongest baseline by 9.2 and 6.2 percentage points.
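The P@5 and MAP@5 figures quoted above can be read as standard ranking metrics over the set of concurrently visited sites. Below is a minimal sketch under one plausible reading: each trace carries k true site labels and the model emits a ranked list of predictions. The exact definitions used in the paper may differ; this is illustrative only.

```python
# Hedged sketch of P@k and MAP@k for multi-tab WF evaluation.
# Assumption (not from the paper): each trace has a set of true labels
# and the model produces a ranked prediction list.

def precision_at_k(ranked, truth, k=5):
    """Fraction of the top-k ranked predictions that are true labels."""
    return sum(1 for p in ranked[:k] if p in truth) / k

def average_precision_at_k(ranked, truth, k=5):
    """Average of precision values at each rank where a true label appears."""
    hits, score = 0, 0.0
    for i, p in enumerate(ranked[:k], start=1):
        if p in truth:
            hits += 1
            score += hits / i
    return score / min(len(truth), k)

def map_at_k(all_ranked, all_truth, k=5):
    """Mean of per-trace average precision over a whole test set."""
    aps = [average_precision_at_k(r, t, k) for r, t in zip(all_ranked, all_truth)]
    return sum(aps) / len(aps)

# Toy example: 5 true sites, top-5 predictions contain 4 of them.
ranked = ["a", "b", "x", "c", "d"]
truth = {"a", "b", "c", "d", "e"}
print(precision_at_k(ranked, truth))           # 0.8
print(average_precision_at_k(ranked, truth))   # 0.71
```

A trace where all five sites appear in the top five ranks would score P@5 = MAP@5 = 1.0, so the reported 0.943 / 0.961 means the model recovers nearly all concurrent tabs near the top of its ranking.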
What carries the argument
Boundary Preserving Aggregation Module that performs overlapping window partitioning together with joint packet-level and burst-level feature extraction, paired with a Multi-Scale Parallel CNN and a two-stage Transformer encoder using Rotary Positional Embedding.
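The boundary-preservation idea can be illustrated with a toy sketch: overlapping windows over a packet-direction sequence, plus a burst encoding per window. The window size, stride, and burst representation here are hypothetical illustrations, not the paper's actual parameters.

```python
import numpy as np

def overlapping_windows(direction_seq, win=512, stride=256):
    """Partition a packet-direction sequence (+1/-1) into overlapping windows.

    Because stride < win, a burst straddling one window boundary is still
    seen whole in the neighbouring window -- the boundary-integrity idea
    the paper attributes to its aggregation module.
    """
    seq = np.asarray(direction_seq)
    starts = range(0, max(len(seq) - win, 0) + 1, stride)
    return [seq[s:s + win] for s in starts]

def burst_features(window):
    """Collapse consecutive same-direction packets into signed burst lengths
    (a common burst-level feature in WF work; the paper's exact joint
    packet/burst representation may differ)."""
    bursts, run = [], 1
    for prev, cur in zip(window, window[1:]):
        if cur == prev:
            run += 1
        else:
            bursts.append(prev * run)
            run = 1
    bursts.append(window[-1] * run)
    return bursts

# Toy trace: 2 outgoing, 3 incoming, 1 outgoing packet.
trace = [1, 1, -1, -1, -1, 1]
print(burst_features(np.asarray(trace)))               # [2, -3, 1]
print(len(overlapping_windows(trace, win=4, stride=2)))  # 2
```

With a disjoint (non-overlapping) partition, the incoming burst split at index 4 would be cut in two and neither segment would see it whole; with stride 2 the second window covers it intact.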
Load-bearing premise
The three structural requirements for multi-tab demixing are both necessary and jointly sufficient, and the proposed modules meet them without creating new failure modes or overfitting to particular datasets.
What would settle it
Collect a new set of five-tab closed-world traces under different network conditions or tab-opening orders and rerun the evaluation; if DEMUX no longer exceeds the strongest baseline by several points, the claim that the modules jointly solve the demixing problem would be undermined.
Original abstract
Website fingerprinting (WF) attacks infer the websites visited by users from encrypted traffic in anonymous networks such as Tor. Existing deep learning methods achieve high accuracy under the single-tab assumption but degrade substantially when users open multiple tabs concurrently, producing interleaved traffic that transforms WF into an implicit demixing problem. We identify three structural requirements for effective multi-tab demixing, namely signal integrity at segment boundaries, multi-scale local modeling, and relative temporal association of dispersed fragments, and show that no prior method satisfies all three simultaneously. We propose DEMUX, a framework that addresses these requirements through three tightly coupled components. A Boundary Preserving Aggregation Module employs overlapping window partitioning with joint packet-level and burst-level feature extraction. A Multi-Scale Parallel CNN captures heterogeneous temporal patterns via parallel branches. A two-stage Transformer encoder with Rotary Positional Embedding enables robust cross-window fragment association. The Boundary Preserving Aggregation Module additionally serves as a plug-and-play preprocessor that consistently improves existing baselines without architectural modification. Extensive experiments across closed-world, open-world, defense-augmented, dynamic-tab, and cross-configuration settings demonstrate that DEMUX achieves state-of-the-art performance. In the challenging closed-world 5-tab setting, DEMUX attains a P@5 of 0.943 and MAP@5 of 0.961, outperforming the strongest baseline by 9.2 and 6.2 percentage points respectively, confirming its strong robustness in complex multi-tab demixing scenarios.
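The cross-window association property leans on a well-known feature of Rotary Positional Embedding (RoFormer): after rotation, the dot product between two token vectors depends only on their relative offset, not their absolute positions. A minimal NumPy sketch of RoPE in its standard half-split form follows; the paper's two-stage encoder is not reproduced here.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary positional embedding to x of shape (seq_len, dim).

    Channel i is paired with channel i + dim//2 and the pair is rotated by
    a position-dependent angle. The dot product between two rotated vectors
    then depends only on their relative offset -- the property that lets a
    Transformer associate fragments of one page dispersed across windows.
    """
    seq_len, dim = x.shape
    half = dim // 2
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    freqs = base ** (-np.arange(half) / half)  # (half,) decreasing frequencies
    angles = pos * freqs                       # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

# Relative-offset property: the same two vectors placed at positions
# (0, 1) and at positions (2, 3) yield identical dot products.
rng = np.random.default_rng(0)
v, w = rng.normal(size=8), rng.normal(size=8)
out = rope(np.stack([v, w, v, w]))
print(np.isclose(out[0] @ out[1], out[2] @ out[3]))  # True
```

Rotation also preserves vector norms, so RoPE injects position without rescaling the token representations.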
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that multi-tab website fingerprinting on Tor traffic is an implicit demixing problem requiring three structural properties (boundary signal integrity, multi-scale local modeling, and relative temporal association of fragments) that no prior method satisfies simultaneously. It proposes DEMUX with three tightly coupled components—a Boundary Preserving Aggregation module using overlapping windows, a Multi-Scale Parallel CNN, and a two-stage RoPE Transformer—to address these properties, plus a plug-and-play preprocessor that improves existing baselines. Experiments across closed-world, open-world, defense, dynamic-tab, and cross-configuration settings report SOTA results, including P@5 of 0.943 and MAP@5 of 0.961 in the closed-world 5-tab case (9.2 and 6.2 pp gains over the strongest baseline).
Significance. If the reported gains are robust and attributable to the proposed design choices, the work meaningfully advances WF attacks under realistic concurrent-tab conditions where single-tab assumptions break down. The preprocessor's ability to improve unmodified baselines is a practical strength that could see adoption, and the explicit mapping of requirements to modules provides a clearer design rationale than purely empirical prior approaches.
major comments (2)
- [Experiments] Experiments section: the central attribution of the 9.2 pp P@5 gain in the closed-world 5-tab setting to the three modules satisfying the identified requirements is not supported by ablation or sensitivity studies. No component-wise removal experiments, capacity-matched controls, or controlled variation of window overlap / branch count are reported, leaving open the possibility that gains arise from overall model size, training protocol, or dataset-specific factors rather than the claimed structural fixes.
- [Section 4] Section 4 (or equivalent, describing the preprocessor): while the Boundary Preserving Aggregation module is stated to improve existing baselines as a plug-and-play component, the manuscript provides no quantitative breakdown of which baselines were tested, the exact magnitude of improvement per baseline, or whether the improvement holds after hyperparameter re-tuning of the baselines themselves.
minor comments (2)
- [Abstract and Experiments] The abstract and experimental tables should explicitly state the number of independent runs, random seeds, and whether hyperparameter search was performed jointly or per baseline to allow assessment of selection effects.
- [Methods] Notation for the overlapping window partitioning and the two-stage Transformer should be introduced with a small diagram or pseudocode in the methods section for clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and the recommendation for major revision. We address the two major comments point by point below, acknowledging where the manuscript is currently lacking and outlining the planned revisions.
Point-by-point responses
-
Referee: [Experiments] Experiments section: the central attribution of the 9.2 pp P@5 gain in the closed-world 5-tab setting to the three modules satisfying the identified requirements is not supported by ablation or sensitivity studies. No component-wise removal experiments, capacity-matched controls, or controlled variation of window overlap / branch count are reported, leaving open the possibility that gains arise from overall model size, training protocol, or dataset-specific factors rather than the claimed structural fixes.
Authors: We agree that the manuscript does not contain explicit ablation or sensitivity studies that would allow direct attribution of the reported gains to the three proposed modules. While the experiments demonstrate consistent outperformance across closed-world, open-world, defense-augmented, dynamic-tab, and cross-configuration settings, this does not fully exclude contributions from model capacity or training details. In the revised manuscript we will add (i) component-wise removal ablations, (ii) capacity-matched control models, and (iii) sensitivity analyses on window overlap and branch count to strengthen the causal link between the identified requirements and the observed improvements. revision: yes
-
Referee: [Section 4] Section 4 (or equivalent, describing the preprocessor): while the Boundary Preserving Aggregation module is stated to improve existing baselines as a plug-and-play component, the manuscript provides no quantitative breakdown of which baselines were tested, the exact magnitude of improvement per baseline, or whether the improvement holds after hyperparameter re-tuning of the baselines themselves.
Authors: We acknowledge that the manuscript currently states the plug-and-play benefit without providing a per-baseline quantitative breakdown or results after hyperparameter re-tuning. Although internal evaluations supported the claim of consistent improvement, the presentation lacks the requested detail. In the revision we will insert a dedicated table and accompanying text that (a) lists all baselines evaluated with the preprocessor, (b) reports the exact performance deltas for each, and (c) includes results obtained after re-tuning the baselines' own hyperparameters. revision: yes
Circularity Check
No significant circularity; empirical claims rest on held-out evaluation against external baselines.
full rationale
The paper states three structural requirements for multi-tab demixing, designs three modules to address them, and reports empirical gains (e.g., +9.2 pp P@5 in closed-world 5-tab) on held-out test sets versus prior baselines. No equations, fitted parameters, or first-principles derivations are presented; the architecture is trained end-to-end and the preprocessor is shown to improve unmodified baselines. No self-citation chain, self-definitional steps, or renaming of known results appears in the load-bearing claims. The central result is therefore externally falsifiable and does not reduce to its own inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- Hyperparameters of the CNN branches, transformer layers, and window overlap sizes
axioms (1)
- Domain assumption: The three structural requirements (boundary integrity, multi-scale modeling, fragment association) are the primary bottlenecks in prior multi-tab WF methods.
Reference graph
Works this paper leans on
- [1] R. Dingledine, N. Mathewson, and P. Syverson, "Tor: The second-generation onion router," 2004.
- [2] S.-J. Xu, G.-G. Geng, X.-B. Jin, D.-J. Liu, and J. Weng, "Seeing traffic paths: Encrypted traffic classification with path signature features," IEEE Transactions on Information Forensics and Security, vol. 17, pp. 2166–2181, 2022.
- [3] A. Panchenko, L. Niessen, A. Zinnen, and T. Engel, "Website fingerprinting in onion routing based anonymization networks," in Proceedings of the 10th Annual ACM Workshop on Privacy in the Electronic Society, 2011, pp. 103–114.
- [4] J. Hayes and G. Danezis, "k-fingerprinting: A robust scalable website fingerprinting technique," in 25th USENIX Security Symposium (USENIX Security 16), 2016, pp. 1187–1203.
- [5] A. Panchenko, F. Lanze, J. Pennekamp, T. Engel, A. Zinnen, M. Henze, and K. Wehrle, "Website fingerprinting at internet scale," in NDSS, vol. 1, 2016, p. 23477.
- [6] S. E. Oh, S. Sunkam, and N. Hopper, "p-FP: Extraction, classification, and prediction of website fingerprints with deep learning," Proceedings on Privacy Enhancing Technologies, vol. 3, pp. 191–209, 2019.
- [7] W. De la Cadena, A. Mitseva, J. Hiller, J. Pennekamp, S. Reuter, J. Filter, T. Engel, K. Wehrle, and A. Panchenko, "TrafficSliver: Fighting website fingerprinting attacks with traffic splitting," in Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, 2020, pp. 1971–1985.
- [8] P. Sirinam, M. Imani, M. Juarez, and M. Wright, "Deep fingerprinting: Undermining website fingerprinting defenses with deep learning," in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018, pp. 1928–1943.
- [9] V. Rimmer, D. Preuveneers, M. Juarez, T. Van Goethem, and W. Joosen, "Automated website fingerprinting through deep learning," in Network and Distributed System Security Symposium, IEEE Internet Society, 2018, pp. 1–15.
- [10] S. Bhat, D. Lu, A. Kwon, and S. Devadas, "Var-CNN: A data-efficient website fingerprinting attack based on deep learning," arXiv preprint arXiv:1802.10215, 2018.
- [11] W. Cui, T. Chen, and E. Chan-Tin, "More realistic website fingerprinting using deep learning," in 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS), IEEE, 2020, pp. 333–343.
- [12] Y. Wang, H. Xu, Z. Guo, Z. Qin, and K. Ren, "SNWF: Website fingerprinting attack by ensembling the snapshot of deep learning," IEEE Transactions on Information Forensics and Security, vol. 17, pp. 1214–1226, 2022.
- [13] Z. Ling, G. Xiao, W. Wu, X. Gu, M. Yang, and X. Fu, "Towards an efficient defense against deep learning based website fingerprinting," in IEEE INFOCOM 2022 - IEEE Conference on Computer Communications, IEEE, 2022, pp. 310–319.
- [14] H. Zou, J. Su, Z. Wei, S. Chen, C. Yang, and M. Chen, "Toward an effective few-shot website fingerprinting attack with quadruplet networks and deep local fingerprinting features," IEEE Transactions on Dependable and Secure Computing, 2025.
- [15] J. Li, D. Wang, Y. Liu, Y. Gao, X. Zhang, Z. Lin, X. Ma, X. Luo, and X. Guan, "Cross-environmental website fingerprinting," in IEEE INFOCOM 2025 - IEEE Conference on Computer Communications, IEEE, 2025, pp. 1–10.
- [16] Z. Guan, G. Xiong, G. Gou, Z. Li, M. Cui, and C. Liu, "BAPM: Block attention profiling model for multi-tab website fingerprinting attacks on Tor," in Proceedings of the 37th Annual Computer Security Applications Conference, 2021, pp. 248–259.
- [17] X. Deng, Q. Yin, Z. Liu, X. Zhao, Q. Li, M. Xu, K. Xu, and J. Wu, "Robust multi-tab website fingerprinting attacks in the wild," in 2023 IEEE Symposium on Security and Privacy (SP), IEEE, 2023, pp. 1005–1022.
- [18] Z. Jin, T. Lu, S. Luo, and J. Shang, "Transformer-based model for multi-tab website fingerprinting attack," in Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023, pp. 1050–1064.
- [19] X. Deng, X. Zhao, Q. Yin, Z. Liu, Q. Li, M. Xu, K. Xu, and J. Wu, "Towards robust multi-tab website fingerprinting," arXiv preprint arXiv:2501.12622, 2025.
- [20] T. Wang, X. Cai, R. Nithyanand, R. Johnson, and I. Goldberg, "Effective attacks and provable defenses for website fingerprinting," in 23rd USENIX Security Symposium (USENIX Security 14), 2014, pp. 143–157.
- [21] M. Juárez, M. Imani, M. Perry, C. Díaz, and M. Wright, "WTF-PAD: Toward an efficient website fingerprinting defense for Tor," CoRR, abs/1512.00524, 2015.
- [22] M. S. Rahman, P. Sirinam, N. Mathews, K. G. Gangadhara, and M. Wright, "Tik-Tok: The utility of packet timing in website fingerprinting attacks," arXiv preprint arXiv:1902.06421, 2019.
- [23] P. Sirinam, N. Mathews, M. S. Rahman, and M. Wright, "Triplet fingerprinting: More practical and portable website fingerprinting with n-shot learning," in Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, 2019, pp. 1131–1148.
- [24] G. Cherubin, R. Jansen, and C. Troncoso, "Online website fingerprinting: Evaluating website fingerprinting attacks on Tor in the real world," in 31st USENIX Security Symposium (USENIX Security 22), 2022, pp. 753–770.
- [25] A. Bahramali, A. Bozorgi, and A. Houmansadr, "Realistic website fingerprinting by augmenting network traces," in Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023, pp. 1035–1049.
- [26] M. Shen, K. Ji, Z. Gao, Q. Li, L. Zhu, and K. Xu, "Subverting website fingerprinting defenses with robust traffic representation," in 32nd USENIX Security Symposium (USENIX Security 23), 2023, pp. 607–624.
- [27] N. Mathews, J. K. Holland, N. Hopper, and M. Wright, "Laserbeak: Evolving website fingerprinting attacks with attention and multi-channel feature representation," IEEE Transactions on Information Forensics and Security, 2024.
- [28] Y. Xu, T. Wang, Q. Li, Q. Gong, Y. Chen, and Y. Jiang, "A multi-tab website fingerprinting attack," in Proceedings of the 34th Annual Computer Security Applications Conference, 2018, pp. 327–341.
- [29] J. Gong and T. Wang, "Zero-delay lightweight defenses against website fingerprinting," in 29th USENIX Security Symposium (USENIX Security 20), 2020, pp. 717–734.
- [30] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," Advances in Neural Information Processing Systems, vol. 30, 2017.
- [31] J. Su, M. Ahmed, Y. Lu, S. Pan, W. Bo, and Y. Liu, "RoFormer: Enhanced transformer with rotary position embedding," Neurocomputing, vol. 568, p. 127063, 2024.
- [32] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
- [33] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in International Conference on Machine Learning, PMLR, 2015, pp. 448–456.
- [34] J. L. Ba, J. R. Kiros, and G. E. Hinton, "Layer normalization," arXiv preprint arXiv:1607.06450, 2016.
- [35] C. X. Ling, J. Huang, H. Zhang et al., "AUC: A statistically consistent and more discriminating measure than accuracy," in IJCAI, vol. 3, 2003, pp. 519–524.
- [36] X. Deng, Q. Li, and K. Xu, "Robust and reliable early-stage website fingerprinting attacks via spatial-temporal distribution analysis," in Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security, 2024.
- [37] I. Loshchilov and F. Hutter, "Decoupled weight decay regularization," in International Conference on Learning Representations, 2019.