
arxiv: 2602.17973 · v2 · pith:6DX2H746new · submitted 2026-02-20 · 💻 cs.CR · cs.AI

PenTiDef: Decentralized Federated Intrusion Detection System with Differential Privacy and Latent-Space Defense via Blockchain Coordination in IIoT

Pith reviewed 2026-05-22 10:44 UTC · model grok-4.3

keywords decentralized federated learning · intrusion detection system · differential privacy · blockchain coordination · IIoT security · poisoning attacks · latent space defense · smart contracts

The pith

PenTiDef combines differential privacy, latent-space clustering, and a permissioned blockchain into a decentralized federated intrusion detection system that is reported to resist poisoning attacks at adversary ratios up to 40 percent in IIoT networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents PenTiDef as a fully decentralized framework that lets IIoT devices collaborate on intrusion detection while protecting data privacy and filtering out malicious model updates. It adds stochastic Gaussian noise at each client for differential privacy, compresses penultimate-layer representations through an autoencoder into latent semantic representations, applies centered kernel alignment and K-means clustering to identify poisoned updates without any clean reference data, and uses a permissioned blockchain with smart contracts to handle validation and aggregation. A sympathetic reader would care because many existing federated systems depend on central servers or break down when adversaries control a large share of participants, and this unified approach could support reliable shared security across heterogeneous devices with non-uniform data.

Core claim

PenTiDef integrates three components: client-side distributed differential privacy with Gaussian noise; a latent-space defense that compresses penultimate-layer representations via an autoencoder into latent semantic representations, then applies centered kernel alignment and K-means clustering to detect malicious updates without auxiliary datasets; and a permissioned blockchain layer with smart contracts for on-chain validation and secure FedAvg aggregation. The paper reports higher detection accuracy and F1-score than the FLARE and FedCC baselines, with lower training overhead, under adversary ratios up to 40 percent on CIC-IDS2018 and Edge-IIoTSet in both IID and non-IID settings.

What carries the argument

The latent-space defense module, which extracts penultimate-layer representations, compresses them into stable latent semantic representations via an autoencoder, and applies centered kernel alignment similarity with K-means clustering to separate malicious updates from benign ones without any auxiliary clean dataset.
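
To make the CKA and clustering step concrete, here is a minimal sketch in Python. It is not the paper's implementation. It assumes each client contributes one latent matrix of shape (n, d) whose rows are aligned across clients, uses linear CKA, replaces the paper's K-means configuration with a tiny two-cluster variant, and flags the smaller cluster on the assumption that adversaries are a minority. The names `linear_cka`, `two_means`, and `flag_suspicious` and the synthetic data are hypothetical.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two (n_samples, n_features) matrices."""
    x = x - x.mean(axis=0, keepdims=True)  # center every feature column
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, "fro") ** 2
    scale = np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro")
    return float(cross / scale)


def two_means(points, iters=50):
    """Minimal 2-cluster K-means with a deterministic farthest-pair start."""
    gaps = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    i, j = np.unravel_index(np.argmax(gaps), gaps.shape)
    centers = points[[i, j]].astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        updated = centers.copy()
        for k in (0, 1):
            if np.any(labels == k):  # keep the old center if a cluster empties
                updated[k] = points[labels == k].mean(axis=0)
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels


def flag_suspicious(latents):
    """Cluster clients by CKA similarity profile and flag the smaller group."""
    m = len(latents)
    sim = np.array([[linear_cka(latents[a], latents[b]) for b in range(m)]
                    for a in range(m)])
    labels = two_means(sim)
    # Assumption: adversaries are a minority, so the smaller cluster is flagged.
    minority = int(np.sum(labels == 1) < np.sum(labels == 0))
    return [c for c in range(m) if labels[c] == minority]


# Synthetic stand-in data: six benign clients share structure, two do not.
rng = np.random.default_rng(0)
base = rng.normal(size=(64, 8))
benign = [base + 0.1 * rng.normal(size=base.shape) for _ in range(6)]
poisoned = [rng.normal(size=base.shape) for _ in range(2)]
print(flag_suspicious(benign + poisoned))
```

In this toy setup the benign clients are near-copies of one another, which is the easy case. The harder question raised later on this page, whether benign non-IID clients also look dissimilar under CKA, is exactly what such a sketch cannot answer.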

If this is right

  • Detection accuracy and F1-scores remain higher than baselines even when data distributions differ across devices.
  • Training overhead stays lower than competing methods at adversary ratios up to 40 percent.
  • Gradient leakage is limited by the addition of stochastic Gaussian noise at each client.
  • The aggregation process gains immutable auditability through on-chain smart contract records.
  • The full system operates without any central server while preserving performance in heterogeneous IIoT environments.
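
The auditability point can be illustrated with a hash-chained, append-only log. This is a deliberately simplified stand-in, not a smart contract and not the paper's blockchain layer: it only shows why chaining each aggregation record to the hash of the previous one makes later tampering detectable. The record fields (`round`, `accepted`, `rejected`) and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def record_hash(prev_hash, payload):
    """Hash a record together with the hash of the record before it."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_round(chain, payload):
    """Append one aggregation-round record to the log."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": record_hash(prev_hash, payload),
    })


def verify(chain):
    """Recompute every hash; any edited record breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        expected = record_hash(prev_hash, entry["payload"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append_round(chain, {"round": 1, "accepted": ["c1", "c2", "c3"], "rejected": ["c4"]})
append_round(chain, {"round": 2, "accepted": ["c1", "c2"], "rejected": ["c3", "c4"]})
print(verify(chain))                    # chain is intact
chain[0]["payload"]["rejected"] = []    # rewrite history
print(verify(chain))                    # tampering is detected
```

A real permissioned ledger adds replication and consensus among validators on top of this, which is where the latency concern noted further down would come from.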

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The unsupervised clustering step could be tested on other federated tasks such as anomaly detection in energy systems or predictive maintenance where reference clean data is unavailable.
  • The blockchain coordination layer may introduce measurable latency that affects real-time response in time-critical IIoT scenarios, an aspect left for future measurement.
  • If the latent-space separation proves stable, it offers a template for adding poisoning resistance to federated learning pipelines in other regulated domains like healthcare or finance.

Load-bearing premise

The latent-space defense can reliably separate malicious updates from benign ones by compressing and clustering penultimate-layer representations without any auxiliary clean dataset or labeled attack examples, and this separation remains effective under realistic non-IID data partitions.

What would settle it

Reproduce the experiments on CIC-IDS2018 under 40 percent adversaries in non-IID partitions and check whether the F1-score falls below the reported values for FLARE or FedCC or whether the K-means clustering on latent representations fails to isolate a substantial fraction of poisoned updates.
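
For the second half of that check, one way to quantify how well the filter isolates poisoned updates is to score the flagged client set against the known poisoned set in a controlled run. The sketch below is an assumption about how a replication might do this. It measures the detector's precision, recall, and F1 on client flags, which is a different quantity from the intrusion-detection F1 the paper reports. The function name and the example sets are hypothetical.

```python
def detection_scores(flagged, poisoned):
    """Precision, recall and F1 of the flagged set against ground truth."""
    flagged, poisoned = set(flagged), set(poisoned)
    tp = len(flagged & poisoned)   # poisoned clients that were flagged
    fp = len(flagged - poisoned)   # benign clients wrongly flagged
    fn = len(poisoned - flagged)   # poisoned clients that slipped through
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Ten clients, four of them poisoned (a 40 percent adversary ratio).
scores = detection_scores(flagged=[5, 6, 7, 8], poisoned=[6, 7, 8, 9])
print(scores)
```

Recall is the number to watch here: a poisoned client that slips through still reaches aggregation, while a false positive only discards one benign update.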

Original abstract

This paper proposes PenTiDef, a fully decentralized, privacy-preserving, and poisoning-resilient framework for decentralized federated IDS (DFL-IDS). PenTiDef synergistically integrates three key components: (i) client-side Distributed Differential Privacy (DDP) with stochastic Gaussian noise to protect gradient leakage, (ii) a lightweight latent-space defense module that extracts and compresses penultimate-layer representations (PLRs) into stable Latent Semantic Representations (LSRs) via AutoEncoder, followed by Centered Kernel Alignment (CKA) and K-Means clustering for robust malicious update detection without auxiliary datasets, and (iii) a permissioned blockchain layer with smart contracts that orchestrates on-chain validation, secure FedAvg aggregation, and immutable auditability, eliminating any central server. Extensive experiments on CIC-IDS2018 and Edge-IIoTSet under both IID and realistic non-IID settings, with adversary ratios up to 40%, demonstrate that PenTiDef consistently outperforms state-of-the-art baselines (FLARE and FedCC) in detection accuracy and F1-score while maintaining lower training overhead. By jointly addressing privacy, robustness, and decentralization in a unified secure aggregation protocol, PenTiDef provides a practical and scalable solution for trustworthy collaborative intrusion detection in heterogeneous, adversarial IIoT environments.
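
The aggregation step named in the abstract is FedAvg, which averages client parameters weighted by local sample counts. A minimal sketch of that average, restricted to the clients that passed validation, might look as follows. The flat-list parameter layout, the `accepted` index list, and the example numbers are assumptions for illustration. Nothing here reflects how the paper's smart contracts actually execute the aggregation.

```python
def fedavg(updates, sizes, accepted):
    """Sample-count-weighted average of the accepted clients' parameters."""
    total = sum(sizes[c] for c in accepted)
    if total <= 0:
        raise ValueError("no accepted clients with data to aggregate")
    dim = len(updates[accepted[0]])
    return [
        sum(updates[c][i] * sizes[c] for c in accepted) / total
        for i in range(dim)
    ]


# Three clients; the third was rejected by the defense and is left out.
updates = [[1.0, 2.0], [3.0, 4.0], [100.0, -100.0]]
sizes = [100, 300, 200]
print(fedavg(updates, sizes, accepted=[0, 1]))
```

Because the rejected client never enters the weighted sum, its outlying parameters have no influence on the global model in this sketch.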

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes PenTiDef, a fully decentralized federated intrusion detection system for IIoT environments. It combines client-side distributed differential privacy via Gaussian noise, a latent-space defense module that compresses penultimate-layer representations (PLRs) into Latent Semantic Representations (LSRs) using an AutoEncoder, applies Centered Kernel Alignment (CKA) similarity and K-Means clustering to detect malicious updates without auxiliary clean datasets, and a permissioned blockchain layer with smart contracts for on-chain validation and secure FedAvg aggregation. Experiments on CIC-IDS2018 and Edge-IIoTSet under IID and non-IID settings with up to 40% adversaries claim consistent outperformance over FLARE and FedCC in detection accuracy and F1-score with lower training overhead.

Significance. If the latent-space defense module can reliably isolate poisoning attacks from benign non-IID heterogeneity without reference data or labels, the work would represent a meaningful step toward practical, serverless, privacy-preserving, and poisoning-resilient federated IDS in adversarial IIoT settings. The integration of DDP, blockchain coordination, and the claimed robustness under high adversary ratios addresses a relevant gap, though the absence of detailed numerical results, ablations, and hyperparameter specifications in the provided description limits immediate assessment of impact.

major comments (1)
  1. [Latent-space defense module description and experimental setup] The central robustness claim rests on the latent-space defense (AutoEncoder compression of PLRs into LSRs, followed by CKA similarity and K-Means clustering) separating malicious updates from benign non-IID variations without any auxiliary clean dataset or labeled examples. Non-IID client data partitions naturally induce divergent PLRs; the manuscript does not provide a concrete argument or empirical demonstration that CKA distances or cluster assignments will systematically treat these benign divergences as distinct from malicious perturbations. This separation is load-bearing for the reported accuracy and F1 gains under 40% adversaries on both datasets.
minor comments (2)
  1. The abstract states clear outperformance but supplies no numerical tables, specific hyperparameter values (e.g., Gaussian noise variance, K-Means cluster count, CKA threshold), or ablation results; these details should appear in the main text or supplementary material to allow verification of the claimed gains.
  2. Notation for penultimate-layer representations (PLRs) and Latent Semantic Representations (LSRs) should be introduced with explicit equations or pseudocode to clarify the compression and similarity computation steps.
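
As an indication of what such equations could look like, the standard definition of CKA from Kornblith et al. (2019) is given below. The symbols are assumptions, not the paper's notation: $Z_i \in \mathbb{R}^{n \times d}$ stands for client $i$'s LSRs arranged as $n$ rows, and how the paper aligns those rows across clients is not stated in the material summarized here. The final equality holds for the linear kernel only.

```latex
% Assumed notation: Z_i is the n-by-d matrix of client i's LSRs.
K_i = Z_i Z_i^{\top}, \qquad
H = I_n - \tfrac{1}{n}\,\mathbf{1}\mathbf{1}^{\top}

\mathrm{HSIC}(K_i, K_j) = \frac{1}{(n-1)^2}\,
  \operatorname{tr}\!\left(K_i H K_j H\right)

\mathrm{CKA}(K_i, K_j)
  = \frac{\mathrm{HSIC}(K_i, K_j)}
         {\sqrt{\mathrm{HSIC}(K_i, K_i)\,\mathrm{HSIC}(K_j, K_j)}}
  = \frac{\lVert Z_j^{\top} H Z_i \rVert_F^{2}}
         {\lVert Z_i^{\top} H Z_i \rVert_F \,\lVert Z_j^{\top} H Z_j \rVert_F}
```

CKA defined this way is invariant to isotropic scaling and orthogonal transformation of either representation, which is relevant to the question of whether it can tell benign heterogeneity apart from poisoning.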

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive review of our manuscript on PenTiDef. The feedback on the latent-space defense module is particularly valuable, and we provide a point-by-point response below while committing to enhancements in the revised version.

point-by-point responses
  1. Referee: [Latent-space defense module description and experimental setup] The central robustness claim rests on the latent-space defense (AutoEncoder compression of PLRs into LSRs, followed by CKA similarity and K-Means clustering) separating malicious updates from benign non-IID variations without any auxiliary clean dataset or labeled examples. Non-IID client data partitions naturally induce divergent PLRs; the manuscript does not provide a concrete argument or empirical demonstration that CKA distances or cluster assignments will systematically treat these benign divergences as distinct from malicious perturbations. This separation is load-bearing for the reported accuracy and F1 gains under 40% adversaries on both datasets.

    Authors: We appreciate the referee's concern regarding the robustness of the latent-space defense against benign non-IID variations. In the manuscript, we posit that the AutoEncoder learns a latent space that encodes semantic similarities in the update representations, making it less sensitive to the distributional shifts caused by non-IID data. The CKA metric, being a centered kernel-based similarity measure, further normalizes for such variations by focusing on the alignment of representations rather than absolute differences. K-Means clustering then identifies the consensus group of benign updates, with malicious ones appearing as anomalies due to their adversarial nature. While the current version includes experimental validation under non-IID conditions showing maintained performance, we acknowledge that a more explicit comparison or visualization of benign vs. malicious clusters could strengthen the presentation. Therefore, we will revise the manuscript to include a more detailed theoretical justification and additional experimental ablations illustrating the separation efficacy.

    revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on external empirical comparisons to FLARE and FedCC rather than self-defined fits or definitions.

full rationale

The paper presents PenTiDef as integrating DDP, a latent-space defense (AE on PLRs + CKA + K-Means without auxiliary data), and blockchain coordination. Central performance claims (higher accuracy/F1 vs. baselines under 40% adversaries on CIC-IDS2018/Edge-IIoTSet in IID/non-IID) are framed as experimental outcomes, not derivations that reduce to their own inputs. No equations or steps are shown to equate a 'prediction' to a fitted parameter by construction, nor does any load-bearing premise collapse to a self-citation chain. The defense module is described as a novel construction that operates without clean references, but its validity is asserted via reported results rather than tautological redefinition. This yields a low circularity score consistent with papers whose contributions are primarily empirical.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

Framework depends on standard federated-learning assumptions plus one key domain assumption about the discriminative power of compressed penultimate representations.

free parameters (2)
  • Gaussian noise variance in DDP
    Scale of stochastic noise added to gradients; must be chosen to meet privacy budget while preserving utility.
  • K-Means hyperparameters and CKA threshold
    Cluster count, distance metric, and similarity cutoff used to label updates as malicious.
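
To show how the first of these parameters trades privacy against utility, the sketch below applies the classical Gaussian mechanism to a single client update: clip to an L2 bound, then add noise with $\sigma = \Delta \sqrt{2 \ln(1.25/\delta)} / \varepsilon$, a calibration that is valid for $0 < \varepsilon < 1$. This is the textbook per-client mechanism, not the paper's DDP calibration, which the material summarized here does not specify. The clip norm, the privacy budget, and the function names are hypothetical, and the sensitivity is taken to equal the clip norm.

```python
import math

import numpy as np


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the classical Gaussian mechanism (needs 0 < epsilon < 1)."""
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("classical bound needs 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def clip_l2(update, clip_norm):
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = float(np.linalg.norm(update))
    return update * min(1.0, clip_norm / max(norm, 1e-12))


def privatize(update, clip_norm, epsilon, delta, rng):
    """Clip, then add zero-mean Gaussian noise to every coordinate."""
    # Assumption: sensitivity equals the clip norm (add/remove one update).
    sigma = gaussian_sigma(epsilon, delta, sensitivity=clip_norm)
    return clip_l2(update, clip_norm) + rng.normal(0.0, sigma, size=update.shape)


rng = np.random.default_rng(0)
update = np.array([3.0, 4.0])
print(gaussian_sigma(epsilon=0.5, delta=1e-5, sensitivity=1.0))
print(privatize(update, clip_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng))
```

At this budget the noise scale is several times larger than the clipped update itself, which is why the choice of variance matters so much for the accuracy claims.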
axioms (1)
  • domain assumption: Penultimate-layer representations contain sufficient semantic information to distinguish poisoned from clean updates via CKA and clustering without auxiliary data.
    Central premise of the latent-space defense module described in the abstract.



