Semantic Identification of IoT Devices from Behavioral Primitives
Pith reviewed 2026-06-27 06:52 UTC · model grok-4.3
The pith
Semantic matching of behavioral primitives from MUD profiles identifies IoT devices more reliably than exact matching under variable conditions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Semantic ACE matching preserves useful identification evidence across conditions of unseen ACEs, drifted hostnames, and partial runtime observation, while exact ACE matching degrades sharply when overlap becomes sparse. This holds on 1,023 ACE instances from public profiles and on real traffic traces, with semantic methods frequently keeping the correct device among the highest-ranked candidates.
What carries the argument
Semantic ACE matching using geometric representations derived from compact behavioral text of Access Control Entries in MUD profiles.
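One way to picture "compact behavioral text" is a short string built from the four fields the abstract names for an ACE (protocol, endpoint, direction, port). The sketch below is illustrative only: the `Ace` class, its field names, and the text template are hypothetical, and the paper's actual rendering is not specified in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ace:
    """Hypothetical, simplified view of one MUD Access Control Entry."""
    direction: str  # e.g. "from-device" or "to-device"
    protocol: str   # e.g. "tcp" or "udp"
    endpoint: str   # hostname or address the ACE permits
    port: int       # remote port


def ace_to_text(ace: Ace) -> str:
    """Render an ACE as a short string that a text-embedding model could consume."""
    return f"{ace.direction} {ace.protocol} {ace.endpoint} port {ace.port}"


example = Ace("from-device", "tcp", "api.example-vendor.com", 443)
print(ace_to_text(example))
```

Keeping one string per ACE, rather than one string per profile, is what allows each behavioral primitive to be embedded and matched on its own.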
If this is right
- Device identification remains possible with partial runtime observations where exact matches fail.
- Correct devices stay among top-ranked candidates under sparse overlap and drifted hostnames.
- Semantic matching supplies stronger evidence than exact methods during early stages of traffic observation.
- The representations remain effective after whitening calibration on both public profiles and real traces.
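Whitening calibration is not spelled out in this summary. One common form for sentence embeddings, assumed here purely for illustration, centers the ACE embeddings and rescales them so that their covariance becomes the identity. The symbols $x_i$ (a row-vector embedding), $\mu$, $\Sigma$, $U$, $\Lambda$, and $W$ are notation introduced for this sketch, not taken from the paper.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\Sigma = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^\top (x_i - \mu) = U \Lambda U^\top, \qquad
W = U \Lambda^{-1/2}, \qquad
\tilde{x}_i = (x_i - \mu)\, W
```

Under these definitions $W^\top \Sigma W = I$, so similarity scores computed on $\tilde{x}_i$ are not dominated by a few high-variance directions of the raw embedding space.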
Where Pith is reading between the lines
- The method could support real-time network policy enforcement by identifying devices from limited initial flows.
- Similar semantic techniques on behavioral primitives might extend to anomaly detection or policy violation spotting.
- Integration with dynamically updated MUD profiles could handle evolving device software versions.
Load-bearing premise
Compact behavioral text derived from ACEs produces geometric representations that preserve device-level distinctions more effectively than whole-profile embeddings.
What would settle it
A new dataset of IoT devices where semantic ACE matching ranks the correct device lower than exact matching under sparse-overlap conditions would falsify the central claim.
Original abstract
Accurate identification of IoT devices is important for security management and policy enforcement. Existing approaches typically learn device signatures from packets or flow records. These methods operate on low-level communication observations whose traffic patterns may vary across deployments, software versions, and user interactions. This paper studies device identification using Manufacturer Usage Description (MUD) profiles. MUD profiles describe device behavior using Access Control Entries (ACEs), where each ACE represents a behavioral primitive consisting of protocol, endpoint, direction, and port semantics derived from device communication policy. Our contributions are threefold. First, using 28 publicly available MUD profiles containing 1,023 ACE instances, we construct ACE-level semantic representations from compact behavioral text and analyze their geometric properties. ACE-level representations preserve device-level behavioral distinctions more effectively than whole-profile embeddings and remain effective after whitening calibration. Second, we evaluate semantic ACE matching under controlled runtime variations, including unseen ACEs, drifted hostnames, and partial runtime observation. Exact ACE matching performs well when the overlap with the canonical MUD profile remains high, but degrades sharply when the overlap becomes sparse or disappears. In contrast, semantic ACE matching preserves useful identification evidence across these conditions. Third, we evaluate the same approach on real IoT traffic traces comprising more than 800,000 observed flows. Exact overlap remains the strongest signal when stable overlap exists, while semantic ACE matching provides stronger identification evidence during the early stages of observation, frequently retains the correct device among the highest-ranked candidates, and remains effective under sparse-overlap runtime traffic.
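The contrast between exact and semantic ACE matching under a drifted hostname can be sketched with a toy example. Everything here is an assumption made for illustration: `embed` is a bag-of-tokens stand-in for a real text-embedding model, the profiles and hostnames are invented, and scoring a device by the mean best cosine similarity per observed ACE is one plausible aggregation, not necessarily the paper's.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a text-embedding model: a bag-of-tokens count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def exact_score(observed, profile):
    """Fraction of observed ACE strings that appear verbatim in the profile."""
    known = set(profile)
    return sum(ace in known for ace in observed) / len(observed)


def semantic_score(observed, profile):
    """Mean, over observed ACEs, of the best cosine similarity to any profile ACE."""
    profile_vecs = [embed(p) for p in profile]
    return sum(max(cosine(embed(o), p) for p in profile_vecs) for o in observed) / len(observed)


def rank_devices(observed, profiles, score):
    """Return (device, score) pairs sorted from best to worst match."""
    return sorted(((d, score(observed, p)) for d, p in profiles.items()),
                  key=lambda pair: pair[1], reverse=True)


# Invented profiles; the observed ACE uses a drifted hostname (cdn7 instead of cdn1).
profiles = {
    "camera": ["from-device tcp cdn1.camvendor.com port 443",
               "from-device udp ntp.pool.org port 123"],
    "plug": ["from-device tcp api.plugvendor.net port 8883",
             "from-device udp dns.local port 53"],
}
observed = ["from-device tcp cdn7.camvendor.com port 443"]

print(rank_devices(observed, profiles, exact_score))
print(rank_devices(observed, profiles, semantic_score))
```

In this toy case exact matching scores every device at zero because no observed string matches verbatim, while the similarity-based score still ranks the camera first. Both scoring functions assume a non-empty list of observed ACEs.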
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that semantic embeddings derived from compact behavioral text of Access Control Entries (ACEs) in MUD profiles enable robust IoT device identification. Using 28 public profiles (1,023 ACEs), it shows that ACE-level representations preserve device distinctions better than whole-profile embeddings, even after whitening. Controlled tests demonstrate that semantic ACE matching retains useful evidence under unseen ACEs, hostname drift, and partial observation, while exact matching degrades with sparse overlap. On more than 800,000 real flows, semantic matching helps in early-stage and sparse-overlap cases, while exact overlap remains the strongest signal when stable overlap exists.
Significance. If the results hold, this offers a meaningful advance for IoT security by shifting from variable low-level traffic signatures to standardized, policy-derived behavioral primitives that are more resilient to deployment differences. The geometric analysis supplies an explicit rationale for preferring per-ACE text representations, the controlled-variation experiments directly probe the overlap-sparsity regime, and the scale of the real-traffic corpus (800k flows) provides a concrete empirical basis. The use of external public profiles and real traces avoids circularity in the evaluation.
major comments (2)
- [Abstract] Abstract (first contribution paragraph): the claim that ACE-level representations 'preserve device-level behavioral distinctions more effectively than whole-profile embeddings' is load-bearing for the first contribution, yet the text provides no specification of the embedding model, vectorization technique, or distance metric used to produce or compare the geometric representations.
- [Abstract] Abstract (second and third contribution paragraphs) and evaluation description: the reported superiority of semantic ACE matching under unseen ACEs, drifted hostnames, partial observation, and early-stage real traffic relies on unspecified details of how embeddings are generated, how matching/ranking is performed, and what statistical tests (if any) support the preservation of identification evidence; these omissions leave the central empirical claims only partially verifiable.
minor comments (1)
- [Abstract] The abstract states '28 publicly available MUD profiles' but does not indicate how many distinct devices these profiles represent, which would improve clarity when discussing device-level identification performance.
Simulated Author's Rebuttal
We thank the referee for the constructive review and the recommendation for minor revision. The comments correctly identify that the abstract is insufficiently self-contained regarding methodological details. We address each point below and will revise the abstract (and, if needed, cross-references in the evaluation sections) to improve verifiability while preserving the paper's length and focus.
- Referee: [Abstract] Abstract (first contribution paragraph): the claim that ACE-level representations 'preserve device-level behavioral distinctions more effectively than whole-profile embeddings' is load-bearing for the first contribution, yet the text provides no specification of the embedding model, vectorization technique, or distance metric used to produce or compare the geometric representations.
  Authors: We agree that the abstract should briefly indicate the embedding approach to support the geometric claim. The full manuscript (Section 3) specifies the model, vectorization, and metric; the abstract will be revised to include a concise parenthetical reference to these choices so that the first contribution is verifiable from the abstract alone.
  Revision: yes
- Referee: [Abstract] Abstract (second and third contribution paragraphs) and evaluation description: the reported superiority of semantic ACE matching under unseen ACEs, drifted hostnames, partial observation, and early-stage real traffic relies on unspecified details of how embeddings are generated, how matching/ranking is performed, and what statistical tests (if any) support the preservation of identification evidence; these omissions leave the central empirical claims only partially verifiable.
  Authors: The evaluation sections of the manuscript describe the embedding generation, matching procedure (top-k ranking by cosine similarity), and the use of rank-based metrics with bootstrap confidence intervals. However, the abstract and any high-level evaluation summary do not restate these elements. We will revise the abstract to include a short clause on the matching method and will ensure the evaluation description explicitly names the statistical support (bootstrap intervals) so the claims are fully verifiable without requiring the reader to locate the methods section.
  Revision: yes
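The rebuttal mentions rank-based metrics with bootstrap confidence intervals. A minimal sketch of that idea, assuming the metric is a top-k hit rate over 1-based ranks of the correct device and the interval is a percentile bootstrap, might look like this. The function names, the rank values, and the resampling settings are hypothetical and are not drawn from the manuscript.

```python
import random


def top_k_hit_rate(ranks, k):
    """Fraction of trials in which the correct device is ranked within the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


def bootstrap_ci(ranks, k, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the top-k hit rate."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    stats = sorted(
        top_k_hit_rate([rng.choice(ranks) for _ in ranks], k)
        for _ in range(n_resamples)
    )
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Hypothetical 1-based ranks of the correct device across ten trials.
ranks = [1, 1, 2, 1, 3, 5, 1, 2, 8, 1]
print(top_k_hit_rate(ranks, k=3))
print(bootstrap_ci(ranks, k=3))
```

Resampling trials with replacement gives an interval around the hit rate without assuming a particular distribution, which suits small evaluation sets such as 28 profiles.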
Circularity Check
No significant circularity
The paper's central claims rest on direct empirical evaluation of ACE-level semantic embeddings against 28 external public MUD profiles (1,023 ACEs) and 800k real flows. Geometric preservation, exact vs. semantic matching under controlled variations (unseen ACEs, hostname drift, partial observation), and ranking performance are computed from these independent datasets without any fitted parameters, self-definitional equations, or load-bearing self-citations that reduce the reported results to quantities defined inside the study. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: MUD profiles describe device behavior using Access Control Entries that represent behavioral primitives consisting of protocol, endpoint, direction, and port semantics.