A Measurement Study of Cryptographic Misuse in Embodied AI Mobile Applications
Pith reviewed 2026-06-26 17:03 UTC · model grok-4.3
The pith
Embodied AI mobile applications exhibit widespread cryptographic misuse driven by domain-specific engineering constraints.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Through analysis of 507 real-world EAI mobile applications across six domains, the study identifies 12,975 cryptographic misuse findings with 80.74% precision. These failures result from EAI-specific constraints including latency-sensitive control paths that weaken transport protection and heavy use of offline provisioning and legacy IoT SDKs that promote credential hardcoding. Real-world cases demonstrate how such flaws allow interception of command channels and hijacking of EAI device control.
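As a rough back-of-envelope reading of these headline numbers, the short Python sketch below converts the reported precision into expected counts. It assumes the 80.74% precision measured on a reviewed sample carries over uniformly to all 12,975 findings, which the summary above does not establish; the variable names are illustrative only.

```python
# Back-of-envelope reading of the headline numbers.
# Assumption: the sampled precision of 80.74% applies uniformly
# to every one of the 12,975 reported findings.
TOTAL_FINDINGS = 12_975
PRECISION = 0.8074
NUM_APPS = 507

expected_true = round(TOTAL_FINDINGS * PRECISION)   # likely genuine misuses
expected_false = TOTAL_FINDINGS - expected_true     # likely false positives
findings_per_app = TOTAL_FINDINGS / NUM_APPS        # mean findings per app

print(expected_true, expected_false, round(findings_per_app, 1))
# prints: 10476 2499 25.6
```

Under that assumption, roughly 2,500 of the reported findings would be false positives, which is why the distribution of errors across failure modes matters for the prevalence conclusions.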
What carries the argument
The EAIAppZoo benchmark of 507 applications paired with an automated semantic-aware analysis pipeline that detects five cryptographic failure modes.
If this is right
- Latency-sensitive control paths in EAI apps systematically weaken transport-layer protections.
- Offline device provisioning leads to local hardcoding of authentication credentials.
- Legacy IoT SDKs increase the incidence of hardcoded credentials inside mobile control apps.
- Mobile applications form a fragile cryptographic trust boundary in cyber-physical EAI systems.
- Adversaries can exploit these mobile flaws to bypass nominal network protections and directly control physical EAI entities.
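To make "weakened transport protection" concrete, here is a minimal sketch using Python's standard `ssl` module. The apps in the study are mobile applications, so this is only an analogue of the pattern, not code from the paper; the helper name `weakened_checks` is hypothetical.

```python
import ssl

def weakened_checks(ctx: ssl.SSLContext) -> list[str]:
    """Return a list of transport protections that this context has disabled."""
    problems = []
    if ctx.verify_mode == ssl.CERT_NONE:
        problems.append("certificate verification disabled")
    if not ctx.check_hostname:
        problems.append("hostname verification disabled")
    return problems

# Default client context: verifies the certificate chain and the hostname.
strict_ctx = ssl.create_default_context()

# The kind of shortcut a latency- or provisioning-driven control path might take.
relaxed_ctx = ssl.create_default_context()
relaxed_ctx.check_hostname = False        # must be cleared before CERT_NONE
relaxed_ctx.verify_mode = ssl.CERT_NONE   # accepts any certificate

print(weakened_checks(strict_ctx))
print(weakened_checks(relaxed_ctx))
```

A context configured like `relaxed_ctx` still encrypts traffic but accepts any peer, which is what lets an on-path adversary intercept a command channel.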
Where Pith is reading between the lines
- Security design for EAI systems may require new patterns that reconcile real-time control requirements with cryptographic needs.
- Audits of cyber-physical systems should treat mobile control applications as a primary rather than secondary target.
- Similar measurement methods could expose comparable issues in other mobile-controlled physical domains such as industrial robotics or building automation.
- The observed trade-offs suggest value in domain-specific cryptographic libraries tuned for offline and low-latency EAI scenarios.
Load-bearing premise
The automated semantic-aware analysis pipeline accurately detects the five major cryptographic failure modes at the reported precision without substantial false positives that would alter the prevalence conclusions.
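The sketch below illustrates why this premise matters. It is a deliberately naive, hypothetical detection rule written in Python, not the paper's pipeline: it flags any string literal passed directly into a Java `SecretKeySpec` constructor. The Java sample and the pattern are invented for illustration.

```python
import re

# Hypothetical rule: a string literal turned into key bytes in place.
HARDCODED_KEY = re.compile(r'new\s+SecretKeySpec\(\s*"([^"]+)"\.getBytes\(')

# Invented Java fragment containing one real misuse, one safe call,
# and one test fixture that a purely syntactic rule cannot tell apart.
SAMPLE_JAVA = '''
SecretKeySpec k1 = new SecretKeySpec("s3cr3t-device-key".getBytes(), "AES");
SecretKeySpec k2 = new SecretKeySpec(keyFromKeystore, "AES");
SecretKeySpec k3 = new SecretKeySpec("test-only-key-123".getBytes(), "AES"); // unit-test fixture
'''

def find_hardcoded_keys(source: str) -> list[str]:
    """Return every string literal used directly as key material."""
    return HARDCODED_KEY.findall(source)

print(find_hardcoded_keys(SAMPLE_JAVA))
# prints: ['s3cr3t-device-key', 'test-only-key-123']
```

The rule reports both literals, including the test fixture. Whether such a hit counts as a true misuse depends on semantic context, which is the part of the pipeline the reported precision is supposed to validate.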
What would settle it
A manual review of several hundred randomly sampled detections that yields a true precision well below 80.74 percent or shows that the detected failures occur independently of EAI-specific constraints such as latency or offline provisioning.
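One way to judge whether such a review lands "well below" the reported figure is to put a confidence interval around the sampled precision. The Python sketch below uses the Wilson score interval; the review counts (300 sampled, 242 confirmed) are hypothetical and do not come from the paper.

```python
import math

def wilson_interval(true_positives: int, sample_size: int,
                    z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = true_positives / sample_size
    denom = 1 + z**2 / sample_size
    center = (p + z**2 / (2 * sample_size)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / sample_size + z**2 / (4 * sample_size**2)
    ) / denom
    return center - half_width, center + half_width

# Hypothetical manual review: 300 sampled findings, 242 confirmed as true misuse.
low, high = wilson_interval(242, 300)
print(f"sampled precision {242 / 300:.4f}, interval [{low:.4f}, {high:.4f}]")
```

If the reported 80.74% fell outside the interval from an independent random sample, that would be evidence against the headline precision; if it fell inside, the review would not settle the question on precision alone.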
Original abstract
Embodied AI (EAI) mobile applications are evolving from auxiliary user interfaces into active control-path components, directly linking mobile-side cryptographic security to cyber-physical trust. Despite this shift, existing security research predominantly focuses on embodied AI devices and cloud infrastructures, leaving the mobile control layer largely unexplored as a critical attack surface. To bridge this gap, we present the first large-scale measurement study of cryptographic misuse within the EAI mobile ecosystem. We construct EAIAppZoo, a benchmark of 507 real-world applications across six EAI domains, and employ an automated semantic-aware analysis pipeline to measure the prevalence and characteristics of five major cryptographic failure modes. Our measurement yields 12,975 misuse findings (with an evaluated precision of 80.74%), revealing that these cryptographic failures are driven by EAI-specific engineering constraints rather than random developer errors. We uncover structural security trade-offs: latency-sensitive control paths systematically weaken transport protection, while the heavy reliance on offline device provisioning and legacy IoT SDKs exacerbates the local hardcoding of authentication credentials. Through real-world case studies, we demonstrate how these mobile-side cryptographic flaws bypass nominal network protections, enabling adversaries to intercept command channels and hijack the physical control of EAI entities. Ultimately, our findings highlight that mobile applications have become a fragile, yet overlooked, cryptographic trust boundary in cyber-physical systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents the first large-scale measurement study of cryptographic misuse in Embodied AI (EAI) mobile applications. It constructs the EAIAppZoo benchmark containing 507 real-world apps across six EAI domains and applies an automated semantic-aware analysis pipeline to quantify five major cryptographic failure modes. The study reports 12,975 misuse findings at an evaluated precision of 80.74%, attributes the failures to EAI-specific engineering constraints (latency-sensitive control paths, offline provisioning, legacy IoT SDKs) rather than random errors, and includes case studies showing bypass of network protections leading to physical control hijacking.
Significance. If the pipeline precision and EAI-specific attribution hold, the work is significant as the first focused measurement on the mobile control layer in cyber-physical EAI systems. The scale (507 apps, 12,975 findings) and demonstration of structural trade-offs provide concrete evidence of an overlooked trust boundary. The empirical nature and real-world case studies are strengths, though validation of the detection pipeline is required to support the prevalence and attribution claims.
Major comments (1)
- [Abstract / methodology] Abstract and methodology description: the central claim of 12,975 findings and EAI-specific drivers rests on the automated semantic-aware pipeline achieving 80.74% precision. No details are provided on pipeline construction, the five failure-mode detection rules, dataset characteristics (e.g., app selection criteria, domain distribution), or the manual review process used to compute precision (sample size, stratification, false-positive analysis). This prevents verification that the reported count and attribution are not inflated by systematic over-flagging of legacy SDK patterns common in EAI.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting the need for greater methodological transparency. We will revise the manuscript to include a substantially expanded methodology section addressing all points raised, enabling verification of the pipeline, findings, and EAI-specific attribution.
Point-by-point responses
- Referee: [Abstract / methodology] Abstract and methodology description: the central claim of 12,975 findings and EAI-specific drivers rests on the automated semantic-aware pipeline achieving 80.74% precision. No details are provided on pipeline construction, the five failure-mode detection rules, dataset characteristics (e.g., app selection criteria, domain distribution), or the manual review process used to compute precision (sample size, stratification, false-positive analysis). This prevents verification that the reported count and attribution are not inflated by systematic over-flagging of legacy SDK patterns common in EAI.
Authors: We agree that the submitted manuscript provides insufficient detail on these elements. In the revised version we will add a new subsection (Section 3.2) that: (1) describes the pipeline construction, including the combination of static analysis with semantic context extraction to identify EAI control paths; (2) enumerates the five failure-mode detection rules with examples, explicitly noting how semantic checks distinguish EAI-specific misuse from generic legacy SDK patterns; (3) details dataset characteristics, including app selection criteria (Google Play search with EAI-related keywords followed by manual confirmation of device-control functionality), the six-domain distribution, and exclusion criteria; and (4) reports the manual validation protocol, including the 200-finding stratified sample (by domain and failure mode), reviewer process, and false-positive breakdown showing that flagged legacy-SDK cases were manually inspected and not over-counted. These additions will directly support the prevalence and attribution claims.
Revision: yes
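A stratified sample by domain and failure mode, as described in the response, could be drawn along the lines of the Python sketch below. This is an illustration under stated assumptions, not the authors' protocol: the record fields `domain` and `failure_mode`, the proportional allocation, and the one-per-stratum minimum are all hypothetical choices.

```python
import random
from collections import defaultdict

def stratified_sample(findings, sample_size, seed=0):
    """Draw a sample with proportional allocation per (domain, failure_mode)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for finding in findings:
        strata[(finding["domain"], finding["failure_mode"])].append(finding)

    total = len(findings)
    sample = []
    for items in strata.values():
        # Proportional share, with at least one finding from every stratum
        # and never more than the stratum contains.
        k = max(1, round(sample_size * len(items) / total))
        k = min(k, len(items))
        sample.extend(rng.sample(items, k))
    return sample
```

Because of rounding and the one-per-stratum minimum, the returned sample can differ slightly from `sample_size` when strata are uneven. Precision would then be estimated per stratum and weighted by stratum size rather than pooled.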
Circularity Check
No circularity: empirical measurement study with no derivations or fitted predictions
The paper is a large-scale empirical measurement of cryptographic misuse across 507 apps using an automated semantic-aware pipeline. It reports raw counts (12,975 findings) and an independently evaluated precision (80.74%) from manual review. No equations, parameters, predictions, or derivations exist that could reduce to inputs by construction. No self-citation chains, ansatzes, or uniqueness theorems are invoked as load-bearing steps. The central claims rest on direct pipeline outputs and external evaluation, satisfying the self-contained benchmark for score 0.