AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction
Pith reviewed 2026-06-26 08:39 UTC · model grok-4.3
The pith
AOHP builds an Android harness that treats AI agents as first-class OS actors to raise task completion, cut token use, and tighten security.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AOHP is an OS-level agent harness on AOSP that treats agents as first-class actors, enabling adaptive user interfaces and agent-friendly runtime environments through personalized service composition, efficient agent interfaces, and secure information flow; on challenging tasks these mechanisms deliver measurable gains in task completion, execution cost, and security compliance over standard Android.
What carries the argument
The three agent-oriented system mechanisms (personalized service composition, efficient agent interfaces, and secure information flow) that let agents interact directly with the OS as first-class actors.
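As a rough illustration of what "secure information flow" can mean in general, the sketch below uses a textbook label-based check: each value carries a sensitivity label, combining values keeps the more restrictive label, and a flow to a sink is allowed only if the sink is cleared for that label. This is not AOHP's mechanism, which this summary does not describe; `Label`, `Tagged`, `combine`, and `send_to_sink` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Label(IntEnum):
    """Hypothetical sensitivity levels, ordered from least to most sensitive."""
    PUBLIC = 0
    PRIVATE = 1


@dataclass(frozen=True)
class Tagged:
    """A value paired with the label of the data it was derived from."""
    value: str
    label: Label


def combine(a: Tagged, b: Tagged) -> Tagged:
    """Joining two values yields the more restrictive of the two labels."""
    return Tagged(a.value + b.value, max(a.label, b.label))


def send_to_sink(data: Tagged, sink_clearance: Label) -> bool:
    """Allow the flow only if the sink is cleared for the data's label."""
    return data.label <= sink_clearance


# Illustrative data only: a private contact mixed into a public greeting.
contact = Tagged("alice@example.com", Label.PRIVATE)
greeting = Tagged("Hello ", Label.PUBLIC)
message = combine(greeting, contact)
```

Under these assumptions, `message` inherits the `PRIVATE` label, so a sink cleared only for `PUBLIC` data would be refused while `greeting` alone would pass.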
If this is right
- Agents complete more tasks on the same device without extra hardware.
- Token consumption drops, lowering both latency and monetary cost of agent runs.
- Security policies are enforced more reliably during agent actions.
- Developers can reuse existing Android apps and drivers while gaining agent support.
- The open codebase supplies a shared platform for testing further agent-native primitives.
Where Pith is reading between the lines
- Future mobile OS versions could incorporate similar first-class agent support as a standard feature.
- Comparable harnesses on other platforms might reveal whether the three mechanisms generalize beyond Android.
- Longer-running agent sessions could be benchmarked to check whether the reported efficiency gains persist over time.
Load-bearing premise
The preliminary experiments on a set of challenging tasks are representative enough to establish the advantages of the three proposed mechanisms over conventional Android.
What would settle it
A broader suite of tasks on which AOHP fails to improve completion rate, token cost, or security compliance relative to unmodified Android.
Original abstract
AI agents are driving a new software paradigm, with the ability to autonomously call tools, extract information, manage memory, and complete tasks that span applications and data sources. Most existing end-user operating systems, however, are designed for application-centric workflows and offer little native support for AI agents. This mismatch limits the wider adoption of agents and leads to execution overhead and safety risks when running agents on conventional systems. While the concept of agent-native operating systems is emerging, the research community lacks an open testbed to explore the architectural primitives desired for agent-mediated interaction. We present AOHP (Android Open Harness Project), an OS-level agent harness built on the Android Open Source Project (AOSP). The core design principle of AOHP is to treat agents as first-class OS actors, enabling adaptive user interfaces and agent-friendly runtime environments. AOHP preserves the mature Android software and hardware ecosystem while introducing three agent-oriented system mechanisms: personalized service composition, efficient agent interfaces, and secure information flow. Based on preliminary experiments on challenging tasks covering key capabilities of OS agents, AOHP shows clear advantages in task completion (+21.12% completion rate), execution cost (-51.55% token cost), and security-policy compliance.
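The abstract reports "+21.12% completion rate" and "-51.55% token cost" without saying, in the text quoted here, whether these are percentage-point differences or changes relative to the baseline. The sketch below shows how the two readings differ. All input numbers are invented for illustration and are not taken from the paper; the function names are likewise hypothetical.

```python
def absolute_delta(baseline: float, treatment: float) -> float:
    """Difference in percentage points when both inputs are percentages."""
    return treatment - baseline


def relative_delta(baseline: float, treatment: float) -> float:
    """Change expressed as a percentage of the baseline value."""
    return (treatment - baseline) * 100.0 / baseline


# Hypothetical numbers, NOT taken from the paper.
baseline_completion = 50.0   # percent of tasks completed on stock Android
harness_completion = 60.0    # percent of tasks completed with the harness
baseline_tokens = 20000.0    # mean tokens per task on stock Android
harness_tokens = 10000.0     # mean tokens per task with the harness

completion_points = absolute_delta(baseline_completion, harness_completion)
completion_relative = relative_delta(baseline_completion, harness_completion)
token_relative = relative_delta(baseline_tokens, harness_tokens)
```

With these made-up inputs the same completion improvement reads as 10 percentage points or as a 20% relative gain, which is why a reader needs the baseline figures to interpret the headline deltas.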
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents AOHP, an open-source OS-level agent harness built on the Android Open Source Project (AOSP). It treats agents as first-class OS actors with three agent-oriented mechanisms: personalized service composition, efficient agent interfaces, and secure information flow. Preliminary experiments on challenging tasks show advantages in task completion rate (+21.12%), execution cost (-51.55% token cost), and security-policy compliance compared to conventional Android.
Significance. If the results hold, AOHP provides a valuable open testbed for the research community to explore architectural primitives for agent-mediated interaction in operating systems. It preserves the Android ecosystem while addressing the mismatch with AI agent workflows, potentially reducing overhead and safety risks. Its strengths are the open-source release, the explicit task descriptions, and the direct attribution of the quantitative deltas to the three proposed mechanisms.
minor comments (2)
- [Abstract] The performance deltas are stated without any mention of experimental design, number of tasks or trials, or baselines; a one-sentence summary of the evaluation setup would improve standalone readability.
- [§4, Evaluation] Confirm that the reported +21.12% and -51.55% figures are accompanied by per-task breakdowns, variance measures, and an explicit comparison to the standard Android baseline so that readers can assess representativeness.
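The per-task breakdown and variance measures requested in the second comment could be produced along the following lines. This is a minimal sketch under stated assumptions: the record layout, the category names, the system labels, and every number in `RUNS` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records: (category, system, succeeded, tokens_used).
RUNS = [
    ("search", "baseline", True, 1200),
    ("search", "baseline", False, 1800),
    ("search", "harness", True, 700),
    ("search", "harness", True, 900),
    ("messaging", "baseline", False, 2000),
    ("messaging", "baseline", True, 1600),
    ("messaging", "harness", True, 1000),
    ("messaging", "harness", False, 1400),
]


def summarize(runs):
    """Group runs by (category, system); report completion rate and token spread."""
    groups = defaultdict(list)
    for category, system, succeeded, tokens in runs:
        groups[(category, system)].append((succeeded, tokens))
    summary = {}
    for key, rows in groups.items():
        tokens = [t for _, t in rows]
        summary[key] = {
            "completion_rate": sum(ok for ok, _ in rows) / len(rows),
            "token_mean": mean(tokens),
            # Sample standard deviation needs at least two observations.
            "token_stdev": stdev(tokens) if len(tokens) > 1 else 0.0,
        }
    return summary


if __name__ == "__main__":
    for key, stats in sorted(summarize(RUNS).items()):
        print(key, stats)
```

Reporting the spread next to each mean lets a reader judge whether a headline delta is driven by a few categories or holds across the whole task set.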
Simulated Author's Rebuttal
We thank the referee for the supportive summary, recognition of AOHP's potential value as an open testbed, and recommendation of minor revision. The referee's description of the work is accurate.
Circularity Check
No significant circularity
The paper presents AOHP as an OS-level harness introducing three agent-oriented mechanisms (personalized service composition, efficient agent interfaces, secure information flow) and reports empirical advantages from preliminary experiments on tasks. No mathematical derivations, first-principles predictions, fitted parameters renamed as outputs, or self-citation chains appear. The central claims rest on direct experimental comparisons to standard Android baselines, which are externally falsifiable and not reduced to the paper's own inputs by construction. This is a standard systems/empirical contribution with no load-bearing circular steps.
Axiom & Free-Parameter Ledger
Appendix A. Benchmark Tasks
Our benchmark comprises 30 real-world mobile tasks grouped into five core capability categories plus a hybrid category that composes them, with five tasks each. Table 3 lists all tasks...