Crepe: A Mobile Screen Data Collector Using Graph Query
Pith reviewed 2026-05-23 23:48 UTC · model grok-4.3
The pith
Crepe lets researchers collect specific data from Android screens by demonstrating the target once, then uses graph queries to find it on other screens.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that augmenting mobile UI screen structures with a Graph Query technique enables flexible identification, location, and collection of specific data pieces after a user demonstrates the target data on example screens, thereby providing a practical way for researchers to obtain screen content while preserving participant privacy and agency.
What carries the argument
A Graph Query technique that augments the structure of mobile UI screens to support flexible identification, location, and collection of specific data pieces.
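The abstract does not give Crepe's query syntax or graph schema, so what follows is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes a UI hierarchy can be flattened into nodes, that the tree's parent-child links become `HAS_CHILD` edges, and that the "augmentation" adds a spatial `ABOVE` edge between elements regardless of where they sit in the tree. The names `Node`, `build_graph`, `query_text_above`, both relation labels, and the sample screen are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Edge = Tuple[int, str, int]  # (source id, relation, destination id)


@dataclass
class Node:
    """One UI element. Field names are hypothetical, not Crepe's schema."""
    node_id: int
    class_name: str
    text: str
    bounds: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
    children: List["Node"] = field(default_factory=list)


def flatten(root: Node) -> List[Node]:
    """Return the root and all descendants in document order."""
    nodes = [root]
    for child in root.children:
        nodes.extend(flatten(child))
    return nodes


def build_graph(root: Node) -> Tuple[List[Node], Set[Edge]]:
    """Turn the UI tree into a graph: tree edges plus spatial ABOVE edges."""
    nodes = flatten(root)
    edges: Set[Edge] = set()
    for parent in nodes:
        for child in parent.children:
            edges.add((parent.node_id, "HAS_CHILD", child.node_id))
    # Assumed augmentation: a is ABOVE b if a ends before b starts vertically
    # and the two overlap horizontally, independent of tree position.
    for a in nodes:
        for b in nodes:
            if a is b:
                continue
            overlaps = a.bounds[0] < b.bounds[2] and b.bounds[0] < a.bounds[2]
            if overlaps and a.bounds[3] <= b.bounds[1]:
                edges.add((a.node_id, "ABOVE", b.node_id))
    return nodes, edges


def query_text_above(nodes: List[Node], edges: Set[Edge], anchor_text: str) -> List[str]:
    """For each node whose text equals anchor_text, return the nearest text node above it."""
    by_id = {n.node_id: n for n in nodes}
    results = []
    for anchor in (n for n in nodes if n.text == anchor_text):
        candidates = [
            by_id[src] for (src, rel, dst) in edges
            if rel == "ABOVE" and dst == anchor.node_id and by_id[src].text
        ]
        if candidates:
            nearest = min(candidates, key=lambda n: anchor.bounds[1] - n.bounds[3])
            results.append(nearest.text)
    return results


# Hypothetical screen: a brand name rendered directly above a "Sponsored" label.
brand = Node(3, "TextView", "Acme Coffee", (40, 40, 600, 100))
label = Node(4, "TextView", "Sponsored", (40, 110, 400, 160))
header = Node(2, "LinearLayout", "", (0, 0, 1080, 200), [brand, label])
body = Node(5, "TextView", "Swipe up", (40, 1700, 600, 1760))
screen = Node(1, "FrameLayout", "", (0, 0, 1080, 1920), [header, body])

nodes, edges = build_graph(screen)
print(query_text_above(nodes, edges, "Sponsored"))  # prints ['Acme Coffee']
```

The query refers to the target only through its relation to a stable anchor, which is the kind of flexibility the claim depends on. Whether Crepe's actual relations and matching behave this way is not stated in the abstract.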
If this is right
- Researchers without programming skills can still gather screen-displayed data for their studies.
- Data collection remains under participant control through transparency and easy opt-out.
- Academic work can proceed independently of commercial data monopolies on mobile content.
- The open-sourced tool can be reused or adapted for additional research projects that need screen information.
Where Pith is reading between the lines
- The same demonstration-plus-query pattern might transfer to iOS or web interfaces if the underlying screen structures can be represented as graphs.
- If the queries prove stable, the method could combine with existing mobile sensing frameworks to create richer consented datasets.
- Longer-term use might reveal whether certain app categories produce persistently higher mismatch rates, pointing to needed refinements in the graph augmentation step.
Load-bearing premise
Demonstrating target data on example screens produces graph queries that reliably identify and extract the intended content across varied apps, layouts, and dynamic screen states without manual tuning or high error rates.
What would settle it
Deploy Crepe on a broad sample of apps and screen states, then measure extraction accuracy and failure rates when queries are generated solely from the initial demonstrations with no further adjustments.
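As a rough illustration of how such a measurement could be tallied, the sketch below computes accuracy, miss rate, and wrong-extraction rate from labeled trials, plus a per-app breakdown. It is an assumption-laden outline rather than a protocol from the paper: the `Trial` record, the function names, and the three-way split of outcomes are hypothetical, and the sample trials are invented placeholders, not measured results.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Trial:
    """One query run on one screen (hypothetical record layout)."""
    app: str
    expected: str             # ground-truth value labeled by a human
    extracted: Optional[str]  # None means the query matched nothing


def summarize(trials: List[Trial]) -> Dict[str, float]:
    """Split outcomes into correct, missed (no match), and wrong (bad match)."""
    if not trials:
        raise ValueError("need at least one trial")
    total = len(trials)
    correct = sum(1 for t in trials if t.extracted == t.expected)
    missed = sum(1 for t in trials if t.extracted is None)
    wrong = total - correct - missed
    return {
        "accuracy": correct / total,
        "miss_rate": missed / total,
        "wrong_rate": wrong / total,
    }


def accuracy_by_app(trials: List[Trial]) -> Dict[str, float]:
    """Per-app accuracy, to expose apps with persistently high mismatch rates."""
    grouped: Dict[str, List[Trial]] = defaultdict(list)
    for t in trials:
        grouped[t.app].append(t)
    return {app: summarize(ts)["accuracy"] for app, ts in grouped.items()}


# Invented placeholder trials, for demonstration only.
trials = [
    Trial("app_a", "Acme Coffee", "Acme Coffee"),
    Trial("app_a", "Blue Shoes", "Blue Shoes"),
    Trial("app_b", "City News", None),
    Trial("app_b", "Daily Deals", "Sponsored"),
]

print(summarize(trials))
print(accuracy_by_app(trials))
```

Separating misses from wrong extractions matters here because a silent wrong extraction contaminates a dataset, whereas a miss only leaves a gap.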
Original abstract
Collecting mobile datasets remains challenging for academic researchers due to limited data access and technical barriers. Commercial organizations often possess exclusive access to mobile data, leading to a "data monopoly" that restricts the independence of academic research. Existing open-source mobile data collection frameworks primarily focus on mobile sensing data rather than screen content, which is crucial for various research studies. We present Crepe, a no-code Android app that enables researchers to collect information displayed on screen through simple demonstrations of target data. Crepe utilizes a novel Graph Query technique which augments the structures of mobile UI screens to support flexible identification, location, and collection of specific data pieces. The tool emphasizes participants' privacy and agency by providing full transparency over collected data and allowing easy opt-out. We designed and built Crepe for research purposes only and in scenarios where researchers obtain explicit consent from participants. Code for Crepe will be open-sourced to support future academic research data collection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents Crepe, a no-code Android app that enables researchers to collect screen-displayed data through simple demonstrations of target elements. It introduces a novel Graph Query technique that augments mobile UI screen structures to support flexible identification, location, and extraction of specific data pieces, while emphasizing participant privacy, transparency, consent, and opt-out. The tool is positioned as a response to data access barriers and commercial monopolies, with plans to open-source the code for academic use.
Significance. If the Graph Query approach delivers the claimed robustness, Crepe could meaningfully expand independent academic access to mobile screen content data, supporting HCI and related studies that currently rely on limited sensing frameworks. The explicit focus on consent and open-sourcing represents a constructive contribution to research tooling.
major comments (2)
- [Abstract] Abstract and system description: the central claim that a single demonstration produces graph queries that reliably identify and extract target data across varied apps, layouts, and dynamic screen states lacks any supporting implementation details, error rates, robustness metrics, or user studies. This assumption is load-bearing for the contribution.
- [Full manuscript (system description)] No comparison or baseline is provided against existing accessibility tree selectors or tree-query methods, despite the manuscript noting that mobile UI hierarchies are trees; without this, the novelty and necessity of the graph augmentation cannot be assessed.
minor comments (1)
- [Abstract] The abstract would benefit from a concise statement of the query language syntax or augmentation invariants to allow readers to evaluate the claimed flexibility.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback. We address each major comment below and commit to revisions that strengthen the manuscript's empirical grounding and positioning of the Graph Query contribution.
- Referee: [Abstract] Abstract and system description: the central claim that a single demonstration produces graph queries that reliably identify and extract target data across varied apps, layouts, and dynamic screen states lacks any supporting implementation details, error rates, robustness metrics, or user studies. This assumption is load-bearing for the contribution.
  Authors: We agree that the current manuscript, which centers on system design and the privacy-focused no-code workflow, does not yet provide quantitative robustness metrics or user studies. In the revision we will expand the system description with concrete implementation details on graph construction from UI hierarchies, the query matching algorithm, and preliminary cross-app robustness tests. We will also moderate the abstract's claims to reflect the scope of a system paper while noting planned evaluations.
  Revision: yes
- Referee: [Full manuscript (system description)] No comparison or baseline is provided against existing accessibility tree selectors or tree-query methods, despite the manuscript noting that mobile UI hierarchies are trees; without this, the novelty and necessity of the graph augmentation cannot be assessed.
  Authors: We accept this observation. Although the manuscript explains that graph augmentation enables relations and dynamic matching beyond strict tree traversal, a side-by-side comparison is absent. We will add a dedicated subsection (or table) in Related Work that contrasts Crepe's Graph Query with standard accessibility-tree selectors and existing tree-query techniques, explicitly articulating the added flexibility for cross-layout and dynamic-screen scenarios.
  Revision: yes
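To make the requested baseline concrete, here is a small Python sketch contrasting a fixed child-index path with a relation-based lookup on two hypothetical layouts of the same content. It is an illustration only, not Crepe's method and not a result from the paper: `node`, `select_by_path`, `select_by_relation`, and both layouts are assumptions, and "the text element immediately preceding the anchor in document order" is a stand-in for whatever relations the Graph Query actually uses.

```python
from typing import Dict, List, Optional


def node(text: str = "", children: Optional[List[Dict]] = None) -> Dict:
    """Build a minimal UI element as a dict (hypothetical representation)."""
    return {"text": text, "children": children or []}


def select_by_path(root: Dict, path: List[int]) -> Optional[str]:
    """Baseline tree selector: follow fixed child indices from the root."""
    current = root
    for index in path:
        if index >= len(current["children"]):
            return None  # the recorded path no longer exists in this layout
        current = current["children"][index]
    return current["text"] or None


def select_by_relation(root: Dict, anchor_text: str) -> Optional[str]:
    """Relational selector: the text element just before the anchor in document order."""
    texts: List[str] = []

    def walk(n: Dict) -> None:
        if n["text"]:
            texts.append(n["text"])
        for child in n["children"]:
            walk(child)

    walk(root)
    if anchor_text in texts:
        position = texts.index(anchor_text)
        if position > 0:
            return texts[position - 1]
    return None


# Layout A: the demonstrated screen.
layout_a = node(children=[node(children=[node("Acme Coffee"), node("Sponsored")])])

# Layout B: same content, but a banner and an extra wrapper were inserted.
layout_b = node(children=[
    node("New"),
    node(children=[node(children=[node("Acme Coffee"), node("Sponsored")])]),
])

recorded_path = [0, 0]  # path to the brand name as seen in layout A
for name, layout in (("A", layout_a), ("B", layout_b)):
    print(name, select_by_path(layout, recorded_path), select_by_relation(layout, "Sponsored"))
```

In this toy case the fixed path breaks on layout B while the relational lookup still finds the brand name. Tree query languages with sibling or ordering axes can express similar relations, which is why a side-by-side comparison is needed to establish what the graph augmentation adds.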
Circularity Check
No circularity: system description with no derivations
The paper presents Crepe as a no-code Android app for collecting screen data via demonstrations and a graph query technique on UI structures. No equations, parameters, predictions, or derivation chains appear anywhere in the manuscript. The contribution is a practical tool description emphasizing privacy and open-sourcing, with no self-referential logic, fitted inputs renamed as outputs, or load-bearing self-citations that reduce claims to their own inputs. The work is self-contained against external benchmarks as an engineering artifact.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Mobile UI screens can be reliably represented and queried as augmented graphs for data extraction.
Forward citations
Cited by 1 Pith paper
- DroidRetriever: A Transparent and Steerable Automation System for Collaborative Mobile Information Seeking
  DroidRetriever is a transparent steerable mobile automation system that decomposes information-seeking tasks with multi-LLM agents, navigates apps, synthesizes reports with screenshots, and provides a dashboard for re...