
arxiv: 2604.03264 · v1 · submitted 2026-03-12 · 💻 cs.CV · cs.AI · cs.CR

SafeScreen: A Safety-First Screening Framework for Personalized Video Retrieval for Vulnerable Users

Pith reviewed 2026-05-15 11:20 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.CR
keywords safety-first screening · personalized video retrieval · vulnerable users · dementia care · multimodal analysis · LLM decision making · adaptive question generation

The pith

SafeScreen screens videos against each user's individual safety rules before any content is shown.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents SafeScreen as a pipeline that extracts safety criteria from a user profile, generates adaptive questions about candidate videos, analyzes them multimodally, and lets an LLM approve or reject each video in sequence. This approach treats safety compliance as a hard prerequisite rather than a post-ranking filter. Standard platforms optimize for engagement and can surface unsuitable material for children or dementia patients; SafeScreen instead produces a shortlist that already satisfies the profile constraints. If the method works, open video repositories become usable in care and education settings without requiring pre-labeled safe content or manual review for every user.

Core claim

SafeScreen retrieves and presents personalized videos by first deriving individualized safety criteria from a user profile, then performing sequential approval through adaptive question generation, multimodal VideoRAG evidence collection, and LLM-based verification of safety, appropriateness, and relevance; the result is an explainable decision for each candidate that prioritizes constraint satisfaction over engagement signals.

What carries the argument

The sequential approval pipeline that extracts profile-driven safety criteria and verifies them via adaptive question generation plus multimodal video analysis before any exposure occurs.
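The sequential approval loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Decision` type, the `ask` callback (a stand-in for question generation, VideoRAG evidence collection, and LLM verification), the toy evidence table, and the early stop once the shortlist is full are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Decision:
    video_id: str
    approved: bool
    reasons: List[str]  # criteria the evidence failed to satisfy


def screen_sequentially(
    video_ids: Iterable[str],
    criteria: List[str],
    ask: Callable[[str, str], bool],
    shortlist_size: int,
) -> List[Decision]:
    """Approve or reject candidates one at a time.

    `ask(video_id, criterion)` is a hypothetical callback that returns True
    when the collected evidence shows the video satisfies the criterion.
    Screening stops as soon as `shortlist_size` videos have been approved
    (an assumption made for this sketch).
    """
    decisions: List[Decision] = []
    approved_count = 0
    for video_id in video_ids:
        if approved_count >= shortlist_size:
            break
        failed = [c for c in criteria if not ask(video_id, c)]
        decision = Decision(video_id, approved=not failed, reasons=failed)
        decisions.append(decision)
        approved_count += int(decision.approved)
    return decisions


# Toy evidence table standing in for the multimodal analysis (hypothetical).
_TOY_EVIDENCE = {
    ("v1", "no loud noises"): False,
    ("v1", "no hospital scenes"): True,
    ("v2", "no loud noises"): True,
    ("v2", "no hospital scenes"): True,
}


def toy_ask(video_id: str, criterion: str) -> bool:
    # Missing evidence counts as a failure, so the default is to reject.
    return _TOY_EVIDENCE.get((video_id, criterion), False)


decisions = screen_sequentially(
    ["v1", "v2", "v3"],
    ["no loud noises", "no hospital scenes"],
    toy_ask,
    shortlist_size=1,
)
for d in decisions:
    print(d.video_id, "approved" if d.approved else f"rejected: {d.reasons}")
```

Treating missing evidence as a failure mirrors the safety-as-prerequisite framing: a video is shown only when every criterion is positively confirmed.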

If this is right

  • Candidate videos are approved or rejected one at a time rather than ranked by popularity or relevance.
  • The output list diverges from engagement-optimized rankings in the large majority of test cases.
  • Safety, sensibleness, and groundedness scores remain high when checked by both automated and human evaluators.
  • The method works on uncurated repositories without needing precomputed safety labels for each video.
  • The same pipeline supports different care contexts by swapping the profile criteria.
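One way to quantify the divergence mentioned in the list above is the fraction of queries whose top-k screened results differ from the engagement-ranked top-k. The sketch below assumes that definition; the source does not state how the paper computes divergence, and the query IDs and video IDs are made up for illustration.

```python
from typing import Dict, List


def divergence_rate(
    baseline: Dict[str, List[str]],
    screened: Dict[str, List[str]],
    k: int = 1,
) -> float:
    """Fraction of shared queries whose top-k lists differ (assumed definition)."""
    queries = baseline.keys() & screened.keys()
    if not queries:
        raise ValueError("baseline and screened share no queries")
    differing = sum(baseline[q][:k] != screened[q][:k] for q in queries)
    return differing / len(queries)


# Hypothetical rankings: baseline is engagement-ordered, screened is the approved shortlist.
baseline = {"q1": ["a", "b"], "q2": ["c", "d"], "q3": ["e", "f"], "q4": ["g", "h"]}
screened = {"q1": ["b"], "q2": ["c"], "q3": ["f", "e"], "q4": ["h"]}

rate = divergence_rate(baseline, screened, k=1)
print(f"top-1 divergence: {rate:.2f}")
```

A high rate under this definition says only that the two systems choose differently; it does not by itself show that the screened choice is safer.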

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework could be applied to other domains such as educational video selection for young learners if new safety criteria are defined.
  • Real-time profile updates would allow the screening decisions to adapt as a user's needs or sensitivities change over time.
  • Integration into existing platforms would shift the default from engagement-first to constraint-first retrieval for designated vulnerable accounts.

Load-bearing premise

LLM-based decisions guided by adaptive questions and multimodal analysis will catch harmful content and avoid approving unsafe videos for the specific user profile.

What would settle it

A controlled test in which domain experts review a set of videos containing subtle risks and check whether the system approves any of those videos or rejects clearly safe ones that meet the stated profile criteria.
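The controlled test described above reduces to two error rates measured against expert labels: how often unsafe videos are approved, and how often safe videos are rejected. A minimal sketch, with hypothetical labels and function name:

```python
from typing import Dict, List


def screening_error_rates(
    expert_safe: List[bool], system_approved: List[bool]
) -> Dict[str, float]:
    """Compare system decisions with expert ground truth.

    false_approval_rate: share of expert-unsafe videos the system approved.
    false_rejection_rate: share of expert-safe videos the system rejected.
    """
    if len(expert_safe) != len(system_approved):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(expert_safe, system_approved))
    unsafe = [approved for safe, approved in pairs if not safe]
    safe = [approved for is_safe, approved in pairs if is_safe]
    if not unsafe or not safe:
        raise ValueError("need at least one safe and one unsafe video")
    return {
        "false_approval_rate": sum(unsafe) / len(unsafe),
        "false_rejection_rate": sum(not a for a in safe) / len(safe),
    }


# Hypothetical expert labels and system decisions for eight videos.
expert_safe = [True, True, True, False, False, False, False, True]
system_approved = [True, False, True, False, True, False, False, True]

rates = screening_error_rates(expert_safe, system_approved)
print(rates)
```

For vulnerable users the false-approval rate is the more consequential of the two, since it counts harmful content that would actually be shown.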

Figures

Figures reproduced from arXiv: 2604.03264 by Fengpei Yuan, Madhava Kalyan Gadiputi, Wenzheng Zhao.

Figure 1: Conceptual comparison between conventional …
Figure 2: Complete SafeScreen framework overview showing the three-stage pipeline: …
Figure 3: SafeScreen deployment contexts: clinical integration (left) and systematic evaluation (right).
Original abstract

Open-domain video platforms offer rich, personalized content that could support health, caregiving, and educational applications, but their engagement-optimized recommendation algorithms can expose vulnerable users to inappropriate or harmful material. These risks are especially acute in child-directed and care settings (e.g., dementia care), where content must satisfy individualized safety constraints before being shown. We introduce SafeScreen, a safety-first video screening framework that retrieves and presents personalized video while enforcing individualized safety constraints. Rather than ranking videos by relevance or popularity, SafeScreen treats safety as a prerequisite and performs sequential approval or rejection of candidate videos through an automated pipeline. SafeScreen integrates three key components: (i) profile-driven extraction of individualized safety criteria, (ii) evidence-grounded assessments via adaptive question generation and multimodal VideoRAG analysis, and (iii) LLM-based decision-making that verifies safety, appropriateness, and relevance before content exposure. This design enables explainable, real-time screening of uncurated video repositories without relying on precomputed safety labels. We evaluate SafeScreen in a dementia-care reminiscence case study using 30 synthetic patient profiles and 90 test queries. Results demonstrate that SafeScreen prioritizes safety over engagement, diverging from YouTube's engagement-optimized rankings in 80-93% of cases, while maintaining high levels of safety coverage, sensibleness, and groundedness, as validated by both LLM-based evaluation and domain experts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces SafeScreen, a safety-first screening framework for personalized video retrieval aimed at vulnerable users (e.g., dementia care). It extracts individualized safety criteria from user profiles, performs evidence-grounded assessments via adaptive question generation and multimodal VideoRAG, and uses LLM-based decision-making to approve or reject candidate videos before exposure. Rather than optimizing for engagement, the system treats safety as a prerequisite. Evaluation on 30 synthetic patient profiles and 90 test queries reports 80-93% divergence from YouTube's engagement-optimized rankings while claiming high safety coverage, sensibleness, and groundedness, validated by LLM-as-judge metrics and domain experts.

Significance. If the core pipeline reliably enforces individualized constraints without missing harmful content, SafeScreen could enable safer deployment of open video platforms in caregiving and educational settings. The design's emphasis on explainable, real-time screening without precomputed labels is a constructive contribution, but the current evaluation's dependence on synthetic profiles and internal LLM judgments provides limited evidence that the approach generalizes to real individualized safety needs.

major comments (2)
  1. [Evaluation] Evaluation section: safety coverage, sensibleness, and groundedness are defined and scored by the same LLM pipeline used in the screening system itself, creating circularity that does not independently measure missed harmful content or false approvals on the 90 test queries.
  2. [Evaluation] Evaluation section: the headline claim of reliable individualized safety verification rests on 30 synthetic profiles and LLM/expert judgments with no real-user validation, no baseline comparisons to other safety filters, and no quantification of LLM judgment error, which is load-bearing for the assertion that the framework works in dementia-care settings.
minor comments (2)
  1. [Abstract] Abstract and Evaluation: the 80-93% divergence range should be reported with per-profile or per-query breakdowns and confidence intervals rather than as a single aggregate.
  2. [Evaluation] The manuscript should clarify the exact prompting strategy and model versions used for both the screening pipeline and the LLM-as-judge evaluation to allow reproducibility.
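The confidence intervals requested in minor comment 1 could be reported with a Wilson score interval for a proportion. The sketch below is illustrative only: the count of 72 divergent queries out of 90 is a hypothetical input chosen to show the calculation, not a figure taken from the paper.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical count: 72 of 90 queries diverge from the baseline ranking.
lo, hi = wilson_interval(72, 90)
print(f"divergence 80.0%, 95% CI [{lo:.3f}, {hi:.3f}]")
```

With only 90 queries the interval is wide, which is why a per-profile or per-query breakdown is more informative than a single aggregate range.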

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback on the evaluation of SafeScreen. We address each major comment point by point below, with revisions incorporated where feasible to improve clarity and evidence.

Point-by-point responses
  1. Referee: [Evaluation] Evaluation section: safety coverage, sensibleness, and groundedness are defined and scored by the same LLM pipeline used in the screening system itself, creating circularity that does not independently measure missed harmful content or false approvals on the 90 test queries.

    Authors: We acknowledge the risk of circularity when the same LLM pipeline contributes to both screening decisions and automated evaluation metrics. The original manuscript already includes independent validation by domain experts on a subset of the 90 queries, which we have now expanded in the revised version with a dedicated subsection detailing expert agreement rates, inter-rater reliability, and specific cases where expert review overrode or confirmed LLM outputs. This provides an external check on missed harmful content and false approvals. We have also added a limitations paragraph discussing LLM-as-judge biases. revision: partial

  2. Referee: [Evaluation] Evaluation section: the headline claim of reliable individualized safety verification rests on 30 synthetic profiles and LLM/expert judgments with no real-user validation, no baseline comparisons to other safety filters, and no quantification of LLM judgment error, which is load-bearing for the assertion that the framework works in dementia-care settings.

    Authors: We agree that reliance on 30 synthetic profiles constitutes a limitation for claims about real dementia-care deployment. The revised manuscript now includes explicit baseline comparisons against rule-based keyword filters and simple multimodal classifiers, with quantitative results showing SafeScreen's divergence and safety gains. We have also added quantification of LLM judgment error via agreement statistics with the domain experts (e.g., Cohen's kappa and disagreement cases). Real-user validation with vulnerable populations is not feasible within the scope of this work due to ethical and IRB constraints; we explicitly frame the current study as a controlled proof-of-concept and outline planned clinical trials as future work. revision: partial
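The agreement statistic mentioned in the response above, Cohen's kappa, corrects raw agreement between two raters for the agreement expected by chance. A minimal sketch, assuming one approve/reject label per video from the LLM judge and from a domain expert; the labels below are invented for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for ten videos.
llm_judge = ["approve", "approve", "reject", "reject", "approve",
             "reject", "approve", "reject", "approve", "approve"]
expert = ["approve", "reject", "reject", "reject", "approve",
          "reject", "approve", "approve", "approve", "approve"]

kappa = cohens_kappa(llm_judge, expert)
print(f"kappa = {kappa:.3f}")
```

In this toy data the raters agree on 8 of 10 videos, but kappa is lower than 0.8 because about half of that agreement would be expected by chance given how often each rater approves.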

standing simulated objections not resolved
  • Real-user validation with actual vulnerable users (e.g., dementia patients) is still missing; the authors cite ethical and regulatory requirements.

Circularity Check

1 step flagged

LLM-based evaluation of safety decisions shares the same model class as the screening pipeline, risking circular overestimation of reliability

specific steps
  1. other [Abstract (Results paragraph)]
    "Results demonstrate that SafeScreen prioritizes safety over engagement, diverging from YouTube's engagement-optimized rankings in 80-93% of cases, while maintaining high levels of safety coverage, sensibleness, and groundedness, as validated by both LLM-based evaluation and domain experts."

    Safety coverage, sensibleness, and groundedness are defined and scored via the same LLM-based decision-making and adaptive question generation used inside the SafeScreen pipeline itself; the evaluator therefore risks reproducing the pipeline's own reasoning patterns rather than providing an independent check on missed harmful content or false approvals.

full rationale

The paper's central results (80-93% divergence from YouTube plus high safety coverage/sensibleness/groundedness) rest on LLM-as-judge validation of the outputs produced by an LLM-driven pipeline (profile extraction, adaptive question generation, VideoRAG analysis, and decision-making). While divergence from YouTube rankings can be measured externally, the safety metrics are generated and scored inside the same LLM reasoning loop on synthetic profiles, creating partial circularity in the validation of individualized safety enforcement. No equations or self-citations reduce the derivation by construction, so the circularity is moderate rather than total.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework depends on the untested premise that current multimodal LLMs can perform reliable individualized safety verification; no free parameters or new entities are introduced, but the core decision step rests on domain assumptions about LLM capability.

axioms (1)
  • domain assumption: LLMs can generate accurate, grounded safety and appropriateness judgments from video content and user profiles
    Invoked in the LLM-based decision-making component and evaluation validation without external calibration or error bounds.

pith-pipeline@v0.9.0 · 5561 in / 1342 out tokens · 30008 ms · 2026-05-15T11:20:52.087722+00:00 · methodology

