pith. machine review for the scientific record.

arxiv: 2605.10404 · v1 · submitted 2026-05-11 · 💻 cs.CV

Recognition: 2 theorem links


Position: Life-Logging Video Streams Make the Privacy-Utility Trade-off Inevitable

Liang Yue, Sijie Cheng, Tianyuan Zou, Yang Liu, Ya-Qin Zhang

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 04:00 UTC · model grok-4.3

classification 💻 cs.CV
keywords life-logging video · privacy-utility trade-off · always-on AI · smart glasses · pipeline-aware privacy · data exploitation pipeline · visual sensing · proactive agents

The pith

Life-logging video streams create an unavoidable privacy-utility trade-off for next-generation AI systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Always-on cameras in devices like smart glasses are turning continuous video recording into a standard feature that will support AI systems that perceive and respond to the world in real time. These streams reveal detailed personal information such as habits, emotions, and social connections that single images do not capture. Existing privacy tools either address only specific attacks or reduce the data's usefulness significantly, and they overlook how information moves through the full chain from capture to model training and inference. The paper concludes that this privacy-utility trade-off is therefore a core, unresolved problem for always-on AI and requires entirely new approaches that protect privacy across the entire data pipeline while preserving long-term utility. It also highlights the absence of standard ways to measure leaks and compare solutions as a barrier to progress.

Core claim

Life-logging video streams from pervasive always-on hardware form the backbone of next-generation AI systems that continuously perceive and react to the physical world. These streams expose sensitive information including behavioral patterns, emotional states, and social interactions beyond what isolated images reveal. Existing privacy protections are either attack-specific or incur substantial utility loss, and fail to consider the entire data exploitation pipeline. The authors therefore posit that the privacy-utility trade-off in life-logging video streams is a foundational challenge for next-generation AI systems that demands further investigation, and they call for novel pipeline-aware privacy-preserving designs that jointly optimize utility and privacy for long-horizon life-logging visual data.

What carries the argument

The full data exploitation pipeline, from capture through processing, storage, and use in AI models, which current privacy methods do not address as a whole.

If this is right

  • Next-generation always-on AI systems will face reduced public trust and slower adoption unless the trade-off is resolved.
  • Privacy designs must jointly optimize utility and privacy across the entire long-horizon data pipeline rather than at isolated stages.
  • Formal metrics for quantifying privacy leakage in video streams are needed to guide development.
  • Standardized benchmarks for life-logging visual data will be required to compare new pipeline-aware methods.
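The leakage-metric point above can be grounded in a toy calculation (Pith's illustration, not a construct from the paper): model each frame as a noisy observation of one hidden binary attribute and measure the attacker's advantage, 2 × accuracy − 1, as the stream lengthens. The specific parameters (a 55% per-frame signal, a majority-vote attacker) are invented for illustration.

```python
import random

def attacker_advantage(n_frames, flip_prob=0.45, trials=2000, seed=0):
    """Estimate an attacker's advantage (2 * accuracy - 1) at inferring a
    hidden binary attribute from n_frames noisy per-frame observations.

    Each frame reports the attribute correctly with probability
    1 - flip_prob; the attacker takes a majority vote over the stream.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        secret = rng.randint(0, 1)
        # Count frames whose (noisy) observation reads as 1.
        ones = sum(
            secret if rng.random() > flip_prob else 1 - secret
            for _ in range(n_frames)
        )
        if ones * 2 == n_frames:            # break ties at random
            guess = rng.randint(0, 1)
        else:
            guess = 1 if ones * 2 > n_frames else 0
        correct += guess == secret
    return 2 * correct / trials - 1

# A single frame leaks little; a long stream leaks nearly everything.
for n in (1, 10, 100, 1000):
    print(f"{n:>4} frames: advantage = {attacker_advantage(n):.2f}")
```

Even a weak per-frame signal compounds: an attribute that is near-undetectable in one frame is recovered almost surely from a thousand, which is why stream-level rather than frame-level leakage metrics would be needed.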

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Device makers may default to on-device processing only, limiting cloud-based world models and proactive agents.
  • Regulatory standards could emerge that restrict continuous visual sensing in consumer products until better protections exist.
  • New research may focus on semantic compression techniques that discard identifying details while retaining task-relevant information across time.

Load-bearing premise

Existing privacy protections cannot be extended or combined to handle continuous life-logging video without either leaving major attack vectors open or causing large drops in data utility.

What would settle it

A concrete pipeline-aware privacy method applied to real life-logging video data that maintains high downstream AI task performance while resisting a broad set of known and future attacks on the full pipeline.
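The shape of such a settling experiment can be sketched in miniature (an invented toy, not the paper's benchmark): a per-frame utility consumer and a stream-aggregating attacker read the same redacted signal, so redaction strong enough to blunt the longitudinal attack also destroys per-frame utility. All names and rates below are assumptions for illustration.

```python
import random

def frontier(redaction, n_frames=200, trials=400, seed=1):
    """Toy privacy-utility frontier at one redaction level.

    The task consumer needs each frame's task bit immediately (per-frame
    accuracy); the attacker majority-votes a secret bit over the whole
    stream. Redaction replaces both bits with coin flips at rate
    `redaction`, since secret and task content share the same pixels.
    Returns (per-frame task accuracy, stream-level attack accuracy).
    """
    rng = random.Random(seed)
    frame_hits = frames_total = attack_hits = 0
    for _ in range(trials):
        secret = rng.randint(0, 1)
        votes = 0
        for _ in range(n_frames):
            task = rng.randint(0, 1)
            t = task if rng.random() > 0.3 else 1 - task      # sensor noise
            s = secret if rng.random() > 0.3 else 1 - secret
            if rng.random() < redaction:                      # scrub the frame
                t, s = rng.randint(0, 1), rng.randint(0, 1)
            frame_hits += t == task
            frames_total += 1
            votes += s
        attack_hits += (votes * 2 > n_frames) == (secret == 1)
    return frame_hits / frames_total, attack_hits / trials

for r in (0.0, 0.5, 0.9, 1.0):
    u, a = frontier(r)
    print(f"redaction={r:.1f}  per-frame utility={u:.2f}  stream attack={a:.2f}")
```

In this toy, even 50% redaction leaves the stream-level attack near-perfect while utility has already degraded; only total redaction (and total utility loss) pushes the attacker to chance. A real benchmark would substitute actual video tasks and attacks, but the frontier it must report has this form.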

read the original abstract

With the growing prevalence of always-on hardware such as smart glasses, body cameras, and home security systems, life-logging visual sensing is becoming inevitable, forming the backbone of persistent, always-on AI systems. Meanwhile, recent advances in proactive agents and world models signal a fundamental shift from episodic, prompt-driven tools to next-generation AI systems that continuously perceive and react to the physical world. Although life-logging video streams can substantially improve utility of these promising systems, they also introduce significant privacy risks by revealing sensitive information, such as behavioral patterns, emotional states, and social interactions, beyond what isolated images expose. If unresolved, these risks may undermine public trust and hinder the sustainable development of always-on AI technologies. Existing privacy protections are either attack-specific or incur substantial utility loss, and fail to consider the entire data exploitation pipeline. We therefore posit that the privacy-utility trade-off in life-logging video streams is a foundational challenge for next-generation AI systems that demands further investigation. We call for novel pipeline-aware privacy-preserving designs that jointly optimize utility and privacy for long-horizon life-logging visual data. In parallel, formal privacy leakage metrics and standardized benchmarks remain important open directions for future research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript is a position paper arguing that always-on life-logging video streams from devices such as smart glasses, body cameras, and home security systems create substantial privacy risks (revealing behavioral patterns, emotional states, and social interactions) for next-generation AI systems based on continuous perception and world models. It asserts that existing privacy protections are attack-specific or cause substantial utility loss and ignore the full data exploitation pipeline, leading to the claim that the privacy-utility trade-off is inevitable and foundational. The paper calls for pipeline-aware privacy-preserving designs that jointly optimize utility and privacy for long-horizon visual data, plus formal leakage metrics and standardized benchmarks.

Significance. If the position holds, it identifies a timely barrier to sustainable always-on AI and could usefully steer the community toward holistic, pipeline-aware privacy methods rather than piecemeal defenses. The manuscript correctly notes the shift from episodic to persistent visual sensing and the distinctive risks of video streams over isolated images. It also usefully flags the need for standardized benchmarks as a concrete open direction.

major comments (1)
  1. [Abstract] The assertion that 'Existing privacy protections are either attack-specific or incur substantial utility loss, and fail to consider the entire data exploitation pipeline' is load-bearing for the inevitability claim, yet it is presented as a general premise without citations, concrete examples of overlooked pipeline stages, or discussion of why attack-specific methods cannot be composed into pipeline-aware solutions.
minor comments (1)
  1. The title's use of 'Inevitable' is a strong framing; the body should explicitly define what 'inevitable' means (e.g., without new research directions) to prevent misinterpretation as an absolute rather than a current-state observation.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive review, positive assessment of the position paper's timeliness, and recommendation for minor revision. We address the single major comment below.

read point-by-point responses
  1. Referee: [Abstract] The assertion that 'Existing privacy protections are either attack-specific or incur substantial utility loss, and fail to consider the entire data exploitation pipeline' is load-bearing for the inevitability claim, yet it is presented as a general premise without citations, concrete examples of overlooked pipeline stages, or discussion of why attack-specific methods cannot be composed into pipeline-aware solutions.

    Authors: We agree that this premise is central to the inevitability claim and would benefit from explicit grounding. In the revised version we will (1) add a short clause in the abstract referencing the pipeline limitation, (2) insert a new paragraph early in the introduction that supplies concrete examples of attack-specific techniques (e.g., frame-level adversarial perturbations against attribute inference, differential privacy applied only at capture, or model-level defenses against membership inference), and (3) explain why such methods do not compose into pipeline-aware solutions: each targets an isolated stage and therefore leaves downstream cumulative leakage (long-horizon behavioral pattern extraction across continuous streams and world-model training) unaddressed. Relevant citations to the visual-privacy literature will be included. These additions strengthen the position without changing its core argument. revision: yes
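The cumulative-leakage argument in point (3) can be illustrated with standard differential-privacy composition arithmetic (our worked example; neither the referee nor the rebuttal presents this calculation). Under basic sequential composition a per-release budget ε adds up linearly, and even the advanced composition bound of Dwork and Roth grows without bound in the number of releases:

```python
import math

def basic_composition(eps_per_release, n_releases):
    """Basic sequential composition: the total budget grows linearly."""
    return eps_per_release * n_releases

def advanced_composition(eps, n, delta_prime):
    """Advanced composition bound (Dwork & Roth): sublinear in n,
    but still unbounded as the stream grows."""
    return (math.sqrt(2 * n * math.log(1 / delta_prime)) * eps
            + n * eps * (math.exp(eps) - 1))

# A per-frame budget that looks strong dissolves over one day at 1 fps.
eps = 0.01
releases = 24 * 60 * 60            # 86,400 frame releases per day
print(basic_composition(eps, releases))           # ~864: no guarantee left
print(advanced_composition(eps, releases, 1e-6))  # ~24: better, still far above ~1
```

This is the sense in which a capture-stage defense that is sound per frame does not compose into a pipeline-level guarantee for continuous life-logging streams.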

Circularity Check

0 steps flagged

No significant circularity

full rationale

This is a position paper whose central claim is a call to treat the privacy-utility trade-off as foundational for next-generation AI and to pursue pipeline-aware designs. It advances no formal theorem, derivation, equations, fitted parameters, or quantitative predictions. The supporting premise about existing defenses is presented as motivation rather than a demonstrated result via self-referential construction or citation chain. No load-bearing step reduces to its own inputs by definition or self-citation, so the paper is self-contained as a non-technical advocacy piece.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper is a position statement and introduces no free parameters, axioms, or invented entities in a technical sense.

pith-pipeline@v0.9.0 · 5521 in / 954 out tokens · 72334 ms · 2026-05-12T04:00:23.470685+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

121 extracted references · 121 canonical work pages · 8 internal anchors

  1. [1]

    OpenClaw — Personal AI Assistant,

    OpenClaw, “OpenClaw — Personal AI Assistant,”https://openclaw.ai/, 2026. Official website. Accessed: 2026-04-22

  2. [2]

    Proactive Conversational AI: A Comprehensive Survey of Advancements and Opportunities,

    Deng, Y., Liao, L., Lei, W., Yang, G. H., Lam, W., and Chua, T.-S., “Proactive Conversational AI: A Comprehensive Survey of Advancements and Opportunities,”ACM Transactions on Information Systems, Vol. 43, No. 3, 2025, pp. 1–45

  3. [3]

    Towards Human-centered Proactive Conversational Agents,

    Deng, Y., Liao, L., Zheng, Z., Yang, G. H., and Chua, T.-S., “Towards Human-centered Proactive Conversational Agents,” Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024, pp. 807–818

  4. [4]

    Ask-before-Plan: Proactive Language Agents for Real-World Planning,

    Zhang, X., Deng, Y., Ren, Z., Ng, S. K., and Chua, T.-S., “Ask-before-Plan: Proactive Language Agents for Real-World Planning,”Findings of the Association for Computational Linguistics: EMNLP 2024, 2024, pp. 10836–10863

  5. [5]

    DINOv3

    Siméoni, O., Vo, H. V., Seitzer, M., Baldassarre, F., Oquab, M., Jose, C., Khalidov, V., Szafraniec, M., Yi, S., Ramamonjisoa, M., et al., “DINOv3,”arXiv preprint arXiv:2508.10104, 2025

  6. [6]

    SAM 3: Segment Anything with Concepts

    Carion, N., Gustafson, L., Hu, Y.-T., Debnath, S., Hu, R., Suris, D., Ryali, C., Alwala, K. V., Khedr, H., Huang, A., et al., “SAM 3: Segment Anything with Concepts,”arXiv preprint arXiv:2511.16719, 2025

  7. [7]

    Perception Encoder: The best visual embeddings are not at the output of the network

    Bolya, D., Huang, P.-Y., Sun, P., Cho, J. H., Madotto, A., Wei, C., Ma, T., Zhi, J., Rajasegaran, J., Rasheed, H., et al., “Perception Encoder: The best visual embeddings are not at the output of the network,” arXiv preprint arXiv:2504.13181, 2025

  8. [8]

    Self-supervised learning from images with a joint-embedding predictive architecture,

    Assran, M., Duval, Q., Misra, I., Bojanowski, P., Vincent, P., Rabbat, M., LeCun, Y., and Ballas, N., “Self-supervised learning from images with a joint-embedding predictive architecture,”Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 15619–15629

  9. [9]

    Genie 3: A New Frontier for World Models,

    Google DeepMind, “Genie 3: A New Frontier for World Models,” https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/, Aug. 2025. Google DeepMind Blog. Accessed: 2026-04-22

  10. [10]

    Ray-Ban Meta AI Glasses,

    Meta, “Ray-Ban Meta AI Glasses,”https://www.meta.com/ai-glasses/ray-ban-meta/, 2026. Accessed: 2026-04-21

  11. [11]

    RayNeo AR Smart Glasses | Official Website,

    RayNeo, “RayNeo AR Smart Glasses | Official Website,”https://www.rayneo.com/, 2026. Accessed: 2026-04-21

  12. [12]

    ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems in the Wild

    Yang, B., Xu, L., Zeng, L., Guo, Y., Jiang, S., Lu, W., Liu, K., Xiang, H., Jiang, X., Xing, G., et al., “ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems,”arXiv preprint arXiv:2512.06721, 2025

  13. [13]

    A New Look at How Android XR Will Bring Gemini to Glasses and Headsets,

    Google, “A New Look at How Android XR Will Bring Gemini to Glasses and Headsets,” https://blog.google/products-and-platforms/platforms/android/android-xr-gemini-glasses-headsets/, May 2025. Google Official Blog. Accessed: 2026-04-21

  14. [14]

    HeyCyan – Smart Glasses Companion App,

    HeyCyan, “HeyCyan – Smart Glasses Companion App,” https://heycyan.net/, 2026. Official website. Accessed: 2026-04-29

  15. [15]

    VisionClaw: Always-On AI Agents through Smart Glasses

    Liu, X., Lee, D., Gonzalez, E. J., Gonzalez-Franco, M., and Suzuki, R., “VisionClaw: Always-On AI Agents through Smart Glasses,”arXiv preprint arXiv:2604.03486, 2026

  16. [16]

    Multi-step or Direct: A Proactive Home-Assistant System Based on Commonsense Reasoning,

    Yamasaki, K., Tanaka, S., Yuguchi, A., Kawano, S., and Yoshino, K., “Multi-step or Direct: A Proactive Home-Assistant System Based on Commonsense Reasoning,”Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2025, pp. 561–572

  17. [17]

    From Image to Video: An Empirical Study of Diffusion Representations,

    Vélez, P., Polanía, L. F., Yang, Y., Zhang, C., Kabra, R., Arnab, A., and Sajjadi, M. S., “From Image to Video: An Empirical Study of Diffusion Representations,”Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 16948–16958

  18. [18]

    MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents,

    Zhou, Z., Qu, A., Wu, Z., Kim, S., Prakash, A., Rus, D., Zhao, J., Low, B. K. H., and Liang, P. P., “MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents,”First Workshop on Multi-Turn Interactions in Large Language Models, 2026

  19. [19]

    TAMEing Long Contexts in Personalization: Towards Training-Free and State-Aware MLLM Personalized Assistant,

    Hong, R., Lang, J., Zhong, T., Wang, Y., and Zhou, F., “TAMEing Long Contexts in Personalization: Towards Training-Free and State-Aware MLLM Personalized Assistant,” Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V. 1, 2026, pp. 452–463

  20. [20]

    Memex(rl): Scaling long-horizon llm agents via indexed experience memory,

    Wang, Z., Chen, H., Wang, J., and Wei, W., “Memex (RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory,”arXiv preprint arXiv:2603.04257, 2026

  21. [21]

    Vivint Smart Home Security Systems,

    Vivint, “Vivint Smart Home Security Systems,”https://www.vivint.com/, 2026. Official website. Accessed: 2026-04-29

  22. [22]

    SimpliSafe Home Security Systems,

    SimpliSafe, “SimpliSafe Home Security Systems,”https://simplisafe.com/value, 2026. Official promotional webpage. Accessed: 2026-04-29

  23. [23]

    AI Security Guard,

    Spot AI, “AI Security Guard,”https://www.spot.ai/ai-security-guard, 2026. Official product page. Accessed: 2026-04-29

  24. [24]

    Explosion Protected StreamCam,

    EarthCam, “Explosion Protected StreamCam,” https://www.earthcam.net/products/explosionprotectedstreamcam.php, 2026. Official product page. Accessed: 2026-04-29

  25. [25]

    Explosion Protected StreamCam Robotic,

    EarthCam, “Explosion Protected StreamCam Robotic,” https://www.earthcam.net/products/explosionprotectedstreamcamrobotic.php, 2026. Official product page. Accessed: 2026-04-29

  26. [26]

    4 Channel Dash Cam Collection,

    IIWEY, “4 Channel Dash Cam Collection,”https://iiwey.com/collections/4-channel-dash-cam, 2026. Official product collection page. Accessed: 2026-04-29

  27. [27]

    Neideso Official Website,

    Neideso, “Neideso Official Website,”https://www.neideso.cn/, 2026. Official website. Accessed: 2026-04-29

  28. [28]

    DJI Nano,

    DJI, “DJI Nano,”https://www.dji.com/nano, 2026. Official product page. Accessed: 2026-04-28

  29. [29]

    Looki L1,

    Looki, “Looki L1,”https://www.looki.ai/products/looki-l1, 2026. Official product page. Accessed: 2026-04-28

  30. [30]

    Body Cameras: The Complete Guide for Law Enforcement Professionals,

    Axon, “Body Cameras: The Complete Guide for Law Enforcement Professionals,”https://www.axon.com/resources/ body-cameras-complete-guide, 2026. Axon Resources. Accessed: 2026-04-28

  31. [31]

    AI Flow at the Network Edge,

    Shao, J., and Li, X., “AI Flow at the Network Edge,”IEEE Network, 2025

  32. [32]

    Deep Face Recognition: A Survey,

    Wang, M., and Deng, W., “Deep Face Recognition: A Survey,”Neurocomputing, Vol. 429, 2021, pp. 215–244

  33. [33]

    PASS: Protected Attribute Suppression System for Mitigating Bias in Face Recognition,

    Dhar, P., Gleason, J., Roy, A., Castillo, C. D., and Chellappa, R., “PASS: Protected Attribute Suppression System for Mitigating Bias in Face Recognition,”Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 15087–15096

  34. [34]

    Jain, L. C., Halici, U., Hayashi, I., Lee, S., and Tsutsui, S., Intelligent Biometric Techniques in Fingerprint and Face Recognition, Routledge, 2022

  35. [35]

    The Pervasive Blind Spot: Benchmarking VLM Inference Risks on Everyday Personal Videos,

    Zhang, S., Li, Z., Wen, C., Ma, Y., Li, S., Zhang, G., Zhang, Z., Meng, Y., Zhao, H., Yi, X., et al., “The Pervasive Blind Spot: Benchmarking VLM Inference Risks on Everyday Personal Videos,”arXiv preprint arXiv:2511.02367, 2025

  36. [36]

    Collaborative Spatiotemporal Feature Learning for Video Action Recognition,

    Li, C., Zhong, Q., Xie, D., and Pu, S., “Collaborative Spatiotemporal Feature Learning for Video Action Recognition,” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 7872–7881

  37. [37]

    Actor Conditioned Attention Maps for Video Action Detection,

    Ulutan, O., Rallapalli, S., Srivatsa, M., Torres, C., and Manjunath, B., “Actor Conditioned Attention Maps for Video Action Detection,”Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2020, pp. 527–536

  38. [38]

    Frame-by-Frame: Tracking Emotions in Videos with AI,

    Legara, J. S., “Frame-by-Frame: Tracking Emotions in Videos with AI,” Medium, 2023. URL https://medium.com/@johnsolomonlegara/frame-by-frame-tracking-emotions-in-videos-with-ai-ee31a1a05ab6, accessed: 2026-04-03

  39. [39]

    Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 (General Data Protection Regulation),

    European Parliament and Council of the European Union, “Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 (General Data Protection Regulation),”https://eur-lex.europa.eu/eli/reg/2016/679/oj,

  40. [40]

    Official Journal of the European Union, L119, pp. 1–88

  41. [41]

    California Consumer Privacy Act (CCPA),

    California State Legislature, “California Consumer Privacy Act (CCPA),” https://cppa.ca.gov/regulations/pdf/ccpa_statute.pdf, 2018. As amended, effective January 1, 2020

  42. [42]

    Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule,

    U.S. Department of Health & Human Services, “Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule,” https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html, 1996. Accessed 2026

  43. [43]

    Live Face De-Identification in Video,

    Gafni, O., Wolf, L., and Taigman, Y., “Live Face De-Identification in Video,”Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 9378–9387

  44. [44]

    Recoverable Anonymization for Pose Estimation: A Privacy-Enhancing Approach,

    Huang, W., Ni, Y., Dehaghani, A. R., Jeong, S. E., Chen, H., Liu, Y., Wen, F., and Imani, M., “Recoverable Anonymization for Pose Estimation: A Privacy-Enhancing Approach,”Proceedings of the Winter Conference on Applications of Computer Vision, 2025, pp. 5239–5249

  45. [45]

    Audio-Visual Autoencoding for Privacy-Preserving Video Streaming,

    Xu, H., Cai, Z., Takabi, D., and Li, W., “Audio-Visual Autoencoding for Privacy-Preserving Video Streaming,”IEEE Internet of Things Journal, Vol. 9, No. 3, 2021, pp. 1749–1761

  46. [46]

    Preserving Privacy and Video Quality Through Remote Physiological Signal Removal,

    Bhutani, S., Elgendi, M., and Menon, C., “Preserving Privacy and Video Quality Through Remote Physiological Signal Removal,”Communications Engineering, Vol. 4, No. 1, 2025, p. 66

  47. [47]

    Privacy-Protected Sleep Staging Using Blurred Videos,

    Wang, Q., Xia, M., Zhu, Y., Cheng, H., and Wang, W., “Privacy-Protected Sleep Staging Using Blurred Videos,” IEEE Journal of Biomedical and Health Informatics, Vol. 29, No. 12, 2025, pp. 8839–8846

  48. [48]

    Protecting Visual Secrets Using Adversarial Nets,

    Machanavajjhala, A., Landon Cox, N. P., et al., “Protecting Visual Secrets Using Adversarial Nets,”Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 25–28

  49. [49]

    I Know That Person: Generative Full Body and Face De-Identification of People in Images,

    Brkic, K., Sikiric, I., Hrkac, T., and Kalafatic, Z., “I Know That Person: Generative Full Body and Face De-Identification of People in Images,”2017 IEEE conference on computer vision and pattern recognition workshops (CVPRW), IEEE, 2017, pp. 1319–1328

  50. [50]

    Privacy-Preserving Video Analytics Through GAN-Based Face De-Identification,

    More, R., Maity, A., Kambli, G., and Ambadekar, S., “Privacy-Preserving Video Analytics Through GAN-Based Face De-Identification,”2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), IEEE, 2024, pp. 1–6

  51. [51]

    CartoonGAN: Generative Adversarial Networks for Photo Cartoonization,

    Chen, Y., Lai, Y.-K., and Liu, Y.-J., “CartoonGAN: Generative Adversarial Networks for Photo Cartoonization,” Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 9465–9474

  52. [52]

    A Steganography Algorithm Based on CycleGAN for Covert Communication in the Internet of Things,

    Meng, R., Cui, Q., Zhou, Z., Fu, Z., and Sun, X., “A Steganography Algorithm Based on CycleGAN for Covert Communication in the Internet of Things,” IEEE Access, Vol. 7, 2019, pp. 90574–90584

  53. [53]

    Differential Privacy,

    Dwork, C., “Differential Privacy,”International colloquium on automata, languages, and programming, Springer, 2006, pp. 1–12

  54. [54]

    Communication-efficient learning of deep networks from decentralized data,

    McMahan, H. B., Moore, E., Ramage, D., and y Arcas, B. A., “Federated Learning of Deep Networks Using Model Averaging,” arXiv preprint arXiv:1602.05629, Vol. 2, 2016

  55. [55]

    More effort is needed to protect pedestrian privacy in the era of AI,

    Zhang, X., and Zhao, Z., “More effort is needed to protect pedestrian privacy in the era of AI,”The Thirty-Ninth Annual Conference on Neural Information Processing Systems Position Paper Track, 2025

  56. [56]

    Side-Channel Information Leakage of Encrypted Video Stream in Video Surveillance Systems,

    Li, H., He, Y., Sun, L., Cheng, X., and Yu, J., “Side-Channel Information Leakage of Encrypted Video Stream in Video Surveillance Systems,” IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, IEEE, 2016, pp. 1–9

  57. [57]

    Deep Residual Learning for Image Recognition,

    He, K., Zhang, X., Ren, S., and Sun, J., “Deep Residual Learning for Image Recognition,” Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778

  58. [58]

    Learning Transferable Visual Models from Natural Language Supervision,

    Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al., “Learning Transferable Visual Models from Natural Language Supervision,”International conference on machine learning, PmLR, 2021, pp. 8748–8763

  59. [59]

    CLID-ReID: Exploiting Vision-Language Model for Image re-identification without Concrete Text Labels,

    Li, S., Sun, L., and Li, Q., “CLID-ReID: Exploiting Vision-Language Model for Image re-identification without Concrete Text Labels,”Proceedings of the AAAI conference on artificial intelligence, Vol. 37, 2023, pp. 1405–1413

  60. [60]

    Person Re-identification: Past, Present and Future,

    Zheng, L., Yang, Y., and Hauptmann, A. G., “Person Re-identification: Past, Present and Future,”arXiv preprint arXiv:1610.02984, 2016

  61. [61]

    A System Identification Approach for Video-based Face Recognition

    Aggarwal, G., Roy-Chowdhury, A. K., and Chellappa, R., “A System Identification Approach for Video-based Face Recognition.”ICPR (4), 2004, pp. 175–178

  62. [62]

    Video Person Re-ID: Fantastic Techniques and Where to Find Them (Student Abstract),

    Pathak, P., Eshratifar, A. E., and Gormish, M., “Video Person Re-ID: Fantastic Techniques and Where to Find Them (Student Abstract),”Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, 2020, pp. 13893–13894

  63. [63]

    Handbook of Fingerprint Recognition,

    Maltoni, D., Maio, D., Jain, A. K., and Prabhakar, S.,Handbook of Fingerprint Recognition, Springer, 2009

  64. [64]

    A Review on Iris Recognition,

    Kaur, N., and Juneja, M., “A Review on Iris Recognition,”2014 Recent Advances in Engineering and Computational Sciences (RAECS), 2014, pp. 1–5

  65. [65]

    Biometric Recognition: Challenges and Opportunities,

    Millett, L. I., and Pato, J. N., “Biometric Recognition: Challenges and Opportunities,” 2010

  66. [66]

    Gait Recognition in the Wild: A Benchmark,

    Zhu, Z., Guo, X., Yang, T., Huang, J., Deng, J., Huang, G., Du, D., Lu, J., and Zhou, J., “Gait Recognition in the Wild: A Benchmark,”Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 14789–14799

  67. [67]

    Are You Really Looking at Me? A Feature-Extraction Framework for Estimating Interpersonal Eye Gaze from Conventional Video,

    Tran, M., Sen, T., Haut, K., Ali, M. R., and Hoque, E., “Are You Really Looking at Me? A Feature-Extraction Framework for Estimating Interpersonal Eye Gaze from Conventional Video,”IEEE Transactions on Affective Computing, Vol. 13, No. 2, 2020, pp. 912–925

  68. [68]

    AI vs. Humans: Comparing Road User Intention Recognition Performance,

    Vellenga, K., Steinhauer, H. J., Falkman, G., Andersson, J., and Sjögren, A., “AI vs. Humans: Comparing Road User Intention Recognition Performance,”Transportation Research Part F: Traffic Psychology and Behaviour, Vol. 118, 2026, p. 103491

  69. [69]

    A Hybrid Algorithm for Human Interaction Recognition from Drone Videos: Experimental Analysis to Enhance Disaster Response and Rescue,

    Wang, X., Pirasteh, S., Varshosaz, M., and Fang, Z., “A Hybrid Algorithm for Human Interaction Recognition from Drone Videos: Experimental Analysis to Enhance Disaster Response and Rescue,”Geomatics, Natural Hazards and Risk, Vol. 17, No. 1, 2026, p. 2621550

  70. [70]

    Spatiotemporal Video Encoders and Zero-shot Segmentation for 3D Action Recognition and Behavior Analysis of Broiler Chickens Associated with Different Welfare Indicators and Body Weight,

    Asali, E., Li, G., Saeidifar, M., Liu, T., Oso, O. M., Mandiga, A., Bodempudi, V. U. C., and Kota, S. A. R., “Spatiotemporal Video Encoders and Zero-shot Segmentation for 3D Action Recognition and Behavior Analysis of Broiler Chickens Associated with Different Welfare Indicators and Body Weight,” Computers and Electronics in Agriculture, Vol. 241, 2026, p. 111305

  71. [71]

    Multi-Modal Multi-Action Video Recognition,

    Shi, Z., Liang, J., Li, Q., Zheng, H., Gu, Z., Dong, J., and Zheng, B., “Multi-Modal Multi-Action Video Recognition,” Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 13678–13687

  72. [72]

    Analyze Emotions in Your Videos,

    MorphCast, “Analyze Emotions in Your Videos,” n.d. URL https://www.morphcast.com/experiments/demo-analyze-emotions-in-your-videos/, accessed: 2026-04-03

  73. [73]

    Video Emotion Recognition: Analyze Video Emotions and Personality with Multimodal Emotion Analysis,

    Imentiv AI, “Video Emotion Recognition: Analyze Video Emotions and Personality with Multimodal Emotion Analysis,” n.d. URL https://imentiv.ai/product-use-cases/video-emotion-recognition/, accessed: 2026-04-03

  74. [74]

    Multimodal Video Emotion Recognition with Reliable Reasoning Priors,

    Wang, Z., Zhu, Y., Dong, G., Yi, H., Chen, F., Wang, X., and Xie, J., “Multimodal Video Emotion Recognition with Reliable Reasoning Priors,”arXiv preprint arXiv:2508.03722, 2025

  75. [75]

    Does Clip Know My Face?

    Hintersdorf, D., Struppek, L., Brack, M., Friedrich, F., Schramowski, P., and Kersting, K., “Does Clip Know My Face?” Journal of Artificial Intelligence Research, Vol. 80, 2024, pp. 1033–1062

  76. [76]

    Membership Inference Attacks Against Large Vision-Language Models,

    Li, Z., Wu, Y., Chen, Y., Tonin, F., Abad Rocamora, E., and Cevher, V., “Membership Inference Attacks Against Large Vision-Language Models,”Advances in Neural Information Processing Systems, Vol. 37, 2024, pp. 98645–98674

  77. [77]

    Membership Inference Attacks Against Vision-Language Models,

    Hu, Y., Li, Z., Liu, Z., Zhang, Y., Qin, Z., Ren, K., and Chen, C., “Membership Inference Attacks Against Vision-Language Models,”34th USENIX Security Symposium (USENIX Security 25), 2025, pp. 1589–1608

  78. [78]

    GradViT: Gradient Inversion of Vision Transformers,

    Hatamizadeh, A., Yin, H., Roth, H. R., Li, W., Kautz, J., Xu, D., and Molchanov, P., “GradViT: Gradient Inversion of Vision Transformers,” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 10021–10030

  79. [79]

    GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization,

    Fang, H., Chen, B., Wang, X., Wang, Z., and Xia, S.-T., “GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization,” Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 4967–4976

  80. [80]

    Privacy Leaks by Adversaries: Adversarial Iterations for Membership Inference Attack,

    Xue, J., Sun, Z., Ye, H., Luo, L., Chang, X., and Dai, G., “Privacy Leaks by Adversaries: Adversarial Iterations for Membership Inference Attack,” Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 40, 2026, pp. 35967–35975

Showing first 80 references.