pith. machine review for the scientific record.

arxiv: 2603.19791 · v2 · submitted 2026-03-20 · 💻 cs.CR

Recognition: unknown

Text-Based Personas for Simulating User Privacy Decisions

Kassem Fawaz, Ren Yi, Octavian Suciu, Rishabh Khandelwal, Hamza Harkous, Nina Taft, Marco Gruteser

classification 💻 cs.CR
keywords: privacy, personas, decisions, user, datasets, narriva, accuracy, across
read the original abstract

The ability to simulate human privacy decisions has significant implications for aligning autonomous agents with individual intent and conducting cost-effective, large-scale privacy-centric user studies. Prior approaches prompt Large Language Models (LLMs) with natural language user statements, data-sharing histories, or demographic attributes to simulate privacy decisions. These approaches, however, fail to balance individual-level accuracy, human auditability, token efficiency, and population-level representation. We present Narriva, an approach that generates text-based synthetic privacy personas to address these shortcomings. Narriva grounds persona generation in prior user privacy decisions, such as those from large-scale survey datasets, rather than purely relying on demographic stereotypes. It compresses this data into concise, human-readable summaries structured by established privacy theories. Through benchmarking across five diverse datasets, we analyze the characteristics of Narriva's synthetic personas in modeling both individual and population-level privacy preferences. We find that grounding personas in past privacy behaviors achieves up to 87% predictive accuracy, improving over a non-personalized LLM baseline by 6-17 percentage points across datasets, while yielding an 80-95% reduction in prompt tokens compared to in-context learning with raw examples. Finally, we demonstrate that personas synthesized from a single survey can reproduce the aggregate privacy behaviors and statistical distributions of entirely different studies.
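The abstract's core trade-off, replacing in-context learning over raw decision histories with a concise persona summary to cut prompt tokens, can be illustrated with a minimal sketch. Everything below is hypothetical (the function names, example data, and the whitespace-based token proxy are not from the paper); it only demonstrates why a one-sentence persona shrinks the prompt relative to dozens of raw examples.

```python
# Hypothetical sketch of persona prompting vs. in-context learning with
# raw examples. All names and data are illustrative, not from Narriva.

def build_icl_prompt(history, scenario):
    """Baseline: prepend every raw (scenario, decision) pair."""
    lines = [f"Scenario: {s} -> Decision: {d}" for s, d in history]
    return "\n".join(lines) + f"\nScenario: {scenario} -> Decision:"

def build_persona_prompt(persona_summary, scenario):
    """Persona prompt: a short, human-readable summary instead."""
    return f"User persona: {persona_summary}\nScenario: {scenario} -> Decision:"

# Illustrative data: 40 past decisions vs. a one-sentence persona.
history = [(f"App {i} requests location data", "deny" if i % 3 else "allow")
           for i in range(40)]
persona = ("Generally privacy-protective; shares location only with "
           "trusted apps and rejects advertising-related requests.")

scenario = "A fitness app requests location data"
icl = build_icl_prompt(history, scenario)
per = build_persona_prompt(persona, scenario)

# Rough token proxy (whitespace split); the paper reports an 80-95%
# prompt-token reduction for personas over raw-example ICL.
reduction = 1 - len(per.split()) / len(icl.split())
print(f"approx. prompt-token reduction: {reduction:.0%}")
```

With this toy history the persona prompt is roughly an order of magnitude shorter, consistent with the 80-95% range the abstract reports, though a real measurement would use the model's own tokenizer rather than a whitespace split.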

This paper has not been read by Pith yet.


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. PrivacySIM: Evaluating LLM Simulation of User Privacy Behavior

    cs.CR 2026-05 unverdicted novelty 6.0

PrivacySIM shows that conditioning LLMs on user personas built from demographics and attitudes improves simulation of privacy choices, but reaches only 40.4% accuracy against real responses from 1,000 users.

  2. An AI Agent Execution Environment to Safeguard User Data

    cs.CR 2026-04 unverdicted novelty 6.0

    GAAP guarantees confidentiality of private user data for AI agents by enforcing user-specified permissions deterministically through persistent information flow tracking, without trusting the agent or requiring attack...