pith. sign in

arxiv: 2606.06784 · v1 · pith:DKFLCWDKnew · submitted 2026-06-05 · 💻 cs.CR · cs.AI· cs.CY

What Your Posts Reveal: A Benchmark and Agentic Framework for User-Level Privacy Leakage on Social Media

Pith reviewed 2026-06-27 22:06 UTC · model grok-4.3

classification 💻 cs.CR cs.AIcs.CY
keywords privacy leakagesocial mediabenchmarkagentic frameworkmultimodal inferenceprivacy exposure scorecumulative leakageabductive reasoning
0
0 comments X

The pith

Argus aggregates weak cues across social media posts into privacy profiles via hypothesis formation and verification, scoring 0.55 PES for a 25% gain over baselines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that private details such as home location or daily routines can leak from public posts when harmless cues in text, images, and metadata combine across multiple entries. It supplies SopriBench, a benchmark of 50 user profiles and 1,569 images drawn from observed leakage patterns, plus the PES metric that scores exposure by weighing detail level against contextual sensitivity. Argus, a training-free agentic system, builds and checks hypotheses from accumulating evidence then merges cross-post signals into unified profiles. This yields the reported 0.55 PES score and the largest lift on cases that require linking information from separate posts.

Core claim

Public social media posts can reveal private information through weak cues scattered across text, images, or metadata, with leakage that is often cumulative and cross-post. The authors create SopriBench, a synthetic benchmark guided by patterns from a private reference corpus of Rednote and Instagram accounts that covers 50 user profiles and 1,569 images annotated for attributes, sensitivity, granularity, leakage type, inference difficulty, and evidence. They define the Privacy Exposure Score to weight value granularity by contextual sensitivity. Inspired by abductive reasoning, Argus forms hypotheses from accumulated evidence, verifies supporting evidence, and aggregates cross-post cues int

What carries the argument

Argus, the training-free agentic framework that forms hypotheses from accumulated evidence, verifies supporting evidence, and aggregates cross-post cues into privacy profiles.

If this is right

  • Cross-post cue aggregation produces measurably higher privacy exposure scores than single-post analysis.
  • PES supplies a graded alternative to binary accuracy by incorporating both granularity and contextual sensitivity.
  • Synthetic benchmarks constructed from observed real-world patterns can serve as a standardized testbed for multimodal leakage methods.
  • Training-free agentic reasoning can outperform conventional baselines on tasks that require chaining evidence across posts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Platforms could integrate similar hypothesis-verification agents to surface cumulative exposure warnings to users before additional posts are made.
  • The identified leakage patterns suggest targeted user guidance on how combinations of posts interact rather than warnings about isolated items.
  • Extending the benchmark and framework to other messaging or photo-sharing services would test whether the same cross-post mechanisms dominate elsewhere.

Load-bearing premise

The leakage patterns abstracted from the private reference corpus of Rednote and Instagram accounts are representative of general user-level multimodal privacy leakage.

What would settle it

Running Argus on a fresh collection of real accounts drawn from platforms or user populations outside the original Rednote and Instagram reference corpus and finding no 25% PES gain or performance below the reported baseline would falsify the central performance claim.

Figures

Figures reproduced from arXiv: 2606.06784 by Aiwen Lu, Jiaheng Wei, Jingyi Zheng, Peixian Zhang, Qiming Ye, Xinlei He, Xuechao Wang, Yini Huang, Yule Liu, Zifan Peng.

Figure 1
Figure 1. Figure 1: Overview of SopriBench construction. A private real-user corpus is abstracted into de-identified leakage patterns, which guide [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overview of Argus. The system skims public posts [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Example of SingleAgent failure modes on a synthetic [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Distribution of individual weighted accuracy scores [PITH_FULL_IMAGE:figures/full_fig_p016_4.png] view at source ↗
read the original abstract

Public social media posts can reveal private information through weak cues scattered across text, images, or metadata. Such leakage is often cumulative and cross-post: cues that appear harmless in isolation may jointly expose a user's home, workplace, or routine. However, current research lacks a unified benchmark for user-level multimodal privacy leakage and an evaluation metric that captures exposure severity beyond binary accuracy. To address these gaps, we propose SopriBench, a synthetic benchmark guided by leakage patterns abstracted from a private reference corpus of Rednote and Instagram accounts, covering 50 user profiles and 1,569 images with attributes, contextual sensitivity, granularity, leakage type, inference difficulty, and supporting evidence. We further introduce the Privacy Exposure Score (PES), which weights value granularity by contextual sensitivity. Inspired by abductive reasoning, we introduce Argus, a training-free agentic framework for cumulative leakage inference. Argus forms hypotheses from accumulated evidence, verifies supporting evidence, and aggregates cross-post cues into privacy profiles, achieving 0.55 PES, a 25% improvement over the strongest baseline, with the largest gain on cross-post leakage.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that public social media posts can cumulatively leak private information via weak multimodal cues, addresses the lack of unified benchmarks and metrics by introducing SopriBench (a synthetic benchmark of 50 profiles and 1,569 images derived from leakage patterns in a private Rednote/Instagram corpus, annotated with attributes, sensitivity, granularity, leakage type, difficulty, and evidence) and the Privacy Exposure Score (PES) metric that weights granularity by contextual sensitivity, and presents Argus, a training-free abductive agentic framework that forms/verifies hypotheses from accumulated cross-post evidence to produce privacy profiles, reporting 0.55 PES (25% above the strongest baseline) with largest gains on cross-post leakage.

Significance. If the benchmark construction and evaluation hold, the work would supply a needed unified testbed and severity-aware metric for user-level multimodal privacy leakage research while showing that agentic abductive reasoning can outperform standard baselines on cumulative inference; the training-free design and focus on cross-post aggregation are practical strengths for privacy analysis.

major comments (2)
  1. [SopriBench construction] SopriBench construction (abstract and methods): the central performance claim (0.55 PES, +25% over baseline) rests on a synthetic benchmark generated by abstracting patterns from a non-public reference corpus, yet no section details the abstraction procedure, inter-annotator validation for the 50 profiles/1,569 images, or any quantitative check that the resulting attribute/sensitivity/granularity distributions match observable frequencies on public data; without this, the reported improvement risks being an artifact of seeding rather than evidence of generalization.
  2. [Evaluation and results] Evaluation and results: the 25% PES gain and 'largest gain on cross-post leakage' are stated without reported variance, number of runs, statistical significance tests, or ablation isolating the contribution of hypothesis verification versus simple aggregation; this makes it impossible to determine whether the improvement is robust or load-bearing for the agentic framework claim.
minor comments (2)
  1. [PES metric] PES definition: the weighting of granularity by contextual sensitivity is introduced but the exact formula, normalization, and handling of missing sensitivity labels are not shown in the abstract; a concrete equation or pseudocode would aid reproducibility.
  2. [Baselines] Baseline comparison: the 'strongest baseline' is not named or described in the provided abstract, preventing assessment of whether the comparison is fair (e.g., whether baselines also operate on the same multimodal, cross-post setting).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on SopriBench construction and evaluation. We address each major comment below and commit to revisions that strengthen the manuscript without misrepresenting the current work.

read point-by-point responses
  1. Referee: [SopriBench construction] SopriBench construction (abstract and methods): the central performance claim (0.55 PES, +25% over baseline) rests on a synthetic benchmark generated by abstracting patterns from a non-public reference corpus, yet no section details the abstraction procedure, inter-annotator validation for the 50 profiles/1,569 images, or any quantitative check that the resulting attribute/sensitivity/granularity distributions match observable frequencies on public data; without this, the reported improvement risks being an artifact of seeding rather than evidence of generalization.

    Authors: We agree the current manuscript lacks a dedicated methods subsection on the abstraction procedure, inter-annotator agreement statistics, and distribution comparisons. In revision we will add this material, describing the pattern identification steps from the private corpus, reporting Cohen's kappa or equivalent for the 50 profiles and 1,569 images, and providing any feasible quantitative alignment checks against publicly observable frequencies. Because the reference corpus is private, full replication data cannot be released, but the added section will maximize methodological transparency. revision: yes

  2. Referee: [Evaluation and results] Evaluation and results: the 25% PES gain and 'largest gain on cross-post leakage' are stated without reported variance, number of runs, statistical significance tests, or ablation isolating the contribution of hypothesis verification versus simple aggregation; this makes it impossible to determine whether the improvement is robust or load-bearing for the agentic framework claim.

    Authors: We acknowledge the absence of variance estimates, run counts, significance testing, and targeted ablations in the reported results. The revised manuscript will include multiple independent runs with standard deviations, paired statistical tests (e.g., Wilcoxon or t-tests) on the PES differences, and an ablation that isolates the hypothesis-verification component of Argus from baseline aggregation. These additions will directly address robustness and the contribution of the agentic design. revision: yes

Circularity Check

0 steps flagged

No significant circularity; new benchmark and agentic framework are independently evaluated

full rationale

The paper constructs SopriBench from leakage patterns in a private reference corpus and evaluates the training-free Argus framework on it, reporting an empirical PES of 0.55 with 25% improvement over baselines. No equations, derivations, or load-bearing steps reduce by construction to fitted parameters, self-definitions, or self-citation chains. The central claims rest on comparisons against baselines within the newly created benchmark rather than prior fitted quantities, satisfying the criteria for a self-contained evaluation with no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 3 invented entities

Central contributions rest on a private reference corpus for pattern abstraction and on the new PES weighting scheme; no external benchmarks or machine-checked components are referenced.

invented entities (3)
  • SopriBench no independent evidence
    purpose: Synthetic benchmark dataset covering 50 profiles and 1,569 images with leakage attributes
    Constructed for this work from abstracted patterns of a private corpus
  • Argus no independent evidence
    purpose: Training-free agentic framework using abductive reasoning for cumulative inference
    Introduced in this paper as the primary inference method
  • Privacy Exposure Score (PES) no independent evidence
    purpose: Metric that weights value granularity by contextual sensitivity
    New evaluation metric proposed to replace binary accuracy

pith-pipeline@v0.9.1-grok · 5766 in / 1461 out tokens · 32391 ms · 2026-06-27T22:06:32.209542+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

50 extracted references · 3 linked inside Pith

  1. [1]

    No place to hide: Inadvertent location privacy leaks on twitter,

    J. Rusert, O. Khalid, D. Hong, Z. Shafiq, and P. Srini- vasan, “No place to hide: Inadvertent location privacy leaks on twitter,”Proc. Priv. Enhancing Technol., vol. 2019, no. 4, pp. 172–189, 2019. 1

  2. [2]

    Beware of what you share: Inferring home location in social networks,

    T. Pontes, G. Magno, M. A. Vasconcelos, A. Gupta, J. M. Almeida, P. Kumaraguru, and V . A. F. Almeida, “Beware of what you share: Inferring home location in social networks,” in12th IEEE International Conference on Data Mining Workshops, ICDM Workshops, Brussels, 7 Belgium, December 10, 2012. IEEE Computer Society, 2012, pp. 571–578. 1

  3. [3]

    Please forget where I was last summer: The privacy risks of public location (meta)data,

    K. Drakonakis, P. Ilia, S. Ioannidis, and J. Polakis, “Please forget where I was last summer: The privacy risks of public location (meta)data,” in26th Annual Network and Distributed System Security Symposium, NDSS. The Internet Society, 2019. 1

  4. [4]

    Stalker attacks Japanese pop singer – after tracking her down using reflection in her eyes,

    T. Claburn, “Stalker attacks Japanese pop singer – after tracking her down using reflection in her eyes,” The Register, 2019, october 10, 2019. 1

  5. [5]

    Anonymous social media app Yik Yak exposed users’ precise locations,

    L. Franceschi-Bicchierai, “Anonymous social media app Yik Yak exposed users’ precise locations,” VICE, 2022, may 12, 2022. [Online]. Available: https://www.vice.com/en/article/anonymous-social- media-app-yik-yak-exposed-users-precise-locations/ 1

  6. [6]

    IT manager has his bikes stolen after cycling app reveals his home address,

    G. Cluley, “IT manager has his bikes stolen after cycling app reveals his home address,” WeLiveSecurity, 2015, december 22, 2015. 1

  7. [7]

    Exploring and detecting self-disclosure in multi-modal posts on chinese social media,

    J. Luo, M. Liu, A. Huo, F. Hu, G. Li, and W. Peng, “Exploring and detecting self-disclosure in multi-modal posts on chinese social media,” inFindings of the Asso- ciation for Computational Linguistics: EMNLP 2025, Suzhou, China, November 4-9, 2025. Association for Computational Linguistics, 2025, pp. 21 510–21 527. 1, 2, 6

  8. [8]

    Private attribute inference from images with vision- language models,

    B. Tömekçe, M. Vero, R. Staab, and M. T. Vechev, “Private attribute inference from images with vision- language models,” inAdvances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10 - 15, 2024, 2024. 1, 2

  9. [9]

    The eye of Sherlock Holmes: Uncovering user private attribute profiling via vision-language model agentic framework,

    F. Liu, Y . Zhang, X. Huang, Y . Peng, X. Li, L. Wang, Y . Shen, R. Duan, S. Qin, X. Jia, Q. Wen, and W. Dong, “The eye of Sherlock Holmes: Uncovering user private attribute profiling via vision-language model agentic framework,” inProceedings of the 33rd ACM Interna- tional Conference on Multimedia. ACM, 2025, pp. 4875–4883. 1, 2, 6

  10. [10]

    Xiaohongshu: Your lifestyle interest community,

    Xiaohongshu, “Xiaohongshu: Your lifestyle interest community,” 2026, accessed: 2026-05-25. [Online]. Available: https://www.xiaohongshu.com/ 1

  11. [11]

    Instagram,

    Instagram, “Instagram,” 2026, accessed: 2026-05-25. [Online]. Available: https://about.instagram.com/ 1

  12. [12]

    The inference to the best explanation,

    G. H. Harman, “The inference to the best explanation,” The Philosophical Review, vol. 74, no. 1, pp. 88–95,

  13. [13]

    Detective reasoning in criminal investigation: Integrating abduction, retro- duction, deduction, and induction into the national de- cision model (NDM),

    M. Jang, J. Sebire, and G. Stubbs, “Detective reasoning in criminal investigation: Integrating abduction, retro- duction, deduction, and induction into the national de- cision model (NDM),”International Journal of Police Science & Management, vol. 27, pp. 290–302, 2025. 2, 4

  14. [14]

    Be- yond memorization: Violating privacy via inference with large language models,

    R. Staab, M. Vero, M. Balunovic, and M. T. Vechev, “Be- yond memorization: Violating privacy via inference with large language models,” inThe Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net,

  15. [15]

    Large-scale online deanonymization with llms,

    S. Lermen, D. Paleka, J. Swanson, M. Aerni, N. Carlini, and F. Tramèr, “Large-scale online deanonymization with llms,”CoRR, vol. abs/2602.16800, 2026. 2

  16. [16]

    Automated profile inference with language model agents,

    Y . Du, Z. Li, B. Ding, Y . Li, H. Xiao, J. Zhou, and N. Li, “Automated profile inference with language model agents,”CoRR, vol. abs/2505.12402, 2025. 2

  17. [17]

    Auditing data member- ship in reinforcement learning with verifiable rewards,

    Y . Liu, H. Zhang, J. Zheng, Z. Sun, Z. Peng, J. Wei, T. Cong, Y . Yang, and X. He, “Auditing data member- ship in reinforcement learning with verifiable rewards,”

  18. [18]

    Client-side gradient inversion against federated learning from poisoning,

    J. Wei, Y . Zhang, L. Y . Zhang, C. Chen, S. Pan, K.- L. Ong, J. Zhang, and Y . Xiang, “Client-side gradient inversion against federated learning from poisoning,” CoRR, vol. abs/2309.07415, 2023. 2

  19. [19]

    CHASM: Unveiling covert ad- vertisements on chinese social media,

    J. Zheng, T. Hu, Y . Liu, Z. Sun, Z. Zhang, Z. Peng, W. Dong, and X. He, “CHASM: Unveiling covert ad- vertisements on chinese social media,” inAdvances in Neural Information Processing Systems, D. Belgrave, C. Zhang, H. Lin, R. Pascanu, P. Koniusz, M. Ghassemi, and N. Chen, Eds., vol. 38. Curran Associates, Inc.,

  20. [20]

    Combating the COVID-19 infodemic using prompt-based curricu- lum learning,

    Z. Peng, M. Li, Y . Wang, and G. T. S. Ho, “Combating the COVID-19 infodemic using prompt-based curricu- lum learning,”Expert Systems with Applications, vol. 229, p. 120501, 2023. 2

  21. [21]

    Prompt- based contrastive learning to combat the COVID-19 infodemic,

    Z. Peng, M. Li, Y . Wang, and D. Y . Mo, “Prompt- based contrastive learning to combat the COVID-19 infodemic,”Machine Learning, vol. 114, no. 1, p. 6,

  22. [22]

    Protect- ing vulnerable voices: Synthetic dataset generation for self-disclosure detection,

    S. Jangra, S. De, N. Sastry, and S. Fadaei, “Protect- ing vulnerable voices: Synthetic dataset generation for self-disclosure detection,” inSocial Networks Analysis and Mining - 17th International Conference, ASONAM 2025, Niagara Falls, ON, Canada, August 25-28, 2025, Proceedings, Part II, ser. Lecture Notes in Computer Science. Springer, 2025, pp. 3–18. 2, 6

  23. [23]

    PII-VisBench: Evaluating personally identifi- able information safety in vision language models along a continuum of visibility,

    G. M. Shahariar, Z. A. Nazi, M. O. H. Bhuiyan, and Z. Shi, “PII-VisBench: Evaluating personally identifi- able information safety in vision language models along a continuum of visibility,”CoRR, vol. abs/2601.05739,

  24. [24]

    Assessing visual privacy risks in multi- modal AI: A novel taxonomy-grounded evaluation of vision-language models,

    E. Tsaprazlis, T. Feng, A. Ramakrishna, R. Gupta, and S. Narayanan, “Assessing visual privacy risks in multi- modal AI: A novel taxonomy-grounded evaluation of vision-language models,”CoRR, vol. abs/2509.23827,

  25. [25]

    Unsafe LLM-Based search: Quantitative analy- sis and mitigation of safety risks in AI web search,

    Z. Luo, Z. Peng, Y . Liu, Z. Sun, M. Li, J. Zheng, and X. He, “Unsafe LLM-Based search: Quantitative analy- sis and mitigation of safety risks in AI web search,” in 8 34th USENIX Security Symposium (USENIX Security 25). Seattle, WA: USENIX Association, Aug. 2025, pp. 8055–8074. 2

  26. [26]

    Recognition through reasoning: Reinforcing image geo-localization with large vision-language models,

    L. Li, Y . Zhou, Y . Liang, F. Tsung, and J. Wei, “Recognition through reasoning: Reinforcing image geo-localization with large vision-language models,” in Advances in Neural Information Processing Systems, vol. 38, 2025. 2

  27. [27]

    Automatic dataset construction (ADC): Sam- ple collection, data curation, and beyond,

    M. Liu, Z. Di, J. Wei, Z. Wang, H. Zhang, R. Xiao, H. Wang, J. Pang, H. Chen, A. Shah, H. Wei, X. He, Z. Zhao, H. Wang, L. Feng, J. Wang, J. Davis, and Y . Liu, “Automatic dataset construction (ADC): Sam- ple collection, data curation, and beyond,”CoRR, vol. abs/2408.11338, 2024. 3

  28. [28]

    Gemini 3.1 Pro model card,

    DeepMind, “Gemini 3.1 Pro model card,” Deep- Mind, Tech. Rep., February 2026. [Online]. Avail- able: https://storage.googleapis.com/deepmind-media/ Model-Cards/Gemini-3-1-Pro-Model-Card.pdf 3

  29. [29]

    TxSum: User- centered ethereum transaction understanding with micro- level semantic grounding,

    Z. Peng, J. Zheng, Y . Liu, H. Jia, Q. Ye, J. Liu, X. Yang, M. Li, Q. Gong, X. Wang, and X. He, “TxSum: User- centered ethereum transaction understanding with micro- level semantic grounding,” 2026. 4

  30. [30]

    GasAgent: A multi-agent framework for automated gas optimization in smart contracts,

    J. Zheng, Z. Peng, Y . Liu, J. Wang, Y . Liao, W. Dong, and X. He, “GasAgent: A multi-agent framework for automated gas optimization in smart contracts,” 2025. 4

  31. [31]

    PaddleOCR-VL-1.5: Towards a multi-task 0.9b VLM for robust in-the-wild document parsing,

    C. Cui, T. Sun, S. Liang, T. Gao, Z. Zhang, J. Liu, X. Wang, C. Zhou, H. Liu, M. Lin, Y . Zhang, Y . Zhang, Y . Liu, D. Yu, and Y . Ma, “PaddleOCR-VL-1.5: Towards a multi-task 0.9b VLM for robust in-the-wild document parsing,”CoRR, vol. abs/2601.21957, 2026. 6

  32. [32]

    Learning transferable visual models from natural language supervision,

    A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever, “Learning transferable visual models from natural language supervision,” in Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event, ser. Proceedings of Machine ...

  33. [33]

    Attribute inference attacks in online social networks,

    N. Z. Gong and B. Liu, “Attribute inference attacks in online social networks,”ACM Trans. Priv. Secur., vol. 21, no. 1, pp. 3:1–3:30, 2018. 10

  34. [34]

    US-centric vs. international personally identifiable information: A com- parison using the UT CID identity ecosystem,

    R. Rana, R. N. Zaeem, and K. S. Barber, “US-centric vs. international personally identifiable information: A com- parison using the UT CID identity ecosystem,” in2018 International Carnahan Conference on Security Tech- nology, ICCST 2018, Montreal, QC, Canada, October 22-25, 2018. IEEE, 2018, pp. 1–5. 10

  35. [35]

    A survey on privacy in social me- dia: Identification, mitigation, and applications,

    G. Beigi and H. Liu, “A survey on privacy in social me- dia: Identification, mitigation, and applications,”Trans. Data Sci., vol. 1, no. 1, pp. 7:1–7:38, 2020. 10

  36. [36]

    Privacy as contextual integrity,

    H. Nissenbaum, “Privacy as contextual integrity,”Wash- ington Law Review, vol. 79, no. 1, pp. 119–157, 2004. 10

  37. [37]

    k-anonymity: A model for protecting pri- vacy,

    L. Sweeney, “k-anonymity: A model for protecting pri- vacy,”International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 5, pp. 557– 570, 2002. 11

  38. [38]

    Automated annotation with generative AI requires validation,

    N. Pangakis, S. Wolken, and N. Fasching, “Automated annotation with generative AI requires validation,”arXiv preprint arXiv:2306.00176, 2023. 11

  39. [39]

    Judging LLM-as-a-judge with MT-bench and chatbot arena,

    L. Zheng, W.-L. Chiang, Y . Sheng, S. Zhuang, Z. Wu, Y . Zhuang, Z. Lin, Z. Li, D. Li, E. Xing, H. Zhang, J. E. Gonzalez, and I. Stoica, “Judging LLM-as-a-judge with MT-bench and chatbot arena,” inAdvances in Neural Information Processing Systems (NeurIPS), 2023. 11

  40. [40]

    Measuring and reducing LLM hallucination without gold-standard answers via expertise-weighting,

    J. Wei, Y . Yao, J.-F. Ton, H. Guo, A. Estornell, and Y . Liu, “Measuring and reducing LLM hallucination without gold-standard answers via expertise-weighting,” CoRR, vol. abs/2402.10412, 2024. 11

  41. [41]

    A style-based generator architecture for generative adversarial networks,

    T. Karras, S. Laine, and T. Aila, “A style-based generator architecture for generative adversarial networks,”IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 12, pp. 4217–4228, 2021. 13

  42. [42]

    Google land- marks dataset v2 - A large-scale benchmark for instance- level recognition and retrieval,

    T. Weyand, A. Araújo, B. Cao, and J. Sim, “Google land- marks dataset v2 - A large-scale benchmark for instance- level recognition and retrieval,” in2020 IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13-19, 2020. Com- puter Vision Foundation / IEEE, 2020, pp. 2572–2581. 13

  43. [43]

    A technique for the measurement of attitudes,

    R. Likert, “A technique for the measurement of attitudes,” Archives of Psychology, vol. 22, no. 140, pp. 1–55, 1932. 14

  44. [44]

    teacher” poses minimal risk, whereas “undercover police officer

    D. M. Green and J. A. Swets,Signal detection theory and psychophysics. John Wiley & Sons, 1966, vol. 1. 14 A SopriBench Construction Details This appendix expands the construction details abbreviated in Section 3. This section covers the private-corpus sampling procedure, account filtering, de-identified pattern abstraction, synthetic profile and leakage-...

  45. [45]

    A fictional hidden user profile

  46. [46]

    A leakage plan assigning synthetic clue fragments to posts, modalities, and carriers

  47. [47]

    Shenzhen

    Ten public post scripts with ordinary social intent and incidental privacy cues. User profile generation.The profile generation prompt in- structs the LLM to produce a coherent hidden profile with 28 attributes. The attributes cover basic identity, socioeconomic status, lifestyle and routines, and sensitive or legally protected information. Each attribute...

  48. [48]

    If slot_match=false, set granularity_score=0 and matched_level=none

  49. [49]

    If slot_match=true, identify the deepest correct hierarchy level matched by the predicted value and assign granu- larity_score

  50. [50]

    Do not infer contextual sensitivity or inference difficulty

    Return a structured JSON object with slot_match, matched_slot, granularity_score, matched_level, and a brief justification. Do not infer contextual sensitivity or inference difficulty. These are provided by the benchmark. Value-granularity score.The value-granularity score g j is computed by an LLM judge using an attribute-specific hier- archy. The judge ...