
arxiv: 2604.19410 · v2 · submitted 2026-04-21 · 💻 cs.HC


Understanding Password Preferences, Memorability, and Security through a Human-Centered Lens


Pith reviewed 2026-05-10 01:57 UTC · model grok-4.3

classification 💻 cs.HC
keywords password security · eye-tracking · AI-generated passwords · memorability · user preferences · visual attention · password entropy · human-centered design

The pith

Users who pay more visual attention to website context during password creation produce stronger, higher-entropy passwords even while preferring their own weaker passwords over AI-generated alternatives.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies how people create, select, and try to remember passwords when given options from AI models and a rule-based generator. It uses eye-tracking to measure behavior across different website contexts and compares objective strength with user preferences and recall. The work confirms the usual strength-memorability trade-off but adds that attention to contextual cues on the site tracks with better password entropy. A reader would care because passwords remain the main login method and this points to a behavioral lever that might improve security without forcing users to adopt unmemorable strings.

Core claim

Despite AI-generated passwords showing higher strength, participants still chose their self-generated passwords; eye-tracking data further showed a significant positive correlation between visual attention directed at contextual cues and the entropy of the passwords participants ultimately created.

What carries the argument

Eye-tracked visual attention to contextual cues on the presented website, which correlates with and appears to support higher password entropy during the creation task.
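
The entropy measure behind this claim is not defined in the material summarized here. As a rough illustration only, one common textbook estimate multiplies password length by the log2 of the character pool in use. The function name `charset_entropy_bits` and the pool sizes below are assumptions for this sketch, not the paper's metric.

```python
import math
import string


def charset_entropy_bits(password: str) -> float:
    """Naive estimate: length * log2(size of the character pool in use).

    Assumption: only ASCII character classes are counted; any other
    character adds to the length but not to the pool.
    """
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)  # 32 ASCII symbols
    if pool == 0:
        return 0.0
    return len(password) * math.log2(pool)


if __name__ == "__main__":
    # Made-up example passwords, not taken from the study.
    for candidate in ("sunshine", "Sunshine2024", "q7#Lp!xV2@tz"):
        print(candidate, round(charset_entropy_bits(candidate), 1))
```

This estimate rewards length and character variety but ignores dictionary words and keyboard patterns, so guessability-based strength meters can rank the same passwords differently.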

If this is right

  • Interfaces that encourage attention to context could raise password entropy without changing the generator itself.
  • AI password tools need to address memorability to overcome user preference for self-created passwords.
  • The strength-usability trade-off persists even when advanced generation models are available.
  • Security quality depends on both the generation method and how users engage with the surrounding site details.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Password creation screens could add subtle visual highlights around context elements to steer attention and entropy upward.
  • The same attention-entropy link might appear in other security tasks such as choosing recovery questions or setting up two-factor methods.
  • Real-world field tests outside the lab would show whether the correlation survives when users create passwords for their actual accounts.
  • Training or interface prompts that increase focus on context might offer a low-cost way to improve password strength across populations.

Load-bearing premise

The correlation between measured visual attention to contextual cues and password entropy in this lab setup with specific generators and sites reflects a stable relationship that could be used to guide real-world security design.

What would settle it

A follow-up study that removes contextual website information or eye-tracking measures and finds no remaining link between any attention proxy and password entropy.

Figures

Figures reproduced from arXiv: 2604.19410 by Duru Paker, Enkelejda Kasneci, Suleyman Ozdel.

Figure 1: The Visual Anchoring Effect and the Security-Usability Trade-off. (A) Our novel eye-tracking finding reveals that …
Figure 2: Examples of AOI logging, red boxes were not visible …
Figure 3: Boxplot overview. Top row: objective entropy (a) and subjective strength ratings (b) across password models. Bottom …
Original abstract

Passwords remain the primary authentication method, yet user-created passwords are often the weakest due to the security-usability trade-off. Although AI-based password generators are emerging, little is known about their effectiveness and user perceptions. This eye-tracking study examined how behavior during password creation, selection, and memorization relates to objective and subjective password quality. Four password models, three AI-based (DeepSeek-API, ChatGPT-API, PassGPT) and one rule-based random generator, generated suggestions from participants' self-generated passwords across four website contexts. Eye movements were recorded throughout the experiment. Results confirm the expected trade-off between AI-generated password strength and human memorability but also reveal a novel behavioral link. Despite stronger AI-generated passwords, participants favored self-generated ones. Notably, visual attention to contextual cues was significantly correlated with higher password entropy. This suggests that security is shaped not only by the generation tool but also by users' visual engagement with contextual cues, highlighting the potential of attention-driven security design.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper reports results from an eye-tracking study in which participants created, selected, and memorized passwords using suggestions generated by three AI-based models (DeepSeek-API, ChatGPT-API, PassGPT) and one rule-based random generator across four website contexts. It confirms the expected security-usability trade-off, notes that participants preferred their own self-generated passwords despite the higher entropy of AI suggestions, and reports a novel finding that visual attention to contextual cues (measured via eye movements) was significantly correlated with higher password entropy.

Significance. If the reported correlation is supported by transparent statistical reporting and appropriate controls, the work could contribute to HCI research on attention-aware security interfaces. The comparison across multiple AI generators and the use of objective eye-tracking metrics add empirical value, though the lab setting limits direct claims about real-world applicability.

major comments (2)
  1. [Results] Results section: the abstract states that 'visual attention to contextual cues was significantly correlated with higher password entropy' but provides no sample size, exact statistical test (e.g., Pearson r, Spearman rho), coefficient value, p-value, effect size, or mention of corrections for multiple comparisons or individual-difference covariates. These details are required to evaluate whether the central correlational claim is robust.
  2. [Methods] Methods section: the description of how 'visual attention to contextual cues' was operationalized (e.g., AOI definitions, fixation-duration thresholds, or proportion of dwell time) is not specified in the provided summary. Without this, it is impossible to assess whether the correlation reflects a genuine design lever or an artifact of measurement choices.
minor comments (2)
  1. [Methods] The four website contexts should be named explicitly (e.g., banking, social media) with justification for their selection, as context may moderate both attention patterns and password strategy.
  2. [Methods] Clarify whether the eye-tracking data were pre-processed for blinks, calibration quality, or participant exclusion criteria, and report the final N after any exclusions.
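
Major comment 1 asks which test, coefficient, and p-value sit behind the attention-entropy correlation. Below is a minimal sketch of one such report, assuming a Spearman rank correlation with a permutation p-value. The paper's actual test is not stated in this summary, and all function names and data here are hypothetical.

```python
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation computed on average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=10_000, seed=0):
    """Two-sided p-value: share of shuffles with |rho| >= the observed |rho|."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point noise in ties.
        if abs(spearman_rho(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up per-participant values, for illustration only.
    dwell_share = [0.05, 0.12, 0.08, 0.20, 0.15, 0.30, 0.02, 0.18]
    entropy_bits = [28.0, 41.5, 33.0, 52.4, 39.9, 60.1, 25.3, 47.7]
    print("n =", len(dwell_share))
    print("rho =", spearman_rho(dwell_share, entropy_bits))
    print("p =", permutation_p_value(dwell_share, entropy_bits))
```

If several attention metrics are tested against entropy, the per-test threshold would also need a correction such as Bonferroni (alpha divided by the number of tests), which is the kind of detail the comment requests.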

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight opportunities to improve the transparency of our statistical reporting and methodological details. We have revised the manuscript accordingly to strengthen the presentation of our eye-tracking findings on password creation.

Point-by-point responses
  1. Referee: [Results] Results section: the abstract states that 'visual attention to contextual cues was significantly correlated with higher password entropy' but provides no sample size, exact statistical test (e.g., Pearson r, Spearman rho), coefficient value, p-value, effect size, or mention of corrections for multiple comparisons or individual-difference covariates. These details are required to evaluate whether the central correlational claim is robust.

    Authors: We appreciate this feedback on reporting clarity. The results section contains the full statistical analysis of the correlation, but we acknowledge that the abstract does not summarize these elements. In the revised manuscript, we will update the abstract to report the sample size, the specific correlation test and its coefficient, p-value, effect size, and any controls or corrections applied. This will allow readers to better assess the robustness of the finding without altering the underlying data or conclusions. revision: yes

  2. Referee: [Methods] Methods section: the description of how 'visual attention to contextual cues' was operationalized (e.g., AOI definitions, fixation-duration thresholds, or proportion of dwell time) is not specified in the provided summary. Without this, it is impossible to assess whether the correlation reflects a genuine design lever or an artifact of measurement choices.

    Authors: Thank you for noting this gap in detail. While the methods section describes the overall eye-tracking procedure, we agree that the operationalization of visual attention to contextual cues requires more explicit description. In the revised version, we will expand the methods to define the Areas of Interest (AOIs) for contextual elements, specify the fixation duration threshold, and clarify that attention was quantified as the proportion of dwell time on those AOIs. This will enable readers to evaluate the measure's validity. revision: yes
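
The rebuttal describes attention as the proportion of dwell time on contextual AOIs, filtered by a fixation-duration threshold. The following is a sketch of that computation under stated assumptions: rectangular AOIs in screen pixels, a hypothetical 100 ms threshold, and illustrative names (`Fixation`, `AOI`, `dwell_proportion`) that do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixation:
    x: float            # screen coordinate in pixels
    y: float
    duration_ms: float


@dataclass(frozen=True)
class AOI:
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fx: Fixation) -> bool:
        """True if the fixation point lies inside this rectangle (edges included)."""
        return self.left <= fx.x <= self.right and self.top <= fx.y <= self.bottom


def dwell_proportion(fixations, aois, min_duration_ms=100.0):
    """Share of total fixation time spent inside any of the given AOIs.

    Fixations shorter than min_duration_ms are dropped first; the 100 ms
    default is an assumption, not the study's reported threshold.
    """
    kept = [f for f in fixations if f.duration_ms >= min_duration_ms]
    total = sum(f.duration_ms for f in kept)
    if total == 0:
        return 0.0
    on_aoi = sum(f.duration_ms for f in kept if any(a.contains(f) for a in aois))
    return on_aoi / total


if __name__ == "__main__":
    # Made-up trial: one contextual AOI (a site logo) and three fixations.
    logo = AOI("site_logo", left=0, top=0, right=100, bottom=100)
    trial = [
        Fixation(10, 10, 200),    # on the logo
        Fixation(500, 500, 600),  # elsewhere on the page
        Fixation(15, 15, 50),     # below the threshold, dropped
    ]
    print(dwell_proportion(trial, [logo]))
```

Because the result depends on where AOI borders are drawn and which threshold is chosen, reporting both makes it possible to check whether the correlation holds under alternative measurement choices.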

Circularity Check

0 steps flagged

No significant circularity: empirical correlation study

Full rationale

This paper reports results from a controlled eye-tracking experiment measuring user behavior during password creation with four generators across website contexts. The central claim is a reported correlation between visual attention to contextual cues and higher password entropy, presented as an observational finding rather than a derivation. No equations, fitted parameters, self-citations, or ansatzes are invoked to generate or justify the result; the findings rest directly on collected data (eye movements, entropy metrics, preference ratings) without any reduction to prior inputs by construction. The study is self-contained against external benchmarks as a standard behavioral HCI experiment.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Empirical user study with no free parameters, no invented entities, and only standard statistical assumptions for interpreting eye-tracking and entropy correlations.

axioms (1)
  • [standard math] Standard assumptions of correlation analysis and significance testing hold for the eye-tracking and entropy measures.
    Invoked implicitly when reporting a significant correlation between attention and entropy.

pith-pipeline@v0.9.0 · 5476 in / 1068 out tokens · 43385 ms · 2026-05-10T01:57:07.470708+00:00 · methodology

