
arxiv: 2606.07837 · v1 · pith:7WE7OYJWnew · submitted 2026-06-05 · 💻 cs.HC · cs.AI

Does Persona Make LLMs K-pop Fans? A Pilot Study of LLM-Based Online Concert Audience Agents

Pith reviewed 2026-06-27 20:41 UTC · model grok-4.3

classification 💻 cs.HC cs.AI
keywords LLM agents · persona conditioning · K-pop · audience simulation · chat quality · social connectedness · multi-agent systems · online concerts

The pith

Persona conditioning makes LLM-generated K-pop fan chat appear more natural without increasing social connectedness or engagement.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether giving distinct fan identities to multiple LLM agents can recreate the shared audience feeling of a live concert when viewers watch recorded K-pop videos alone. It builds a system of ten agents that produce live chat messages and compares a version where each agent has a unique persona, bias, and style against a plain baseline without personas. In tests with eleven K-pop fans, the persona version produced higher-quality and more natural chat, yet viewers reported no gains in feeling connected, engaged, or emotionally affected. The work suggests that online fan chat often works more like a shared monologue than back-and-forth talk, and that real collective experience needs alignment with actual fandom identity.

Core claim

Persona conditioning substantially improved model-level chat quality and perceived naturalness, but did not translate into differences in social connectedness, engagement, or affective response. Interviews suggest that online K-pop concert chat may operate as collective monologue rather than interpersonal dialogue, and that meaningful participation depends on shared identification with the specific artist and fandom.

What carries the argument

A multi-agent system of ten LLM agents generating real-time fan chat messages next to a K-pop performance video, comparing agents given distinct fan identities and chat styles against a no-persona baseline.
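As a rough illustration of the two conditions, the sketch below assembles one system prompt per agent. The only detail taken from the source is that a shared global prompt is used in both conditions and that persona text is appended to it in the persona condition. The `GLOBAL_PROMPT` wording, the `Persona` fields, and the `build_agent_prompts` helper are hypothetical placeholders, not the paper's prompts or code.

```python
from dataclasses import dataclass

# Placeholder wording; the paper's actual global prompt is not reproduced here.
GLOBAL_PROMPT = (
    "You are a fan watching a K-pop performance video. "
    "React in short live-chat messages."
)


@dataclass(frozen=True)
class Persona:
    """Hypothetical fields mirroring 'fan identity, bias, and chat style'."""
    identity: str
    bias: str        # favourite member, in K-pop fan usage
    chat_style: str

    def as_text(self) -> str:
        return (
            f"Identity: {self.identity}\n"
            f"Bias: {self.bias}\n"
            f"Chat style: {self.chat_style}"
        )


def build_agent_prompts(personas, use_persona, n_agents=10):
    """Return one system prompt per agent for the chosen condition."""
    if use_persona and len(personas) != n_agents:
        raise ValueError("persona condition needs one persona per agent")
    prompts = []
    for i in range(n_agents):
        if use_persona:
            # Persona condition: shared prompt plus this agent's persona text.
            prompts.append(GLOBAL_PROMPT + "\n\n" + personas[i].as_text())
        else:
            # Baseline: every agent receives the same shared prompt.
            prompts.append(GLOBAL_PROMPT)
    return prompts


# Illustrative personas only (invented for this sketch).
personas = [
    Persona(f"fan type {i}", f"member {i % 4}", "short, emoji-heavy")
    for i in range(10)
]
persona_prompts = build_agent_prompts(personas, use_persona=True)
baseline_prompts = build_agent_prompts(personas, use_persona=False)
```

In this sketch the baseline agents differ only through sampling randomness, since their prompts are identical, which is one plausible reason a no-persona audience would read as more repetitive.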

Load-bearing premise

The study assumes that ratings from eleven participants in a within-subjects design can reliably detect or rule out differences in social connectedness, engagement, and affective response.

What would settle it

Re-running the experiment with a larger sample of K-pop fans and finding statistically significant gains in social connectedness or engagement scores for the persona condition would challenge the central claim.
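To give a sense of what "a larger sample" could mean, the sketch below estimates the number of participants a two-sided paired comparison would need to detect a given standardized effect. It uses the textbook normal approximation, which slightly underestimates the exact t-based answer. The function name `paired_sample_size` and the effect sizes tried are assumptions for illustration; the paper does not report a target effect size.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(dz, alpha=0.05, power=0.80):
    """Approximate N for a two-sided paired t-test (normal approximation).

    dz is the standardized mean of the paired differences
    (mean difference divided by the SD of the differences).
    """
    if dz == 0:
        raise ValueError("dz must be non-zero")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_power = z.inv_cdf(power)           # quantile for the desired power
    return ceil(((z_alpha + z_power) / dz) ** 2)


n_medium = paired_sample_size(0.5)   # a conventional "medium" effect
n_large = paired_sample_size(0.8)    # a conventional "large" effect
```

Under this approximation a medium effect (dz = 0.5) calls for about 32 participants, roughly three times the pilot's N=11, so the pilot could plausibly miss effects of that size.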

Figures

Figures reproduced from arXiv: 2606.07837 by Hyojin Kim, Kirak Kim, Kyung Myun Lee, Sungyoung Kim, Yejin Son.

Figure 1: Model-level effects of persona conditioning. Each thin line represents one participant’s paired sessions across conditions, while black diamonds indicate group means. Persona conditioning produces strong model-level differentiation. Persona conditioning increases overall output diversity (Distinct-2: dz = +5.63; Self-BLEU-2: dz = −3.48, where lower is more diverse), reduces verbatim repetition (exact-repet…
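For readers unfamiliar with the diversity metric named in the caption, here is a minimal sketch of Distinct-2 in its usual form: the number of unique bigrams divided by the total number of bigrams in a set of messages. Lowercasing and whitespace tokenization are assumptions of this sketch; the paper's own tokenization and aggregation are not stated in the text above, and the example messages are invented.

```python
def distinct_n(messages, n=2):
    """Ratio of unique n-grams to total n-grams across all messages."""
    ngrams = []
    for msg in messages:
        tokens = msg.lower().split()  # assumption: simple whitespace tokens
        ngrams.extend(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


# Invented chat snippets, for illustration only.
repetitive = ["so good omg", "so good omg", "so good omg"]
varied = ["so good omg", "that high note though", "the choreo is insane"]

repetitive_score = distinct_n(repetitive)
varied_score = distinct_n(varied)
```

Higher values indicate more varied output: the repeated messages share the same two bigrams across six occurrences, while every bigram in the varied set is unique.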
Figure 2: System interface used in the study. Participants watched the performance video while interacting with the LLM-based audience chat interface on the right side of the screen.
B. Prompt Design Details. B.1. Global System Prompt. The following prompt is shared by all agents in both conditions. In the persona condition, each agent’s persona text (Appendix B.2) is appended below it. You are a LOONA fan watching a …
Original abstract

A concert is a collective experience, but recorded performance videos are typically watched alone, stripping away the shared audience presence that makes concerts feel eventful. We investigate whether persona-based LLM audience agents can recreate aspects of this collective experience by generating real-time fan chat alongside a K-pop performance video. We present a multi-agent system in which ten LLM agents react through live-chat messages, comparing a persona-conditioned audience (each agent assigned a distinct fan identity, bias, and chat style) with a no-persona baseline. In a within-subjects pilot with K-pop fans (N=11), persona conditioning substantially improved model-level chat quality and perceived naturalness, but did not translate into differences in social connectedness, engagement, or affective response. Interviews suggest that online K-pop concert chat may operate as collective monologue rather than interpersonal dialogue, and that meaningful participation depends on shared identification with the specific artist and fandom. Persona conditioning can make LLM audiences appear more natural, but culturally meaningful collective experience may require deeper alignment between persona, crowd behavior, fandom identity, and user expectations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a pilot study of a multi-agent LLM system that generates real-time fan chat alongside K-pop performance videos. It compares a persona-conditioned condition (ten agents each assigned distinct fan identities, biases, and chat styles) against a no-persona baseline. In a within-subjects experiment with N=11 K-pop fans, persona conditioning improved model-level chat quality and perceived naturalness, but produced no detectable differences in social connectedness, engagement, or affective response. Qualitative interviews suggest online concert chat functions more as collective monologue than interpersonal dialogue and requires shared fandom identification for meaningful participation.

Significance. If the results hold, the work supplies initial empirical evidence that persona conditioning can enhance surface-level naturalness of LLM agents but may be insufficient for recreating culturally meaningful collective experiences in virtual fan settings. The pilot design and interview insights offer concrete directions for HCI research on LLM-based social simulations, particularly the distinction between monologue-style chat and dialogue-based engagement.

major comments (2)
  1. [Results (statistical analysis of null effects)] The central claim that persona conditioning 'did not translate into differences' in social connectedness, engagement, or affective response rests on N=11 within-subjects data. No power analysis, effect sizes, confidence intervals, or equivalence tests are reported for these null results, leaving open the possibility that practically meaningful effects went undetected.
  2. [Methods (dependent measures)] The measures of social connectedness, engagement, and affective response are used to support the key negative finding, yet the manuscript provides no information on their validation, reliability, or sensitivity in the context of a collective-experience hypothesis.
minor comments (2)
  1. [Abstract] The abstract states the sample size and design but could more explicitly note that the study is a pilot when summarizing the null results.
  2. [Methods] Clarify how 'model-level chat quality' was operationalized and scored, including any inter-rater or automated metrics used.
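The reporting requested in major comment 1 can be sketched for a within-subjects design as follows: compute the paired differences, their standardized mean (Cohen's dz), and a 95% confidence interval on the mean difference. The ratings below are invented for illustration and are not data from the paper. The helper name `paired_summary` is an assumption, and the hard-coded critical value applies only to N=11 (10 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t with 10 degrees of freedom.
# Valid only for N=11 paired observations.
T_CRIT_DF10 = 2.228


def paired_summary(persona, baseline, t_crit=T_CRIT_DF10):
    """Cohen's dz and a 95% CI on the mean paired difference."""
    if len(persona) != len(baseline):
        raise ValueError("paired data must have equal length")
    diffs = [p - b for p, b in zip(persona, baseline)]
    n = len(diffs)
    m, sd = mean(diffs), stdev(diffs)
    half_width = t_crit * sd / sqrt(n)
    return {
        "n": n,
        "mean_diff": m,
        "dz": m / sd,
        "ci95": (m - half_width, m + half_width),
    }


# Hypothetical 7-point ratings from 11 participants (not from the paper).
persona = [5, 4, 6, 5, 3, 5, 4, 6, 5, 4, 5]
baseline = [4, 5, 5, 5, 4, 4, 5, 5, 5, 4, 4]
result = paired_summary(persona, baseline)
```

An interval that spans zero but is wide relative to the scale is better read as inconclusive than as evidence of no effect. Supporting a claim of equivalence would additionally require bounds fixed in advance.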

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the insightful comments on our pilot study. We agree that the small N and lack of statistical details for null effects, as well as missing information on measure validation, are limitations that warrant clarification and expansion in the manuscript. Below we respond point-by-point and commit to revisions that strengthen reporting while preserving the exploratory intent of the pilot.

Point-by-point responses
  1. Referee: [Results (statistical analysis of null effects)] The central claim that persona conditioning 'did not translate into differences' in social connectedness, engagement, or affective response rests on N=11 within-subjects data. No power analysis, effect sizes, confidence intervals, or equivalence tests are reported for these null results, leaving open the possibility that practically meaningful effects went undetected.

    Authors: We acknowledge the validity of this concern. The study was designed and presented as a pilot (N=11) to assess feasibility of the multi-agent system and to collect qualitative insights on fan chat dynamics, not to deliver powered hypothesis tests. That said, we agree that null results require more transparent reporting. In the revised manuscript we will add effect sizes (e.g., Cohen’s d) and 95% confidence intervals for all between-condition comparisons on the three dependent variables. We will also expand the limitations section to explicitly discuss the low statistical power and the risk of undetected effects, framing the quantitative results as preliminary. Post-hoc power analysis and equivalence testing can be included if the editor deems them essential, though we note that pre-specifying equivalence bounds would be more appropriate for a confirmatory follow-up study. revision: yes

  2. Referee: [Methods (dependent measures)] The measures of social connectedness, engagement, and affective response are used to support the key negative finding, yet the manuscript provides no information on their validation, reliability, or sensitivity in the context of a collective-experience hypothesis.

    Authors: The referee correctly identifies a gap in the current draft. The scales were drawn from established instruments in the social-presence and audience-experience literature; however, the manuscript does not cite their origins, report reliability in our sample, or discuss their sensitivity to collective fandom experiences. In revision we will (1) add citations and brief validation history for each measure, (2) report internal-consistency statistics (Cronbach’s alpha) calculated from the N=11 data, and (3) include a short paragraph addressing the measures’ applicability and potential limitations when testing a collective-experience hypothesis in a virtual-concert setting. These additions will better contextualize the null quantitative findings. revision: yes
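The internal-consistency statistic promised in the second response can be computed as in the sketch below, which implements the standard formula for Cronbach's alpha: k/(k−1) × (1 − sum of item variances / variance of total scores). The function name and the example scores are illustrative assumptions. Alpha is only defined for scales with at least two items, so it would not apply to any single-item measure.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: list of per-item score lists, each holding one score per
    participant, with participants in the same order in every list.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per participant across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Invented scores: three items answered identically by four participants.
consistent_items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
alpha_consistent = cronbach_alpha(consistent_items)

# Invented scores with some disagreement between items.
noisy_items = [[1, 2, 3, 4], [2, 1, 4, 3], [1, 3, 2, 4]]
alpha_noisy = cronbach_alpha(noisy_items)
```

With only eleven respondents, any alpha estimate will itself be imprecise, so it is best read alongside the instruments' published reliability rather than as a standalone check.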

Circularity Check

0 steps flagged

No circularity: empirical user study with direct observations

Full rationale

This is a straightforward empirical pilot study: the authors implement a multi-agent LLM chat system, run a within-subjects comparison of two conditions (persona vs. baseline), collect ratings and interviews from N=11 participants, and report observed differences (or lack thereof) on chat quality, naturalness, connectedness, engagement, and affect. No equations, parameter fittings presented as predictions, self-definitional constructs, or load-bearing self-citations appear in the provided text. All claims rest on direct measurement rather than reduction to prior inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work is an empirical pilot relying on domain assumptions about LLM simulation of human behavior and standard HCI user-study practices rather than mathematical derivations or new entities.

axioms (1)
  • domain assumption: LLM agents can approximate real fan chat behavior when conditioned with personas
    This premise enables the core comparison between persona and no-persona conditions in the multi-agent system.


