pith. sign in

arxiv: 2602.03545 · v2 · pith:TNCU4JMVnew · submitted 2026-02-03 · 💻 cs.AI

Persona Generators: Generating Diverse Synthetic Personas for Arbitrary Contexts

classification 💻 cs.AI
keywords populationsdiversegeneratorspersonasyntheticcontextspersonasacross
0
0 comments X
read the original abstract

Evaluating AI systems that interact with humans requires understanding their behavior across diverse user populations, but collecting representative human data is often expensive or infeasible, particularly for novel technologies or hypothetical future scenarios. Recent work in Generative Agent-Based Modeling has shown that large language models can simulate human-like synthetic personas with high fidelity, accurately reproducing the beliefs and behaviors of specific individuals. However, most approaches require detailed data about target populations and often prioritize density matching (replicating what is most probable) rather than support coverage (spanning what is possible), leaving long-tail behaviors underexplored. We introduce Persona Generators, functions that can produce diverse synthetic populations tailored to arbitrary contexts. We apply an iterative improvement loop based on AlphaEvolve, using large language models as mutation operators to refine our Persona Generator code over hundreds of iterations. The optimization process produces lightweight Persona Generators that can automatically expand small descriptions into populations of diverse synthetic personas that maximize coverage of opinions and preferences along relevant diversity axes. We demonstrate that evolved generators substantially outperform existing baselines across six diversity metrics on held-out contexts, producing populations that span rare trait combinations difficult to achieve in standard LLM outputs.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Beyond Cooperative Simulators: Generating Realistic User Personas for Robust Evaluation of LLM Agents

    cs.AI 2026-05 unverdicted novelty 7.0

    PPol uses LLM-driven evolutionary program search to create diverse human-like user personas for simulators, yielding 33-62% fitness gains and +17% agent task success on retail and airline domains.

  2. Post-training makes large language models less human-like

    cs.CL 2026-05 unverdicted novelty 6.0

    Post-training reduces LLMs' behavioral alignment with humans across families and sizes, with the misalignment increasing in newer generations while persona induction fails to improve individual-level predictions.