arXiv: 2605.30685 · v1 · pith:YFXZ7ARLnew · submitted 2026-05-29 · cs.CY · cs.AI · cs.CL · cs.HC

How Early Adopters Used Generative AI Worldwide: Variation by Country Income and Language

Pith reviewed 2026-06-28 20:54 UTC · model grok-4.3

classification: cs.CY · cs.AI · cs.CL · cs.HC
keywords: generative AI, chatbot usage, country income, language differences, early adopters, digital divide, schooling use, leisure use

The pith

Schooling dominates early generative AI chatbot use in low-income countries while leisure rises with national income, and English prompts are overrepresented where local languages had weaker model support.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper analyzes a large set of anonymized interactions with a widely available free AI chatbot to map how early users apply the technology across countries. Schooling tasks form the leading category overall and show a clear negative link to country GDP, whereas leisure uses increase alongside higher national income. English-language exchanges appear more frequently than expected in nations whose primary languages received less effective support from the models available during the study period. The authors note that differences in language performance could shape whether the technology narrows or widens existing divides.

Core claim

Analysis of anonymized chatbot interactions shows schooling as the most common domain in most countries, with a strong inverse association to country-level GDP, while leisure-related use correlates positively with income. English-language interactions are overrepresented in places where predominant languages were not well-served by existing models. The work indicates that improving performance across languages may determine whether the technology expands digital divides or supports leapfrogging.

What carries the argument

Domain classification of chatbot interactions (schooling, leisure, etc.) correlated against country GDP and language prevalence in the dataset.
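
As a rough illustration of that step, the sketch below computes a rank correlation between country income and the share of conversations in each domain. The country rows are made-up numbers, and Spearman correlation on log GDP is an assumption chosen for the example; the summary above does not say which statistic the paper uses.

```python
import math

def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical rows: (country label, GDP per capita in USD, schooling share, leisure share).
# These values are invented for illustration and are not taken from the paper.
rows = [
    ("A", 900, 0.41, 0.05),
    ("B", 2500, 0.36, 0.07),
    ("C", 7000, 0.30, 0.09),
    ("D", 21000, 0.24, 0.12),
    ("E", 48000, 0.19, 0.15),
    ("F", 82000, 0.17, 0.14),
]

log_gdp = [math.log(gdp) for _, gdp, _, _ in rows]
rho_schooling = spearman(log_gdp, [s for _, _, s, _ in rows])
rho_leisure = spearman(log_gdp, [l for _, _, _, l in rows])

print(f"schooling vs log GDP: {rho_schooling:+.2f}")
print(f"leisure vs log GDP:   {rho_leisure:+.2f}")
```

With these invented rows the schooling correlation is negative and the leisure correlation is positive, which is the qualitative pattern the paper reports; the magnitudes here carry no evidential weight.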

If this is right

  • Usage domains shift systematically with economic development, favoring education in lower-income settings.
  • Language model quality influences interaction patterns, producing higher English share where native-language support lags.
  • Multilingual performance improvements could alter whether adoption reinforces or reduces global inequalities.
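
One simple way to make "higher English share where native-language support lags" concrete is an overrepresentation ratio: the share of a country's conversations that are in English, divided by the estimated share of its population that speaks English. The sketch below assumes that definition and uses invented counts; the function name and the data are hypothetical, and the paper may measure this differently.

```python
def english_overrepresentation(english_conversations, total_conversations, english_prevalence):
    """Ratio of observed English conversation share to population English prevalence.

    A value above 1 means English appears more often in conversations than
    the population's English prevalence alone would suggest.
    """
    if total_conversations <= 0:
        raise ValueError("total_conversations must be positive")
    if not 0 < english_prevalence <= 1:
        raise ValueError("english_prevalence must be in (0, 1]")
    observed_share = english_conversations / total_conversations
    return observed_share / english_prevalence

# Hypothetical countries with invented counts (not from the paper).
countries = {
    "X": {"english": 600, "total": 1000, "prevalence": 0.20},
    "Y": {"english": 150, "total": 1000, "prevalence": 0.30},
}

ratios = {
    name: english_overrepresentation(c["english"], c["total"], c["prevalence"])
    for name, c in countries.items()
}

for name, ratio in ratios.items():
    label = "overrepresented" if ratio > 1 else "not overrepresented"
    print(f"{name}: ratio={ratio:.2f} ({label})")
```

A ratio like this only describes the sampled users. It cannot by itself separate users working around weak model support from early adopters who are simply more likely to speak English than the general population.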

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • High schooling use in low-income countries suggests AI could serve as an educational supplement if interfaces and content are localized.
  • English overrepresentation may reflect users working around current model limits rather than a preference for the language itself.
  • Developers targeting lower-income markets might prioritize non-English capabilities to match observed demand patterns.

Load-bearing premise

The anonymized dataset of chatbot interactions accurately captures representative usage patterns across countries without substantial selection bias from chatbot availability, user demographics, or model performance differences.

What would settle it

A large-scale representative survey of AI users across income levels and languages that finds no inverse relationship between schooling use and GDP or no overrepresentation of English in linguistically underserved countries.

Figures

Figures reproduced from arXiv: 2605.30685 by Isaac Slaughter, Madeleine I. G. Daepp.

Figure 1: Frequency of Conversation Purposes Relative to Country GDP. Points show the fraction of early adopters in a given ...
Figure 3: Concentration of domain use. Lines show the com...
Figure 4: English language use in AI Conversations relative to estimated English prevalence in the general population, by ...
Figure 5: Language Usage Frequency (relative to English)
Figure 6: Concentration of domain use by country-level income classification. Panels show the CCDF of the share of conver...
Original abstract

AI is being used by people globally, but not everyone is using it in the same ways. Using a large-scale dataset of anonymized, de-identified, and privacy-scrubbed interactions with a widely available and free AI chatbot, we empirically characterize differences in early adopters' usage across countries. Schooling is the most common domain of use in most countries, particularly low-income countries, with a strong inverse association evident between schooling and country-level GDP. Leisure-related use, by contrast, is positively associated with country-level income. Language, we find, also shapes use: English-language interactions are overrepresented in places where the predominant languages were not well-served by existing models during the period of the study. Improving performance across languages may be a key factor, our work suggests, in whether this technology expands digital divides or enables leapfrogging.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper uses a large-scale dataset of anonymized interactions with a single free AI chatbot to characterize early global usage patterns, claiming that schooling is the most common domain (especially in low-income countries, with a strong inverse association to country GDP), leisure use is positively associated with income, and English interactions are overrepresented where local languages had poor model support.

Significance. If the country-level associations hold after accounting for data-source limitations, the work would provide concrete empirical grounding for how generative AI adoption varies by economic development and language support, with direct relevance to debates on digital divides versus leapfrogging.

major comments (2)
  1. [Abstract] The abstract states clear associations but supplies no information on sample size, statistical controls, domain-classification method, or robustness checks, so it is impossible to judge whether the data actually support the stated claims.
  2. [Data section] The analysis relies on interactions from one free chatbot without reported checks for selection bias or representativeness across country income levels (e.g., differential internet access, English proficiency, or age/education skews), which are known to correlate with GDP and could confound the schooling-GDP and leisure-GDP associations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight opportunities to strengthen the presentation of our methods and limitations. We address each point below and will incorporate revisions to improve clarity and transparency.

Point-by-point responses
  1. Referee: [Abstract] The abstract states clear associations but supplies no information on sample size, statistical controls, domain-classification method, or robustness checks, so it is impossible to judge whether the data actually support the stated claims.

    Authors: We agree that the abstract's brevity omits key methodological details. In the revision, we will expand it to report the total sample size of interactions, briefly describe the domain classification approach (a hybrid of keyword-based rules and supervised classification validated on a held-out set), note the use of population-weighted regressions with controls for internet penetration, and mention that results are robust to alternative country-level specifications and language filtering. revision: yes

  2. Referee: [Data section] The analysis relies on interactions from one free chatbot without reported checks for selection bias or representativeness across country income levels (e.g., differential internet access, English proficiency, or age/education skews), which are known to correlate with GDP and could confound the schooling-GDP and leisure-GDP associations.

    Authors: This is a valid concern. The revised Data section will include an explicit discussion of selection into the platform, drawing on external benchmarks such as World Bank internet access and English proficiency rates by income group. We will add analyses showing that the observed schooling-income gradient persists in the subset of high-internet-access countries and will qualify all claims as describing usage patterns among early adopters reachable via this free service rather than the full population. revision: yes
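
The simulated rebuttal mentions population-weighted regressions with a control for internet penetration. A minimal sketch of that kind of model follows, assuming a linear specification of schooling share on log GDP per capita and internet penetration, fitted by weighted least squares. The data are synthetic and generated without noise from known coefficients so the fit can be checked; none of the numbers come from the paper, and the actual specification is not given in this summary.

```python
import math

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def weighted_least_squares(features, targets, weights):
    """Minimise sum_i w_i * (y_i - x_i . beta)^2 via the normal equations."""
    k = len(features[0])
    xtwx = [
        [sum(w * x[i] * x[j] for x, w in zip(features, weights)) for j in range(k)]
        for i in range(k)
    ]
    xtwy = [
        sum(w * x[i] * y for x, y, w in zip(features, targets, weights))
        for i in range(k)
    ]
    return solve_linear(xtwx, xtwy)

# Synthetic countries: (GDP per capita in USD, internet penetration, population in millions).
countries = [
    (900, 0.25, 120),
    (2500, 0.45, 60),
    (7000, 0.50, 210),
    (21000, 0.85, 40),
    (48000, 0.80, 330),
    (82000, 0.95, 9),
]

# Assumed "true" coefficients used only to generate the toy outcome.
TRUE_BETA = (0.6, -0.04, 0.05)  # intercept, log GDP, internet penetration

features = [[1.0, math.log(gdp), internet] for gdp, internet, _ in countries]
targets = [sum(b * x for b, x in zip(TRUE_BETA, row)) for row in features]
weights = [population for _, _, population in countries]

beta = weighted_least_squares(features, targets, weights)
print("intercept, log-GDP slope, internet slope:", [round(b, 4) for b in beta])
```

Because the toy outcome is noise-free, the fit recovers the generating coefficients. On real data the quantity of interest would be whether the log-GDP slope stays negative once the control is included.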

Circularity Check

0 steps flagged

Purely observational empirical study; no derivations or self-referential predictions

Full rationale

The paper analyzes a dataset of anonymized chatbot interactions to report descriptive patterns: schooling as most common domain (inversely associated with GDP), leisure positively associated with income, and English overrepresentation where models under-served other languages. These are direct empirical observations and correlations from the data, with no equations, fitted parameters renamed as predictions, self-citations as load-bearing premises, or any reduction of claims to inputs by construction. The analysis is self-contained against external benchmarks via the provided interaction logs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on two domain assumptions about data quality rather than free parameters or new entities.

axioms (2)
  • domain assumption: Chatbot interactions can be reliably and consistently categorized into domains such as schooling and leisure across countries and languages.
    This categorization is required to produce the reported domain shares and their correlations with GDP.
  • domain assumption: The observed interactions constitute a representative sample of early-adopter usage in each country.
    Without this, the inverse schooling-GDP association and language effects cannot be generalized beyond the sampled users.
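
The first assumption could in principle be probed by comparing classifier labels with human labels on a held-out sample, separately for each language. The sketch below assumes Cohen's kappa as the agreement measure and uses invented labels; the record layout and the language codes are hypothetical, and this summary does not describe the paper's own validation procedure at this level of detail.

```python
from collections import Counter, defaultdict

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences of equal length."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical held-out records: (language, human label, classifier label).
records = [
    ("en", "schooling", "schooling"),
    ("en", "schooling", "schooling"),
    ("en", "leisure", "leisure"),
    ("en", "leisure", "leisure"),
    ("sw", "schooling", "schooling"),
    ("sw", "schooling", "leisure"),
    ("sw", "leisure", "leisure"),
    ("sw", "leisure", "schooling"),
]

by_language = defaultdict(lambda: ([], []))
for language, human, model in records:
    by_language[language][0].append(human)
    by_language[language][1].append(model)

kappa = {
    language: cohen_kappa(human, model)
    for language, (human, model) in by_language.items()
}

for language, value in sorted(kappa.items()):
    print(f"{language}: kappa={value:.2f}")
```

In this toy data agreement is perfect for "en" and no better than chance for "sw". A gap of that kind would mean domain shares are not comparable across languages, which is exactly what the assumption rules out.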

pith-pipeline@v0.9.1-grok · 5683 in / 1336 out tokens · 26229 ms · 2026-06-28T20:54:45.976327+00:00 · methodology
