Next-Billion AI Index: The compass for AI utility and adoption in the global majority

Ambrish Rawat; Claudio Pinhanez; Jessica He; Kush R. Varshney; Rahul Gupta; Rumman Chowdhury; Satyapriya Krishna; Subhabrata Majumdar; Yann Le Beux

arxiv: 2606.00359 · v1 · pith:6ZCRSFO4new · submitted 2026-05-29 · 💻 cs.CY

Pith reviewed 2026-06-28 19:39 UTC · model grok-4.3

classification 💻 cs.CY

keywords AI adoption · next billion users · diagnostic framework · economic viability · operational deployability · governance alignment · generative AI evaluation · emerging markets

The pith

The Next Billion AI Index treats economic viability, operational deployability, and governance alignment as equal factors in judging whether AI systems can be usefully adopted in next-billion-user settings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Generative AI evaluations currently emphasize frontier capabilities but often miss whether systems can be sustained, adapted, and trusted in infrastructure-limited and locally grounded environments. The paper introduces the Next Billion AI Index to fill this gap by defining AI utility through three co-equal themes rather than a single performance score. Ten dimensions assess economic viability, practical deployment under constraints, and alignment with local needs and practices. Rubrics rate performance as weak, moderate, or strong on each dimension. Formative interviews with eleven practitioners building AI for these markets indicate the index helps surface adoption trade-offs around cost, usability, reliability, and trust.

Core claim

The paper presents nexbax as, in the authors' view, the first diagnostic framework to operationalize the preconditions for useful AI in next-billion-user contexts. It organizes ten dimensions under three themes (Effective Efficiency, Operational Practicality, and Societal Integrity), each supplied with explicit rubrics. A formative evaluation of eleven semi-structured interviews with founders, developers, and technical practitioners suggests the index is useful for reasoning about adoption trade-offs, while remaining a diagnostic rather than a universal score of social value.

What carries the argument

The Next Billion AI Index (nexbax), a set of ten dimensions grouped under three themes that together evaluate whether an AI system meets the economic, operational, and societal conditions required for sustainable adoption.

If this is right

AI utility assessments must treat cost, infrastructure fit, and local governance requirements as co-equal with technical performance.
Explicit rubrics for weak, moderate, and strong performance on each dimension make adoption trade-offs visible to practitioners.
The index can guide development choices by highlighting properties such as reliability and trust that shape real-world uptake.
The framework distinguishes artificial useful intelligence from raw capability by focusing on deployability preconditions.
It is positioned as a tool for making inclusive AI deployment more viable rather than as a final social-value metric.
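
The rubric structure described above can be sketched as a small data model. This is a minimal illustration rather than the authors' implementation: the three theme names and the weak/moderate/strong scale come from the paper, but this page does not list the ten dimensions, so the dimension labels below are placeholders, and the names `Rating`, `DimensionRating`, `theme_profile`, and `weak_spots` are hypothetical. The sketch keeps a separate profile per theme instead of collapsing everything into one number, in line with the paper's positioning of the index as a diagnostic rather than a single score.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Rating(Enum):
    """Rubric levels named in the paper."""
    WEAK = "weak"
    MODERATE = "moderate"
    STRONG = "strong"


# Theme names as given in the paper.
THEMES = ("Effective Efficiency", "Operational Practicality", "Societal Integrity")


@dataclass(frozen=True)
class DimensionRating:
    theme: str
    dimension: str  # placeholder label; the real dimension names are not listed here
    rating: Rating


def theme_profile(ratings):
    """Count weak/moderate/strong ratings per theme, without merging themes."""
    profile = {theme: Counter() for theme in THEMES}
    for r in ratings:
        if r.theme not in profile:
            raise ValueError(f"unknown theme: {r.theme}")
        profile[r.theme][r.rating] += 1
    return profile


def weak_spots(ratings):
    """List (theme, dimension) pairs rated weak, i.e. the visible trade-offs."""
    return [(r.theme, r.dimension) for r in ratings if r.rating is Rating.WEAK]


# Illustrative ratings for an imaginary system (not data from the paper).
example = [
    DimensionRating("Effective Efficiency", "dimension A", Rating.STRONG),
    DimensionRating("Operational Practicality", "dimension B", Rating.WEAK),
    DimensionRating("Societal Integrity", "dimension C", Rating.MODERATE),
]
profile = theme_profile(example)
flagged = weak_spots(example)
```

Reporting `flagged` alongside `profile` is one way to surface which theme a weak rating falls under, so that a strong result on one theme does not mask a weak result on another.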

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The index could be extended with domain-specific evidence requirements to increase its precision in sectors such as health or agriculture.
Broader validation involving end users and policymakers in addition to developers would test whether the current dimensions capture all relevant local constraints.
Integration with existing capability benchmarks might produce hybrid evaluation protocols that balance performance and adoption readiness.
Repeated application across different regions could reveal whether certain dimensions carry different weights depending on local infrastructure levels.

Load-bearing premise

The ten dimensions, their rubrics, and the results of eleven expert interviews are sufficient to establish the index as a useful diagnostic for the preconditions of AI adoption.

What would settle it

A field study that measures whether teams using the index reach different or more accurate conclusions about which AI systems will achieve sustained adoption in next-billion contexts, compared with teams relying on capability benchmarks alone.

Original abstract

Generative AI assessments remain dominated by frontier capability benchmarks that often fail to capture whether systems can be sustainably deployed, adapted, and trusted in locally grounded and infrastructure-constrained settings. This paper introduces the Next Billion AI Index (nexbax), which we believe is the first diagnostic framework to treat economic viability, operational deployability, and governance alignment as co-equal determinants of AI utility in next-billion-user contexts. Rather than treating usefulness as a single outcome, nexbax operationalizes the preconditions for useful AI through 10 dimensions organized under three themes: Effective Efficiency, Operational Practicality, and Societal Integrity. These dimensions assess whether systems are economically viable, deployable under infrastructure and workflow constraints, and aligned with local needs, user expectations, and collaborative development practices. We pair the framework with rubrics for weak, moderate, and strong performance, and conduct a formative expert evaluation through eleven semi-structured interviews with founders, developers, product leaders, and technical practitioners building AI systems for next-billion markets. Participants found the index useful for reasoning about adoption trade-offs and effective at capturing factors shaping real-world AI uptake -- particularly cost, usability, reliability, and trust. They also identified the need for contextual explanations, domain-specific evidence, and broader stakeholder validation. Nexbax is therefore proposed not as a universal score of social value, but as a diagnostic for artificial useful intelligence: a way to make visible the technical, economic, and governance properties that make inclusive AI deployment more viable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

nexbax offers a structured diagnostic for AI in constrained markets, but the 11-interview evaluation leaves the claim that the three themes are co-equal untested.

The paper's core contribution is a new index called nexbax that breaks AI utility into three themes—Effective Efficiency, Operational Practicality, and Societal Integrity—each with specific dimensions and rubrics for weak/moderate/strong performance. It targets the gap between frontier capability tests and what actually matters for deployment in low-infrastructure settings.

The work does a clear job naming the problem: standard benchmarks ignore cost, workflow fit, local trust, and governance realities that determine whether models get used at scale. Defining ten dimensions up front and pairing them with rubrics gives practitioners something concrete to apply, which is more useful than vague calls for "inclusive AI."

The main limitation is the evaluation. The authors ran eleven semi-structured interviews with founders and developers and report that participants found the index helpful for thinking about trade-offs, especially around cost, usability, reliability, and trust. That is fine as a starting point, but the write-up gives no detail on how the dimensions were derived, whether responses were coded against the three themes, or whether feedback was balanced. If most comments clustered on one theme, the claim that the framework treats the three areas as co-equal stays an assertion rather than a result. No larger test or quantitative check is described.

This paper is aimed at teams building or funding AI for next-billion markets who need a shared language for adoption preconditions. Readers already working in those contexts will find the structure easy to try out. It is not yet ready for broad citation because the evidence base is still formative.

I would send it to peer review. The framing is timely and the dimensions are explicit enough that referees can give targeted feedback on both the logic and the validation steps needed next.

Referee Report

2 major / 2 minor

Summary. The paper introduces the Next Billion AI Index (nexbax) as a diagnostic framework for AI utility in next-billion-user contexts. It claims to be the first to treat economic viability, operational deployability, and governance alignment as co-equal determinants, operationalized via 10 dimensions under three themes (Effective Efficiency, Operational Practicality, Societal Integrity), with associated rubrics for weak/moderate/strong performance. The framework is evaluated through a formative study of 11 semi-structured interviews with founders, developers, and practitioners, who reportedly found it useful for adoption trade-offs, particularly around cost, usability, reliability, and trust.

Significance. If the framework can be shown to operationalize the three themes in a balanced way with broader validation, nexbax could address a genuine gap in AI assessment by emphasizing deployability and local alignment over frontier capability benchmarks alone. The approach of pairing dimensions with rubrics is a constructive step toward actionable diagnostics, though the current evidence limits claims of utility.

major comments (2)

[Formative expert evaluation (11 interviews)] The formative evaluation section provides no breakdown of how the 11 interview responses were coded or analyzed with respect to the three themes, nor any details on the derivation process for the 10 dimensions from the stated themes. This makes the central claim that the dimensions operationalize economic viability, operational deployability, and governance alignment as co-equal an untested design assertion rather than a supported outcome of the evaluation.
[Abstract and evaluation description] The abstract asserts that participants found the index 'effective at capturing factors shaping real-world AI uptake' across the themes, but no evidence is presented on whether feedback was balanced across Effective Efficiency, Operational Practicality, and Societal Integrity or whether any theme dominated responses.

minor comments (2)

[Abstract] The claim of being 'the first' diagnostic framework is hedged with 'we believe' but would benefit from a more explicit comparison to related work on AI adoption frameworks in emerging markets.
[Rubrics description] The manuscript would be strengthened by including example applications of the rubrics to specific AI systems to illustrate scoring.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive comments. We address each major comment below and indicate revisions that will be made to improve transparency around the formative evaluation.

Referee: [Formative expert evaluation (11 interviews)] The formative evaluation section provides no breakdown of how the 11 interview responses were coded or analyzed with respect to the three themes, nor any details on the derivation process for the 10 dimensions from the stated themes. This makes the central claim that the dimensions operationalize economic viability, operational deployability, and governance alignment as co-equal an untested design assertion rather than a supported outcome of the evaluation.

Authors: The 10 dimensions were derived via an a priori literature review and iterative expert consultation to ensure coverage of the three themes as co-equal by design; the interviews served as formative validation of relevance rather than a formal test. We agree the manuscript lacks sufficient methodological detail on this process and on how interview notes were thematically reviewed. We will add a methods subsection describing the dimension derivation and the approach to noting theme-related feedback from the semi-structured interviews.

Revision: yes

Referee: [Abstract and evaluation description] The abstract asserts that participants found the index 'effective at capturing factors shaping real-world AI uptake' across the themes, but no evidence is presented on whether feedback was balanced across Effective Efficiency, Operational Practicality, and Societal Integrity or whether any theme dominated responses.

Authors: We agree the current wording implies balanced coverage without supporting detail. Feedback addressed all three themes (e.g., cost under Effective Efficiency, usability/reliability under Operational Practicality, and trust/local alignment under Societal Integrity), but responses were not systematically tallied. We will revise the abstract to avoid implying quantified balance and will expand the evaluation section to characterize the distribution of comments across themes based on our interview notes.

Revision: yes
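
One way the promised tally of comments across themes could look is sketched below. This is an assumption-laden illustration, not the authors' analysis: it presumes each interview comment has already been coded to exactly one theme, the function names `theme_shares` and `dominant_theme` and the 0.5 threshold are hypothetical, and the sample comments are placeholders rather than data from the eleven interviews.

```python
from collections import Counter

# Theme names as given in the paper.
THEMES = ("Effective Efficiency", "Operational Practicality", "Societal Integrity")


def theme_shares(coded_comments):
    """Return each theme's share of coded comments.

    coded_comments: iterable of (participant_id, theme) pairs, assuming
    one theme code per comment.
    """
    counts = Counter()
    for _participant, theme in coded_comments:
        if theme not in THEMES:
            raise ValueError(f"unknown theme: {theme}")
        counts[theme] += 1
    total = sum(counts.values())
    if total == 0:
        return {theme: 0.0 for theme in THEMES}
    return {theme: counts[theme] / total for theme in THEMES}


def dominant_theme(shares, threshold=0.5):
    """Return the theme whose share exceeds the threshold, or None."""
    top = max(shares, key=shares.get)
    return top if shares[top] > threshold else None


# Placeholder coding for illustration only (not from the paper).
placeholder_comments = [
    ("P1", "Effective Efficiency"),
    ("P1", "Operational Practicality"),
    ("P2", "Societal Integrity"),
    ("P3", "Effective Efficiency"),
]
shares = theme_shares(placeholder_comments)
```

A table of `shares` per theme would let readers see directly whether feedback clustered on one theme, which is the question the referee raises.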

Circularity Check

0 steps flagged

No significant circularity detected in framework definition or evaluation

The paper defines nexbax directly by positing three co-equal themes (Effective Efficiency, Operational Practicality, Societal Integrity) and organizing 10 dimensions under them, then supplies rubrics and reports formative feedback from 11 interviews. No equations, parameter fits, predictions, or derivations appear. No self-citations are invoked as load-bearing support for uniqueness or ansatzes. The central claim that the framework treats the three determinants as co-equal is an explicit design choice rather than a result derived from data or prior self-referential work. The interviews are described as validation that the index captures relevant factors, not as a statistical reduction that forces the co-equal structure. This is a standard self-contained framework proposal with no reduction of outputs to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The claim depends on the domain assumption that the three themes capture the key preconditions for AI utility, with the index itself introduced as a new assessment construct without external falsifiable evidence beyond the described interviews.

axioms (1)

domain assumption: Economic viability, operational deployability, and governance alignment are co-equal determinants of AI utility in next-billion-user contexts.
This premise is invoked in the abstract to organize the 10 dimensions and justify the framework.

invented entities (1)

Next Billion AI Index (nexbax): no independent evidence
purpose: Diagnostic framework to assess preconditions for AI utility and adoption.
Newly proposed index without independent external validation or falsifiable predictions outside the 11 interviews.

Showing first 80 references.

[1] [1]

2025 , address =

Human Development Report 2025: A Matter of Choice: People and Possibilities in the Age of AI , institution =. 2025 , address =

2025

[2] [2]

Trust, Attitudes and Use of Artificial Intelligence , year =

[3] [3]

Anthropic Economic Index , year =

[4] [4]

2025 , institution =

Microsoft New Future of Work 2025 , howpublished =. 2025 , institution =

2025

[5] [5]

IBM Chief Scientist Ruchir Puri Makes the Case for Useful Artificial Intelligence , howpublished =

[6] [6]

AI for Viksit Bharat: The Opportunity of Accelerated Economic Growth , institution =

[7] [7]

Continental Artificial Intelligence Strategy , institution =

[8] [8]

and Kyleman, K

Wan, A. and Kyleman, K. and Kapoor, S. and Maslej, N. and Longpre, S. and Xiong, B. and Liang, P. and Bommasani, R. , title =

[9] [9]

Prahalad, C. K. , title =

[10] [10]

Prahalad, C. K. , title =. Journal of Product Innovation Management , year =

[11] [11]

and Muller, M

Borning, A. and Muller, M. , title =. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems , year =

[12] [12]

and Perrault, R

Shoham, Y. and Perrault, R. and Brynjolfsson, E. and Clark, J. , title =

[13] [14]

Government AI Readiness Index , howpublished =

[14] [15]

2026 , howpublished =

IndiaAI Mission , author =. 2026 , howpublished =

2026

[15] [16]

2026 , howpublished =

AI Impact Summit 2026 , author =. 2026 , howpublished =

2026

[16] [17]

and Bhattacharya, S

Chopra, A. and Bhattacharya, S. and Salvador, D. and Paul, A. and Wright, T. and Garg, A. and Ahmad, F. and Schwarze, A. C. and Raskar, R. and Balaprakash, P. , title =

[17] [18]

Readiness Assessment Methodology: A Tool of the Recommendation on the Ethics of Artificial Intelligence , institution =

[18] [19]

2026. Designing Culturally Aligned AI Systems For Social Good in Non-Western Contexts. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems.

[19] [20]

2026. Collectively Reimagining Artificial Intelligence With Marginalized Communities. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems.

[20] [21]

2026. Street Scenes: Public Appliances for GenAI Video in Informal Settlements. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems.

[21] [22]

Resources for Measuring Autonomous AI Capabilities.

[22] [23]

Task-Completion Time Horizons of Frontier AI Models.

[23] [24]

2023. Holistic Evaluation of Language Models. Transactions on Machine Learning Research.

[24] [25]

Roberts, Huw. 2024. Ethics and Information Technology.

[25] [26]

2023.

[26] [27]

Measuring What Matters: Connecting AI Ethics Evaluations to System Attributes, Hazards, and Harms. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society.

[27] [28]

The AI Power Disparity Index: Toward a Compound Measure of AI Actors’ Power to Shape the AI Ecosystem. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society.

[28] [29]

2026. Paza: Introducing automatic speech recognition benchmarks and models for low-resource languages. February 2026.

[29] [30]

2026. PazaBench: A Benchmark for Automatic Speech Recognition on Low Resource Languages.

[30] [31]

2025. Vibhasha: The Multilingual Playbook for Large Language Models.

[31] [32]

2025. Measuring What Matters: Connecting AI Ethics Evaluations to System Attributes, Hazards, and Harms. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. doi:10.1609/aies.v8i3.36706

[32] [33]

When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation. arXiv preprint arXiv:2602.16763.

[33] [37]

2018. Empathy Mapping: The First Step in Design Thinking. January 2018.

[34] [38]

2025. National Artificial Intelligence Strategy (2025–2029).

[35] [39]

2025. Kenya Artificial Intelligence Strategy 2025–2030.

[36] [40]

Adams, R.; Adeleke, F.; Florido, A.; de Magalhães Santos, L. G.; Grossman, N.; Junck, L.; and Stone, K.

[37] [41]

2026. ImpactBench.

[38] [42]

2025. Atlas: A Playbook for Cross Cultural AI.

[39] [43]

2007. The mirage of marketing to the bottom of the pyramid: How the private sector can help alleviate poverty. California Management Review.

[40] [44]

2009. The experimental approach to development economics. Annu. Rev. Econ.

[41] [45]

2026. Poor economics: Rethinking poverty & the ways to end it.

[42] [46]

2025. Exploring AI ethics in global contexts: a culturally responsive, psychologically realist approach. AI and Ethics.

[43] [47]

Mohamed, Shakir; Png, Marie-Therese; and Isaac, William. 2020. Philosophy & Technology.

[44] [48]

Sambasivan, Nithya; Arnesen, Erin; Hutchinson, Ben; Doshi, Tulsee; and Prabhakaran, Vinodkumar. 2021. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency.

[45] [50]

CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring (the Lack of) Cultural Knowledge of LLMs.

[46] [51]

2025. Localizing AI in the global south. doi:10.1038/s42256-025-01057-z

[47] [53]

The PRISM alignment dataset: What participatory, representative and individualised human feedback reveals about the subjective and multicultural alignment of large language models. Advances in Neural Information Processing Systems.

[48] [54]

2025. ISF Voices 2025: Africa’s Playbook.

[49] [55]

Foundation Model Platforms and Bottom-of-the-Pyramid Innovation. In ICLR Workshop on Practical Machine Learning for Developing Countries.

[50] [56]

Amplifying the voice of youth in Africa via text analytics. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

[51] [60]

2024. The Claude 3 model family: Opus, Sonnet, Haiku.

[52] [62]

Open technical problems in open-weight AI model risk management. Transactions on Machine Learning Research.

[53] [65]

2020. MLPerf inference benchmark. In 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).

[54] [66]

Sasha Luccioni, Boris Gamazaychikov, Emma Strubell, Sara Hooker, Yacine Jernite, Margaret Mitchell, and Carole-Jean Wu. 2024.

[55] [68]

2023. DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models.

[56] [69]

Bigcodebench: Benchmarking code generation with diverse function calls and complex instructions. In International Conference on Learning Representations.

[57] [70]

Swe-bench: Can language models resolve real-world github issues? In International Conference on Learning Representations.

[58] [72]

2022. The Flores-101 evaluation benchmark for low-resource and multilingual machine translation. Transactions of the Association for Computational Linguistics.

[59] [73]

Mteb: Massive text embedding benchmark. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics.

[60] [75]

2025. Localizing AI in the global south. Nature Machine Intelligence, 7: 675. Editorial.

[61] [76]

Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F. L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S.; et al. 2023. GPT-4 technical report. arXiv preprint arXiv:2303.08774.

[62] [77]

Adams, R.; Adeleke, F.; Florido, A.; de Magalhães Santos, L. G.; Grossman, N.; Junck, L.; and Stone, K. 2024. Global Index on Responsible AI 2024 (1st Edition). Technical report, South Africa: Global Center on AI Governance.

[63] [78]

Adler, B.; Agarwal, N.; Aithal, A.; Anh, D. H.; Bhattacharya, P.; Brundyn, A.; Casper, J.; Catanzaro, B.; Clay, S.; Cohen, J.; et al. 2024. Nemotron-4 340B technical report. arXiv preprint arXiv:2406.11704.

[64] [79]

African Union. 2024. Continental Artificial Intelligence Strategy. https://au.int/en/documents/20240809/continental-artificial-intelligence-strategy

[65] [80]

Anthropic. 2024. The Claude 3 model family: Opus, Sonnet, Haiku. Technical report, Anthropic.

[66] [81]

Anthropic. 2026. Anthropic Economic Index. https://www.anthropic.com/research/economic-index-primitives

[67] [82]

Bailey, G.; Kalarikalayil Raju, D.; Pearson, J.; Robinson, S.; and Jones, M. 2026. Street Scenes: Public Appliances for GenAI Video in Informal Settlements. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems, 1–22.

[68] [83]

Banerjee, A. V.; and Duflo, E. 2009. The experimental approach to development economics. Annu. Rev. Econ., 1(1): 151–178.

[69] [84]

Banerjee, A. V.; and Duflo, E. 2026. Poor economics: Rethinking poverty & the ways to end it. Penguin Random House India Private Limited.

[70] [85]

Belcak, P.; Heinrich, G.; Diao, S.; Fu, Y.; Dong, X.; Muralidharan, S.; Lin, Y. C.; and Molchanov, P. 2025. Small language models are the future of agentic AI. arXiv preprint arXiv:2506.02153.

[71] [86]

Borning, A.; and Muller, M. 2012. Next Steps for Value Sensitive Design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.

[72] [87]

Casper, S.; O'Brien, K.; Longpre, S.; Seger, E.; Klyman, K.; Bommasani, R.; Nrusimha, A.; Shumailov, I.; Mindermann, S.; Basart, S.; et al. 2025. Open technical problems in open-weight AI model risk management. Transactions on Machine Learning Research.

[73] [88]

Chandiramani, A.; Blakeman, A.; Olaoye, A.; Gupta, A.; Somasamudramath, A.; Khattar, A.; Adesoba, A.; Renduchintala, A.; Asif, A.; Agrawal, A.; et al. 2026. Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning. arXiv preprint arXiv:2604.12374.

[74] [89]

Chiu, Y. Y.; Jiang, L.; Lin, B. Y.; Park, C. Y.; Li, S. S.; Ravi, S.; Bhatia, M.; Antoniak, M.; Tsvetkov, Y.; Shwartz, V.; et al. 2024. CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring (the Lack of) Cultural Knowledge of LLMs.

[75] [90]

Clancy, R. F.; Zhu, Q.; and Majumdar, S. 2025. Exploring AI ethics in global contexts: a culturally responsive, psychologically realist approach. AI and Ethics, 5(6): 6329–6338.

[76] [91]

Eiras, F.; Petrov, A.; Vidgen, B.; Schroeder, C.; Pizzati, F.; Elkins, K.; Mukhopadhyay, S.; Bibi, A.; Purewal, A.; Botos, C.; et al. 2024. Risks and opportunities of open-source generative AI. arXiv preprint arXiv:2405.08597.

[77] [92]

Enevoldsen, K.; Chung, I.; Kerboua, I.; Kardos, M.; Mathur, A.; Stap, D.; Gala, J.; Siblini, W.; Krzemiński, D.; Winata, G. I.; et al. 2025. MMTEB: Massive multilingual text embedding benchmark. arXiv preprint arXiv:2502.13595.

[78] [93]

FMCIDE. 2025. National Artificial Intelligence Strategy (2025–2029). https://fmcide.gov.ng/initiative/nais/

[79] [94]

Ghosh, S.; Frase, H.; Williams, A.; Luger, S.; Röttger, P.; Barez, F.; McGregor, S.; Fricklas, K.; Kumar, M.; Bollacker, K.; et al. 2025. AILuminate: Introducing v1.0 of the AI risk and reliability benchmark from MLCommons. arXiv preprint arXiv:2503.05731.

[80] [95]

Gibbons, S. 2018. Empathy Mapping: The First Step in Design Thinking. Nielsen Norman Group.