AI Tokenomics: The Economics of Tokens, Computation, and Pricing in Foundation Models

Quanyan Zhu

arXiv: 2606.24616 · v1 · pith:I7EFM66Unew · submitted 2026-06-10 · cs.AI · cs.PF · econ.GN · q-fin.EC

Pith reviewed 2026-06-27 09:57 UTC · model grok-4.3

classification cs.AI · cs.PF · econ.GN · q-fin.EC

keywords AI tokenomics · foundation models · token pricing · economic value · computation costs · workflow optimization · resource allocation

The pith

Token expenditure in foundation models does not equal economic value because productivity, workflow position, and downstream effects determine worth.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework called AI tokenomics to study how tokens function as the basic accounting unit across information processing, computation, energy use, and pricing in large AI systems. It links token-level technical costs to higher-level enterprise resource decisions and market questions. The central distinction is that spending tokens does not automatically create value; value instead arises from marginal productivity, where in the workflow the tokens are used, any hidden reasoning steps, associated risks, and how effects propagate downstream. A reader would care because current pricing and allocation practices often treat token counts as direct proxies for worth, which this view challenges. The paper ends by listing open questions in measurement and dynamic allocation that follow from treating tokens this way.

Core claim

Tokens serve as the practical accounting unit that connects information processing, computation, memory, energy, pricing, and economic value in foundation model services. The framework shows token expenditure and economic value are distinct quantities: value depends on marginal productivity, workflow position, hidden reasoning activity, risk, and downstream propagation effects rather than raw token counts alone.

What carries the argument

The AI tokenomics framework that treats tokens as the linking unit between technical costs and workflow-level production functions.

If this is right

Pricing mechanisms for foundation model services should incorporate marginal productivity and workflow position rather than token volume alone.
Enterprise resource allocation needs separate instrumentation for hidden reasoning activity and downstream propagation effects.
Market design for token-based AI services must account for risk and uncertainty in value realization.
Measurement methods must expand beyond visible token counts to capture hidden activity.
Optimization of token use requires dynamic allocation rules tied to productivity estimates.
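The last point, allocation rules tied to productivity estimates, can be made concrete with a small sketch. This is an editorial illustration, not a method from the paper: the concave value curve, the stage names, and every parameter value are hypothetical. The sketch hands out a fixed token budget in chunks, each time to the workflow stage whose next chunk has the highest estimated marginal value.

```python
import heapq
import math


def stage_value(scale, saturation, tokens):
    """Hypothetical concave value curve: more tokens help, with diminishing returns."""
    return scale * math.log1p(tokens / saturation)


def allocate(stages, budget, chunk=100):
    """Greedily split `budget` tokens across stages in `chunk`-sized steps.

    `stages` maps a stage name to (scale, saturation), the assumed parameters
    of that stage's value curve. Each step goes to the stage whose next chunk
    adds the most estimated value.
    """
    alloc = {name: 0 for name in stages}

    def gain(name):
        scale, saturation = stages[name]
        t = alloc[name]
        return stage_value(scale, saturation, t + chunk) - stage_value(scale, saturation, t)

    # heapq is a min-heap, so gains are negated to pop the largest first.
    heap = [(-gain(name), name) for name in stages]
    heapq.heapify(heap)

    remaining = budget
    while remaining >= chunk and heap:
        _, name = heapq.heappop(heap)
        alloc[name] += chunk
        remaining -= chunk
        # Only the popped stage changed, so only its gain needs refreshing.
        heapq.heappush(heap, (-gain(name), name))
    return alloc


if __name__ == "__main__":
    # Illustrative numbers only; none of these come from the paper.
    example_stages = {
        "planning": (5.0, 200.0),
        "retrieval": (3.0, 300.0),
        "final_answer": (2.0, 100.0),
    }
    print(allocate(example_stages, budget=2000))
```

Because the assumed curves are concave, the chunk-by-chunk greedy choice is the best split at that chunk size. With real workloads the curves would have to be estimated, which is the calibration problem the paper lists as open.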

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Enterprises could test the framework by running identical prompts at different workflow stages and comparing output value metrics against token spend.
If value depends on propagation effects, token markets might develop secondary trading for high-productivity positions in multi-agent workflows.
Calibration of productivity functions could lead to new contracts where payment is based on verified downstream outcomes rather than upfront token budgets.
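The first extension, comparing token spend against output value by workflow stage, reduces to a simple aggregation once runs are logged. The sketch below is a minimal version under stated assumptions: the `Run` record, its field names, and the idea of a single numeric value score per run are all hypothetical, and how an enterprise would actually score value is left open.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Run:
    stage: str    # workflow stage where the prompt was executed
    tokens: int   # tokens billed for the run
    value: float  # outcome score in whatever unit the enterprise uses (assumed)


def value_per_1k_tokens(runs):
    """Return, for each stage, total value divided by total tokens, per 1,000 tokens."""
    tokens = defaultdict(int)
    value = defaultdict(float)
    for run in runs:
        tokens[run.stage] += run.tokens
        value[run.stage] += run.value
    # Stages with zero recorded tokens are skipped to avoid dividing by zero.
    return {stage: 1000.0 * value[stage] / tokens[stage]
            for stage in tokens if tokens[stage] > 0}


if __name__ == "__main__":
    # Made-up log entries for the same prompt run at two stages.
    log = [
        Run("planning", 600, 5.0),
        Run("planning", 400, 3.0),
        Run("drafting", 1000, 2.0),
    ]
    print(value_per_1k_tokens(log))
```

If the ratios differ sharply across stages for the same prompt, that is the kind of divergence between spend and value the framework predicts.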

Load-bearing premise

Tokens can serve as a sufficient accounting unit that directly connects information processing, computation, energy, pricing, and economic value across AI systems.

What would settle it

An empirical study that measures value created by different tasks and finds it correlates almost perfectly with raw token expenditure regardless of workflow position or hidden reasoning steps.
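One way such a study might summarize its result is a correlation between per-task token spend and measured value. The sketch below computes a Pearson coefficient by hand; the function names and the 0.95 cut-off for "almost perfectly" are assumptions made for illustration, not thresholds from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def tokens_track_value(tokens, values, threshold=0.95):
    """True if token spend and value are correlated above the assumed threshold."""
    return pearson(tokens, values) >= threshold


if __name__ == "__main__":
    # Made-up measurements: the second task is cheap but valuable.
    spend = [1200, 300, 2500, 800]
    worth = [4.0, 9.0, 5.0, 1.0]
    print(pearson(spend, worth), tokens_track_value(spend, worth))
```

A high coefficient that held across workflow positions and with hidden reasoning tokens included would count against the paper's distinction. A linear correlation is only one possible test, and a low value would not by itself identify which of the listed factors drives the gap.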

Original abstract

Tokens have become the practical accounting unit for modern foundation model services, linking information processing, computation, memory use, energy expenditure, pricing, and economic value. This paper develops a framework for AI tokenomics: the study of how tokens are generated, consumed, priced, allocated, and optimized across AI systems. We connect token-level technical costs to workflow-level production functions, enterprise resource allocation, measurement and instrumentation methods, and emerging market-design questions. The framework shows that token expenditure and economic value are distinct: value depends on marginal productivity, workflow position, hidden reasoning activity, risk, and downstream propagation effects. The paper concludes by identifying open research directions in hidden-token measurement, empirical calibration, token productivity, dynamic allocation, and token-based markets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a high-level conceptual framework paper that names 'AI tokenomics' and links tokens to economic concepts but offers no new derivations, models, or data.

The paper's main contribution is organizing existing ideas around tokens as an accounting unit that ties together computation costs, energy use, pricing, and value creation in foundation models. It flags that token expenditure does not equal economic value, since value also depends on marginal productivity, workflow placement, hidden reasoning steps, and downstream effects. It then lists open questions in measurement, calibration, and market design.

What works is the clean mapping from technical token mechanics to production-function style thinking and enterprise allocation. This could give people working on AI service pricing a shared vocabulary and a list of directions worth pursuing.

The limitation is that the piece stays definitional. No equations are derived, no empirical patterns are shown, and the central distinction between expenditure and value is asserted rather than demonstrated with a model or measurement. The abstract (and the limited view available) gives no indication that the full text moves past synthesis.

This is for readers already following AI economics who want a quick map of the terrain. It does not contain the formal grounding or evidence needed to make a strong case for peer review. I would not send it out unless the full manuscript adds concrete analysis or results that are not visible here.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes a conceptual framework for 'AI tokenomics' that positions tokens as the unifying accounting unit linking information processing, computation, memory, energy, pricing, and economic value in foundation models. It connects token-level technical costs to workflow-level production functions and enterprise allocation, and asserts that token expenditure and economic value are distinct, with value depending on marginal productivity, workflow position, hidden reasoning activity, risk, and downstream propagation effects. The paper concludes by listing open research directions in measurement, calibration, productivity, allocation, and markets.

Significance. If substantiated with formal models or data, the framework could help organize thinking about resource measurement and optimization in AI services by bridging technical costs and economic value. As presented, its primary contribution is conceptual mapping rather than new derivations, predictions, or measurements.

major comments (1)

[Abstract] The central claim that 'token expenditure and economic value are distinct' (with value depending on marginal productivity, hidden reasoning, etc.) is asserted at a definitional level but is not supported by any production function, formal model, illustrative calculation, or empirical example showing how these factors cause value to diverge from token counts.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. The manuscript is a conceptual framework paper whose primary contribution is organizing concepts at the intersection of technical costs and economic value in foundation models. We address the major comment below and will revise accordingly.

Referee: [Abstract] The central claim that 'token expenditure and economic value are distinct' (with value depending on marginal productivity, hidden reasoning, etc.) is asserted at a definitional level but is not supported by any production function, formal model, illustrative calculation, or empirical example showing how these factors cause value to diverge from token counts.

Authors: We agree that the paper is conceptual rather than providing formal derivations or data. The distinction is introduced as a definitional feature of the tokenomics framework, following directly from the workflow-level production function and allocation concepts developed in the body (linking token costs to marginal productivity, hidden activity, risk, and propagation). No production function, calculation, or example is supplied. In revision we will add a short illustrative example in the abstract and main text demonstrating divergence (e.g., a workflow where hidden reasoning tokens yield higher downstream value than raw token count would predict).

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; definitional framework only

The paper advances a conceptual framework mapping token usage to economic ideas (production functions, marginal productivity, hidden reasoning) without any equations, fitted parameters, quantitative predictions, or theorems. The central distinction between token expenditure and economic value is asserted at the definitional level rather than derived. No self-citations, ansatzes, or uniqueness claims appear as load-bearing elements. The derivation chain is empty by inspection; the work is self-contained as an exploratory proposal with no internal reductions to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review based solely on abstract; no specific free parameters, axioms, or invented entities are detailed in the provided text.

pith-pipeline@v0.9.1-grok · 5655 in / 975 out tokens · 14236 ms · 2026-06-27T09:57:39.220032+00:00 · methodology

Reference graph

Works this paper leans on

66 extracted references · 10 canonical work pages · 5 internal anchors

[1]

Acemoglu

D. Acemoglu. Harms of AI. Working Paper 29247, National Bureau of Economic Research,
[2]

Akash network: Decentralized cloud infrastructure.https://akash

Akash Network. Akash network: Decentralized cloud infrastructure.https://akash. network/whitepapers/, 2020. (Cited on p. 19)

2020
[3]

M. A. Al Bari and Q. Zhu. A gestalt game-theoretic framework for designing agentic AI workflows in cyber deception. InInternational Conference on Game Theory and AI for Security, pages 228–248, Cham, 2025. Springer Nature Switzerland. (Cited on pp. 20, 21, 22)

2025
[4]

Qwen API pricing.https://help.aliyun.com, 2026

Alibaba Cloud. Qwen API pricing.https://help.aliyun.com, 2026. Accessed: 2026-06-

2026
[5]

Claude API pricing.https://platform.claude.com/docs/en/ about-claude/pricing, 2026

Anthropic. Claude API pricing.https://platform.claude.com/docs/en/ about-claude/pricing, 2026. Accessed: 2026-06-09. (Cited on pp. 2, 5, 12, 13, 14)

2026
[6]

Context windows.https://platform.claude.com/docs/en/ build-with-claude/context-windows, 2026

Anthropic. Context windows.https://platform.claude.com/docs/en/ build-with-claude/context-windows, 2026. Accessed 2026-06-09. (Cited on pp. 10, 11)

2026
[7]

Plans and pricing.https://claude.com/pricing, 2026

Anthropic. Plans and pricing.https://claude.com/pricing, 2026. Accessed 2026-06-09. (Cited on p. 15)

2026
[8]

L. Bai, Z. Huang, X. Wang, J. Sun, R. Mihalcea, E. Brynjolfsson, A. Pentland, and J. Pei. How do AI agents spend your money? analyzing and predicting token consumption in agentic coding tasks.arXiv preprint arXiv:2604.22750, 2026. (Cited on pp. 10, 11, 12)

work page internal anchor Pith review Pith/arXiv arXiv 2026
[9]

Eliciting Latent Predictions from Transformers with the Tuned Lens

N. Belrose, Z. Furman, L. Smith, D. Halawi, I. Ostrovsky, L. McKinney, S. Biderman, and J. Steinhardt. Eliciting latent predictions from transformers with the tuned lens.arXiv preprint arXiv:2303.08112, 2023. (Cited on p. 16) 35

work page internal anchor Pith review Pith/arXiv arXiv 2023
[10]

LLM pricing comparison dashboard.https://benchlm.ai/llm-pricing,

BenchLM. LLM pricing comparison dashboard.https://benchlm.ai/llm-pricing,
[11]

(Cited on p

Accessed: 2026-06-09. (Cited on p. 13)

2026
[12]

D. P . Bertsekas.Network Optimization: Continuous and Discrete Models. Athena Scientific,
[13]

(Cited on pp. 22, 24)
[14]

T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P . Dhariwal, A. Neelakantan, P . Shyam, G. Sastry, A. Askell, et al. Language models are few-shot learners. InAdvances in Neural Information Processing Systems, volume 33, pages 1877–1901, 2020. (Cited on pp. 1, 7, 14)

1901
[15]

Z. S. Chen and Q. Zhu. A theory of multilevel interactive equilibrium in NeuroAI.arXiv preprint arXiv:2605.10505, 2026. (Cited on p. 19)

work page internal anchor Pith review Pith/arXiv arXiv 2026
[16]

LLM API cost comparison.https://costgoat.com/compare/llm-api, 2026

CostGoat. LLM API cost comparison.https://costgoat.com/compare/llm-api, 2026. Accessed: 2026-06-09. (Cited on p. 13)

2026
[17]

T. Dao, D. Y. Fu, S. Ermon, A. Rudra, and C. Ré. Flashattention: Fast and memory-efficient exact attention with io-awareness. InAdvances in Neural Information Processing Systems, volume 35, pages 16344–16359, 2022. (Cited on pp. 8, 16)

2022
[18]

Deepseek API pricing.https://platform.deepseek.com, 2026

DeepSeek AI. Deepseek API pricing.https://platform.deepseek.com, 2026. Accessed: 2026-06-09. (Cited on p. 13)

2026
[19]

Elhage, N

N. Elhage, N. Nanda, C. Olsson, T. Henighan, N. Joseph, B. Mann, A. Askell, Y. Bai, A. Chen, T. Conerly, et al. A mathematical framework for transformer circuits.https: //transformer-circuits.pub/2021/framework/index.html, 2021. Transformer Circuits Thread. (Cited on p. 16)

2021
[20]

Ghodsi, M

A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica. Dominant re- source fairness: Fair allocation of multiple resource types. InProceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation, pages 323–336, 2011. (Cited on pp. 18, 19, 22)

2011
[21]

Gemini enterprise agent platform.https://cloud.google.com/ products/gemini-enterprise-agent-platform, 2026

Google Cloud. Gemini enterprise agent platform.https://cloud.google.com/ products/gemini-enterprise-agent-platform, 2026. Accessed 2026-06-09. (Cited on p. 15)

2026
[22]

Gemini API pricing.https://ai.google.dev/gemini-api/docs/ pricing, 2026

Google DeepMind. Gemini API pricing.https://ai.google.dev/gemini-api/docs/ pricing, 2026. Accessed: 2026-06-09. (Cited on pp. 2, 5, 12, 13, 14) 36

2026
[23]

Hoffmann, S

J. Hoffmann, S. Borgeaud, A. Mensch, E. Buchatskaya, T. Cai, E. Rutherford, D. de Las Casas, L. A. Hendricks, J. Welbl, A. Clark, et al. Training compute-optimal large language models. InAdvances in Neural Information Processing Systems, volume 35, pages 30016–30030, 2022. (Cited on pp. 4, 5, 9, 16, 23)

2022
[24]

Scaling Laws for Neural Language Models

J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Rad- ford, J. Wu, and D. Amodei. Scaling laws for neural language models.arXiv preprint arXiv:2001.08361, 2020. (Cited on pp. 4, 5, 9, 16, 23)

work page internal anchor Pith review Pith/arXiv arXiv 2001
[25]

F. P . Kelly, A. K. Maulloo, and D. K. H. Tan. Rate control for communication networks: Shadow prices, proportional fairness and stability.Journal of the Operational Research Soci- ety, 49(3):237–252, 1998. (Cited on pp. 18, 19, 22, 24)

1998
[26]

Kudo and J

T. Kudo and J. Richardson. Sentencepiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. InProceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 66–71,

2018
[27]

3, 5, 7)

(Cited on pp. 3, 5, 7)
[28]

ChatOpenAI integration.https://docs.langchain.com/oss/python/ integrations/chat/openai, 2026

LangChain. ChatOpenAI integration.https://docs.langchain.com/oss/python/ integrations/chat/openai, 2026. Accessed 2026-06-09. (Cited on pp. 10, 11)

2026
[29]

Lewis, E

P . Lewis, E. Perez, A. Piktus, F. Petroni, V . Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, and S. Riedel. Retrieval-augmented generation for knowledge- intensive NLP tasks. InAdvances in Neural Information Processing Systems, volume 33, pages 9459–9474, 2020. (Cited on pp. 7, 23)

2020
[30]

Merizzi, T

N. Merizzi, T. Smith, D. Kearns-Manolatos, N. Mittal, and G. Churiwala. The pivot to tokenomics: Navigating AI’s new spend dynamics. Industry report, Deloitte, Jan. 2026. Accessed January 2026. (Cited on p. 2)

2026
[31]

Llama models.https://ai.meta.com/llama, 2026

Meta AI. Llama models.https://ai.meta.com/llama, 2026. Accessed: 2026-06-09. (Cited on p. 13)

2026
[32]

Mistral API pricing.https://docs.mistral.ai, 2026

Mistral AI. Mistral API pricing.https://docs.mistral.ai, 2026. Accessed: 2026-06-09. (Cited on p. 13)

2026
[33]

R. B. Myerson. Optimal auction design.Mathematics of Operations Research, 6(1):58–73,
[34]

17, 18, 20) 37

(Cited on pp. 17, 18, 20) 37
[35]

Nisan, T

N. Nisan, T. Roughgarden, E. Tardos, and V . V . Vazirani, editors.Algorithmic Game Theory. Cambridge University Press, 2007. (Cited on pp. 17, 18, 20)

2007
[36]

Interpreting GPT: The logit lens.https://www.alignmentforum.org/ posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens, 2020

Nostalgebraist. Interpreting GPT: The logit lens.https://www.alignmentforum.org/ posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens, 2020. Alignment Fo- rum. (Cited on p. 16)

2020
[37]

API pricing.https://openai.com/api/pricing/, 2026

OpenAI. API pricing.https://openai.com/api/pricing/, 2026. Accessed: 2026-06-09. (Cited on pp. 2, 5, 12, 13, 14)

2026
[38]

Prompt engineering.https://developers.openai.com/api/docs/guides/ prompt-engineering, 2026

OpenAI. Prompt engineering.https://developers.openai.com/api/docs/guides/ prompt-engineering, 2026. Accessed 2026-06-09. (Cited on p. 10)

2026
[39]

J. S. Park, J. C. O’Brien, C. J. Cai, M. R. Morris, P . Liang, and M. S. Bernstein. Generative agents: Interactive simulacra of human behavior. InProceedings of the 36th Annual ACM Symposium on User Interface Software and T echnology, pages 1–22, 2023. (Cited on p. 19)

2023
[40]

Filecoin: A decentralized storage network.https://filecoin.io/ filecoin.pdf, 2017

Protocol Labs. Filecoin: A decentralized storage network.https://filecoin.io/ filecoin.pdf, 2017. (Cited on p. 19)

2017
[41]

Radford, J

A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever. Language models are unsupervised multitask learners.OpenAI T echnical Report, 2019. (Cited on p. 1)

2019
[42]

V . J. Reddi, C. Cheng, D. Kanter, P . Mattson, G. Schmuelling, C.-J. Wu, B. Anderson, M. Breughe, M. Charlebois, W. Chou, et al. MLPerf inference benchmark. In2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture, pages 446–459,
[43]

Rochet and J

J.-C. Rochet and J. Tirole. Platform competition in two-sided markets.Journal of the Euro- pean Economic Association, 1(4):990–1029, 2003. (Cited on p. 20)

2003
[44]

Schick, J

T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, E. Hambro, L. Zettlemoyer, N. Cancedda, and T. Scialom. Toolformer: Language models can teach themselves to use tools. InAdvances in Neural Information Processing Systems, volume 36, pages 68539–68551,
[45]

4, 10, 15, 19)

(Cited on pp. 4, 10, 15, 19)
[46]

Sennrich, B

R. Sennrich, B. Haddow, and A. Birch. Neural machine translation of rare words with subword units. InProceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pages 1715–1725, 2016. (Cited on pp. 3, 5, 7) 38

2016
[47]

Shoham and K

Y. Shoham and K. Leyton-Brown.Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press, 2008. (Cited on p. 19)

2008
[48]

Strubell, A

E. Strubell, A. Ganesh, and A. McCallum. Energy and policy considerations for deep learning in NLP. InProceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3645–3650, 2019. (Cited on p. 17)

2019
[49]

Topcuoglu, S

H. Topcuoglu, S. Hariri, and M.-Y. Wu. Performance-effective and low-complexity task scheduling for heterogeneous computing.IEEE T ransactions on Parallel and Distributed Sys- tems, 13(3):260–274, 2002. (Cited on p. 22)

2002
[50]

Vaswani, N

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. Attention is all you need. InAdvances in Neural Information Processing Sys- tems, volume 30, 2017. (Cited on pp. 1, 8, 14, 16)

2017
[51]

X. Wang, J. Wei, D. Schuurmans, Q. Le, E. Chi, S. Narang, A. Chowdhery, and D. Zhou. Self-consistency improves chain of thought reasoning in language models. InInternational Conference on Learning Representations, 2023. (Cited on pp. 8, 10, 16)

2023
[52]

J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. H. Chi, Q. V . Le, and D. Zhou. Chain-of-thought prompting elicits reasoning in large language models. InAd- vances in Neural Information Processing Systems, volume 35, pages 24824–24837, 2022. (Cited on pp. 8, 10, 16)

2022
[53]

Wooldridge.An Introduction to MultiAgent Systems

M. Wooldridge.An Introduction to MultiAgent Systems. Wiley, 2 edition, 2009. (Cited on p. 19)

2009
[54]

C.-J. Wu, R. Raghavendra, U. Gupta, B. Acun, N. Ardalani, K. Maeng, G. Chang, F. A. Behram, J. Huang, C. Bai, et al. Sustainable AI: Environmental implications, challenges and opportunities. InProceedings of Machine Learning and Systems, volume 4, pages 795– 813, 2022. (Cited on p. 17)

2022
[55]

Grok API pricing.https://docs.x.ai, 2026

xAI. Grok API pricing.https://docs.x.ai, 2026. Accessed: 2026-06-09. (Cited on pp. 2, 5, 12, 13)

2026
[56]

Yang and Q

Y.-T. Yang and Q. Zhu. PACT: A contract-theoretic framework for pricing agentic AI ser- vices powered by large language models. InProceedings of the 2025 IEEE Global Communi- cations Conference (GLOBECOM), Taipei, Taiwan, 2025. IEEE. (Cited on pp. 15, 34) 39

2025
[57]

Yang and Q

Y.-T. Yang and Q. Zhu. Internet of agentic AI: Incentive-compatible distributed teaming and workflow.arXiv preprint arXiv:2602.03145, 2026. (Cited on pp. 19, 22)

work page arXiv 2026
[58]

S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y. Cao. ReAct: Synergizing reasoning and acting in language models. InInternational Conference on Learning Represen- tations, 2023. (Cited on pp. 4, 10, 15, 19)

2023
[59]

Glm API pricing.https://open.bigmodel.cn, 2026

Zhipu AI. Glm API pricing.https://open.bigmodel.cn, 2026. Accessed: 2026-06-09. (Cited on p. 13)

2026
[60]

Q. Zhu. Foundations of cyber resilience: The confluence of game, control, and learning theories.arXiv preprint arXiv:2404.01205, 2024. (Cited on pp. 19, 20)

work page arXiv 2024
[61]

Q. Zhu. Game theory meets LLM and agentic AI: Reimagining cybersecurity for the age of intelligent threats.arXiv preprint arXiv:2507.10621, 2025. (Cited on pp. 20, 21)

work page arXiv 2025
[62]

Q. Zhu. Insurance of agentic AI.arXiv preprint arXiv:2606.05449, 2026. (Cited on p. 25)

work page internal anchor Pith review Pith/arXiv arXiv 2026
[63]

Zhu and M

Q. Zhu and M. A. Al Bari. Agentic AI for cyber deception: A gestalt game-theoretic ap- proach to defending against botnet DDoS attacks. InProceedings of the 59th Hawaii Interna- tional Conference on System Sciences, 2026. (Cited on pp. 20, 21, 22)

2026
[64]

Zhu and T

Q. Zhu and T. Ba¸ sar. Revisiting game-theoretic control in socio-technical networks: Emerg- ing design frameworks and contemporary applications.arXiv preprint arXiv:2411.01794,

work page arXiv
[65]

19, 20, 22)

(Cited on pp. 19, 20, 22)
[66]

Zhu and Z

Q. Zhu and Z. Han. Learning, misspecification, and cognitive arbitrage in linear-quadratic network games.arXiv preprint arXiv:2603.17157, 2026. (Cited on pp. 19, 22) 40

work page arXiv 2026

[1] [1]

Acemoglu

D. Acemoglu. Harms of AI. Working Paper 29247, National Bureau of Economic Research,

[2] [2]

Akash network: Decentralized cloud infrastructure.https://akash

Akash Network. Akash network: Decentralized cloud infrastructure.https://akash. network/whitepapers/, 2020. (Cited on p. 19)

2020

[3] [3]

M. A. Al Bari and Q. Zhu. A gestalt game-theoretic framework for designing agentic AI workflows in cyber deception. InInternational Conference on Game Theory and AI for Security, pages 228–248, Cham, 2025. Springer Nature Switzerland. (Cited on pp. 20, 21, 22)

2025

[4] [4]

Qwen API pricing.https://help.aliyun.com, 2026

Alibaba Cloud. Qwen API pricing.https://help.aliyun.com, 2026. Accessed: 2026-06-

2026

[5] [5]

Claude API pricing.https://platform.claude.com/docs/en/ about-claude/pricing, 2026

Anthropic. Claude API pricing.https://platform.claude.com/docs/en/ about-claude/pricing, 2026. Accessed: 2026-06-09. (Cited on pp. 2, 5, 12, 13, 14)

2026

[6] [6]

Context windows.https://platform.claude.com/docs/en/ build-with-claude/context-windows, 2026

Anthropic. Context windows.https://platform.claude.com/docs/en/ build-with-claude/context-windows, 2026. Accessed 2026-06-09. (Cited on pp. 10, 11)

2026

[7] [7]

Plans and pricing.https://claude.com/pricing, 2026

Anthropic. Plans and pricing.https://claude.com/pricing, 2026. Accessed 2026-06-09. (Cited on p. 15)

2026

[8] [8]

L. Bai, Z. Huang, X. Wang, J. Sun, R. Mihalcea, E. Brynjolfsson, A. Pentland, and J. Pei. How do AI agents spend your money? analyzing and predicting token consumption in agentic coding tasks.arXiv preprint arXiv:2604.22750, 2026. (Cited on pp. 10, 11, 12)

work page internal anchor Pith review Pith/arXiv arXiv 2026

[9] [9]

Eliciting Latent Predictions from Transformers with the Tuned Lens

N. Belrose, Z. Furman, L. Smith, D. Halawi, I. Ostrovsky, L. McKinney, S. Biderman, and J. Steinhardt. Eliciting latent predictions from transformers with the tuned lens.arXiv preprint arXiv:2303.08112, 2023. (Cited on p. 16) 35

work page internal anchor Pith review Pith/arXiv arXiv 2023

[10] [10]

LLM pricing comparison dashboard.https://benchlm.ai/llm-pricing,

BenchLM. LLM pricing comparison dashboard.https://benchlm.ai/llm-pricing,

[11] [11]

(Cited on p

Accessed: 2026-06-09. (Cited on p. 13)

2026

[12] [12]

D. P . Bertsekas.Network Optimization: Continuous and Discrete Models. Athena Scientific,

[13] [13]

(Cited on pp. 22, 24)

[14] [14]

T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P . Dhariwal, A. Neelakantan, P . Shyam, G. Sastry, A. Askell, et al. Language models are few-shot learners. InAdvances in Neural Information Processing Systems, volume 33, pages 1877–1901, 2020. (Cited on pp. 1, 7, 14)

1901

[15] [15]

Z. S. Chen and Q. Zhu. A theory of multilevel interactive equilibrium in NeuroAI.arXiv preprint arXiv:2605.10505, 2026. (Cited on p. 19)

work page internal anchor Pith review Pith/arXiv arXiv 2026

[16] [16]

LLM API cost comparison.https://costgoat.com/compare/llm-api, 2026

CostGoat. LLM API cost comparison.https://costgoat.com/compare/llm-api, 2026. Accessed: 2026-06-09. (Cited on p. 13)

2026

[17] [17]

T. Dao, D. Y. Fu, S. Ermon, A. Rudra, and C. Ré. Flashattention: Fast and memory-efficient exact attention with io-awareness. InAdvances in Neural Information Processing Systems, volume 35, pages 16344–16359, 2022. (Cited on pp. 8, 16)

2022

[18] [18]

Deepseek API pricing.https://platform.deepseek.com, 2026

DeepSeek AI. Deepseek API pricing.https://platform.deepseek.com, 2026. Accessed: 2026-06-09. (Cited on p. 13)

2026

[19] [19]

Elhage, N

N. Elhage, N. Nanda, C. Olsson, T. Henighan, N. Joseph, B. Mann, A. Askell, Y. Bai, A. Chen, T. Conerly, et al. A mathematical framework for transformer circuits.https: //transformer-circuits.pub/2021/framework/index.html, 2021. Transformer Circuits Thread. (Cited on p. 16)

2021

[20] [20]

Ghodsi, M

A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica. Dominant re- source fairness: Fair allocation of multiple resource types. InProceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation, pages 323–336, 2011. (Cited on pp. 18, 19, 22)

2011

[21] [21]

Gemini enterprise agent platform.https://cloud.google.com/ products/gemini-enterprise-agent-platform, 2026

Google Cloud. Gemini enterprise agent platform.https://cloud.google.com/ products/gemini-enterprise-agent-platform, 2026. Accessed 2026-06-09. (Cited on p. 15)

2026

[22] [22]

Gemini API pricing.https://ai.google.dev/gemini-api/docs/ pricing, 2026

Google DeepMind. Gemini API pricing.https://ai.google.dev/gemini-api/docs/ pricing, 2026. Accessed: 2026-06-09. (Cited on pp. 2, 5, 12, 13, 14) 36

2026

[23] [23]

Hoffmann, S

J. Hoffmann, S. Borgeaud, A. Mensch, E. Buchatskaya, T. Cai, E. Rutherford, D. de Las Casas, L. A. Hendricks, J. Welbl, A. Clark, et al. Training compute-optimal large language models. InAdvances in Neural Information Processing Systems, volume 35, pages 30016–30030, 2022. (Cited on pp. 4, 5, 9, 16, 23)

2022

[24] [24]

Scaling Laws for Neural Language Models

J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Rad- ford, J. Wu, and D. Amodei. Scaling laws for neural language models.arXiv preprint arXiv:2001.08361, 2020. (Cited on pp. 4, 5, 9, 16, 23)

work page internal anchor Pith review Pith/arXiv arXiv 2001

[25] [25]

F. P . Kelly, A. K. Maulloo, and D. K. H. Tan. Rate control for communication networks: Shadow prices, proportional fairness and stability.Journal of the Operational Research Soci- ety, 49(3):237–252, 1998. (Cited on pp. 18, 19, 22, 24)

1998

[26] [26]

Kudo and J

T. Kudo and J. Richardson. Sentencepiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. InProceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 66–71,

2018

[27] [27]

3, 5, 7)

(Cited on pp. 3, 5, 7)

[28] [28]

ChatOpenAI integration.https://docs.langchain.com/oss/python/ integrations/chat/openai, 2026

LangChain. ChatOpenAI integration.https://docs.langchain.com/oss/python/ integrations/chat/openai, 2026. Accessed 2026-06-09. (Cited on pp. 10, 11)

2026

[29] [29]

Lewis, E

P . Lewis, E. Perez, A. Piktus, F. Petroni, V . Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, and S. Riedel. Retrieval-augmented generation for knowledge- intensive NLP tasks. InAdvances in Neural Information Processing Systems, volume 33, pages 9459–9474, 2020. (Cited on pp. 7, 23)

2020

[30] [30]

Merizzi, T

N. Merizzi, T. Smith, D. Kearns-Manolatos, N. Mittal, and G. Churiwala. The pivot to tokenomics: Navigating AI’s new spend dynamics. Industry report, Deloitte, Jan. 2026. Accessed January 2026. (Cited on p. 2)

2026

[31] [31]

Llama models.https://ai.meta.com/llama, 2026

Meta AI. Llama models.https://ai.meta.com/llama, 2026. Accessed: 2026-06-09. (Cited on p. 13)

2026

[32] [32]

Mistral API pricing.https://docs.mistral.ai, 2026

Mistral AI. Mistral API pricing.https://docs.mistral.ai, 2026. Accessed: 2026-06-09. (Cited on p. 13)

2026

[33]

R. B. Myerson. Optimal auction design. Mathematics of Operations Research, 6(1):58–73, 1981. (Cited on pp. 17, 18, 20)

[35]

N. Nisan, T. Roughgarden, E. Tardos, and V. V. Vazirani, editors. Algorithmic Game Theory. Cambridge University Press, 2007. (Cited on pp. 17, 18, 20)

[36]

Nostalgebraist. Interpreting GPT: The logit lens. https://www.alignmentforum.org/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens, 2020. Alignment Forum. (Cited on p. 16)

[37]

OpenAI. API pricing. https://openai.com/api/pricing/, 2026. Accessed: 2026-06-09. (Cited on pp. 2, 5, 12, 13, 14)

[38]

OpenAI. Prompt engineering. https://developers.openai.com/api/docs/guides/prompt-engineering, 2026. Accessed: 2026-06-09. (Cited on p. 10)

[39]

J. S. Park, J. C. O’Brien, C. J. Cai, M. R. Morris, P. Liang, and M. S. Bernstein. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, pages 1–22, 2023. (Cited on p. 19)

[40]

Protocol Labs. Filecoin: A decentralized storage network. https://filecoin.io/filecoin.pdf, 2017. (Cited on p. 19)

[41]

A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever. Language models are unsupervised multitask learners. OpenAI Technical Report, 2019. (Cited on p. 1)

[42]

V. J. Reddi, C. Cheng, D. Kanter, P. Mattson, G. Schmuelling, C.-J. Wu, B. Anderson, M. Breughe, M. Charlebois, W. Chou, et al. MLPerf inference benchmark. In 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture, pages 446–459, 2020.

[43]

J.-C. Rochet and J. Tirole. Platform competition in two-sided markets. Journal of the European Economic Association, 1(4):990–1029, 2003. (Cited on p. 20)

[44]

T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, E. Hambro, L. Zettlemoyer, N. Cancedda, and T. Scialom. Toolformer: Language models can teach themselves to use tools. In Advances in Neural Information Processing Systems, volume 36, pages 68539–68551, 2023. (Cited on pp. 4, 10, 15, 19)

[46]

R. Sennrich, B. Haddow, and A. Birch. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pages 1715–1725, 2016. (Cited on pp. 3, 5, 7)

[47]

Y. Shoham and K. Leyton-Brown. Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press, 2008. (Cited on p. 19)

[48]

E. Strubell, A. Ganesh, and A. McCallum. Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3645–3650, 2019. (Cited on p. 17)

[49]

H. Topcuoglu, S. Hariri, and M.-Y. Wu. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems, 13(3):260–274, 2002. (Cited on p. 22)

[50]

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, volume 30, 2017. (Cited on pp. 1, 8, 14, 16)

[51]

X. Wang, J. Wei, D. Schuurmans, Q. Le, E. Chi, S. Narang, A. Chowdhery, and D. Zhou. Self-consistency improves chain of thought reasoning in language models. In International Conference on Learning Representations, 2023. (Cited on pp. 8, 10, 16)

[52]

J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. H. Chi, Q. V. Le, and D. Zhou. Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems, volume 35, pages 24824–24837, 2022. (Cited on pp. 8, 10, 16)

[53]

M. Wooldridge. An Introduction to MultiAgent Systems. Wiley, 2nd edition, 2009. (Cited on p. 19)

[54]

C.-J. Wu, R. Raghavendra, U. Gupta, B. Acun, N. Ardalani, K. Maeng, G. Chang, F. A. Behram, J. Huang, C. Bai, et al. Sustainable AI: Environmental implications, challenges and opportunities. In Proceedings of Machine Learning and Systems, volume 4, pages 795–813, 2022. (Cited on p. 17)

[55]

xAI. Grok API pricing. https://docs.x.ai, 2026. Accessed: 2026-06-09. (Cited on pp. 2, 5, 12, 13)

[56]

Y.-T. Yang and Q. Zhu. PACT: A contract-theoretic framework for pricing agentic AI services powered by large language models. In Proceedings of the 2025 IEEE Global Communications Conference (GLOBECOM), Taipei, Taiwan, 2025. IEEE. (Cited on pp. 15, 34)

[57]

Y.-T. Yang and Q. Zhu. Internet of agentic AI: Incentive-compatible distributed teaming and workflow. arXiv preprint arXiv:2602.03145, 2026. (Cited on pp. 19, 22)

[58]

S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y. Cao. ReAct: Synergizing reasoning and acting in language models. In International Conference on Learning Representations, 2023. (Cited on pp. 4, 10, 15, 19)

[59]

Zhipu AI. GLM API pricing. https://open.bigmodel.cn, 2026. Accessed: 2026-06-09. (Cited on p. 13)

[60]

Q. Zhu. Foundations of cyber resilience: The confluence of game, control, and learning theories. arXiv preprint arXiv:2404.01205, 2024. (Cited on pp. 19, 20)

[61]

Q. Zhu. Game theory meets LLM and agentic AI: Reimagining cybersecurity for the age of intelligent threats. arXiv preprint arXiv:2507.10621, 2025. (Cited on pp. 20, 21)

[62]

Q. Zhu. Insurance of agentic AI. arXiv preprint arXiv:2606.05449, 2026. (Cited on p. 25)

[63]

Q. Zhu and M. A. Al Bari. Agentic AI for cyber deception: A gestalt game-theoretic approach to defending against botnet DDoS attacks. In Proceedings of the 59th Hawaii International Conference on System Sciences, 2026. (Cited on pp. 20, 21, 22)

[64]

Q. Zhu and T. Başar. Revisiting game-theoretic control in socio-technical networks: Emerging design frameworks and contemporary applications. arXiv preprint arXiv:2411.01794, 2024. (Cited on pp. 19, 20, 22)

[66]

Q. Zhu and Z. Han. Learning, misspecification, and cognitive arbitrage in linear-quadratic network games. arXiv preprint arXiv:2603.17157, 2026. (Cited on pp. 19, 22)