pith. machine review for the scientific record.

arxiv: 2604.05943 · v1 · submitted 2026-04-07 · 💻 cs.AI

Recognition: no theorem link

MARL-GPT: Foundation Model for Multi-Agent Reinforcement Learning

Aleksandr Panov, Alexey Kovalev, Alexey Skrynnik, Anton Andreychuk, Egor Cherepanov, Konstantin Yakovlev, Maria Nesterova, Mikhail Kolosov, Oleg Bulichev

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 19:41 UTC · model grok-4.3

classification 💻 cs.AI
keywords multi-agent reinforcement learning · foundation models · transformers · offline reinforcement learning · StarCraft · Google Research Football · POGEMA

The pith

A single transformer model trained offline on expert trajectories reaches competitive performance across three unrelated multi-agent environments without any task-specific tuning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that one GPT-based model can handle StarCraft Multi-Agent Challenge, Google Research Football, and POGEMA by training on hundreds of millions to a billion expert trajectories per environment and using a shared observation encoder. This setup avoids the usual practice of building a separate model for each new multi-agent problem. If the approach holds, it supports the idea of a general-purpose foundation model for multi-agent reinforcement learning that works across varied observation and action spaces. The results come from direct comparisons to specialized baselines in each environment.

Core claim

MARL-GPT uses offline reinforcement learning on large expert datasets together with a single transformer observation encoder to achieve performance comparable to environment-specific agents in SMACv2, GRF, and POGEMA.

What carries the argument

The single shared transformer-based observation encoder that processes inputs from environments with different observation and action spaces without task-specific adjustments.
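
One plausible shape for such an encoder is sketched below. This is a reading aid, not the paper's published architecture: the class name, token size, padding scheme, and layer counts are all assumptions. What it illustrates is the property the argument rests on: any observation, whether SMACv2 unit-feature vectors, a GRF continuous state, or a POGEMA egocentric grid, is chopped into fixed-size tokens, padded to a common length, and passed through one transformer whose weights never branch by environment.

```python
import torch
import torch.nn as nn

# Hypothetical sketch; all names and dimensions are assumptions, not the paper's.
D_MODEL, TOKEN_DIM, MAX_TOKENS = 256, 32, 64

class SharedObservationEncoder(nn.Module):
    """One set of weights for every environment: observations arrive as
    padded sequences of fixed-size tokens, and nothing downstream depends
    on which environment produced them."""

    def __init__(self):
        super().__init__()
        self.project = nn.Linear(TOKEN_DIM, D_MODEL)  # token -> model width
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, tokens, pad_mask):
        # tokens: (batch, MAX_TOKENS, TOKEN_DIM); pad_mask: True at padding
        h = self.encoder(self.project(tokens), src_key_padding_mask=pad_mask)
        return h[:, 0]  # first (always real) token summarizes the observation

def tokenize(obs: torch.Tensor):
    """Flatten any observation into TOKEN_DIM-sized chunks, then pad."""
    flat = obs.reshape(-1).float()
    n_tok = -(-flat.numel() // TOKEN_DIM)     # ceiling division
    padded = torch.zeros(MAX_TOKENS * TOKEN_DIM)
    padded[: flat.numel()] = flat
    mask = torch.arange(MAX_TOKENS) >= n_tok  # True marks padding tokens
    return padded.reshape(MAX_TOKENS, TOKEN_DIM), mask
```

Under a sketch like this, a flat feature vector and a 2-D grid take the same code path; only the number of non-padding tokens differs, which is precisely the property the referee report below asks the authors to document.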

If this is right

  • Large-scale offline training on expert data can substitute for custom model design per task.
  • A shared encoder captures transferable multi-agent coordination patterns across domains.
  • Scaling the approach to additional environments could reduce the need for repeated architecture search in MARL.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The model might serve as a starting point for quick adaptation to new multi-agent problems with limited extra data.
  • Combining the offline pre-training with limited online interaction could close remaining gaps to optimal performance.

Load-bearing premise

A single encoder without per-environment changes can still extract useful features from the very different observation formats in these three tasks.

What would settle it

Training the same architecture on a fourth multi-agent environment with substantially different observation structure and measuring whether it still matches specialized baselines.
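
"Matches specialized baselines" also needs a statistical reading. One hedged way to score that test, with synthetic per-episode outcomes standing in for real rollouts (the paper prescribes no such protocol), is a paired bootstrap on the success-rate gap:

```python
import numpy as np

# Sketch only: the outcome arrays below are synthetic stand-ins for
# per-episode success flags from the shared model and a specialized
# baseline evaluated on the same held-out episodes.
rng = np.random.default_rng(0)
shared = rng.random(500) < 0.78    # hypothetical shared-model outcomes
baseline = rng.random(500) < 0.80  # hypothetical specialist outcomes

def bootstrap_gap_ci(a, b, n_boot=10_000, alpha=0.05):
    """CI on mean(b) - mean(a), resampling paired episodes with replacement."""
    idx = rng.integers(0, len(a), size=(n_boot, len(a)))
    gaps = b[idx].mean(axis=1) - a[idx].mean(axis=1)
    return np.quantile(gaps, [alpha / 2, 1 - alpha / 2])

lo, hi = bootstrap_gap_ci(shared, baseline)
print(f"specialist minus shared, 95% CI: [{lo:+.3f}, {hi:+.3f}]")
# An interval tight around zero supports "competitive"; a clearly
# positive interval means the specialist still wins.
```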

Figures

Figures reproduced from arXiv: 2604.05943 by Aleksandr Panov, Alexey Kovalev, Alexey Skrynnik, Anton Andreychuk, Egor Cherepanov, Konstantin Yakovlev, Maria Nesterova, Mikhail Kolosov, Oleg Bulichev.

Figure 1: Spider plot demonstrating the relative performance …
Figure 2: The general pipeline begins with training expert policies across diverse MARL environments using domain-appropriate …
Figure 3: Illustration of the proposed encoding scheme for …
Figure 4: Online fine-tuning results: (a) Battle won on the …
Figure 5: Modular and reconfigurable maze environment
Figure 6: Waveshare JetBot robotic agent used for maze navigation
Figure 7: Real-world execution of a scenario based on maze …
Original abstract

Recent advances in multi-agent reinforcement learning (MARL) have demonstrated success in numerous challenging domains and environments, but typically require specialized models for each task. In this work, we propose a coherent methodology that makes it possible for a single GPT-based model to learn and perform well across diverse MARL environments and tasks, including StarCraft Multi-Agent Challenge, Google Research Football and POGEMA. Our method, MARL-GPT, applies offline reinforcement learning to train at scale on the expert trajectories (400M for SMACv2, 100M for GRF, and 1B for POGEMA) combined with a single transformer-based observation encoder that requires no task-specific tuning. Experiments show that MARL-GPT achieves competitive performance compared to specialized baselines in all tested environments. Thus, our findings suggest that it is, indeed, possible to build a multi-task transformer-based model for a wide variety of (significantly different) multi-agent problems paving the way to the fundamental MARL model (akin to ChatGPT, Llama, Mistral etc. in natural language modeling).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes MARL-GPT, a single GPT-based model for multi-agent reinforcement learning trained via offline RL on large expert trajectory datasets (400M for SMACv2, 100M for GRF, 1B for POGEMA). It employs a shared transformer observation encoder requiring no task-specific tuning and claims competitive performance against specialized baselines across these heterogeneous environments, arguing this demonstrates the feasibility of a foundational multi-task MARL model analogous to NLP foundation models.

Significance. If the empirical results and architectural unification hold under scrutiny, the work would be significant for advancing generalist approaches in MARL. The scale of multi-environment offline training on diverse tasks (unit-based, continuous, and grid observations) represents a concrete step toward unified models, with potential to reduce the need for per-task specialization if the shared encoder truly operates without tuning.

major comments (2)
  1. [Abstract] The assertion that MARL-GPT 'achieves competitive performance compared to specialized baselines in all tested environments' provides no quantitative metrics, baseline names, win rates, or statistical tests. This absence makes the central empirical claim impossible to evaluate and is load-bearing for the paper's contribution.
  2. [Methods] Observation encoder: The claim of a 'single transformer-based observation encoder that requires no task-specific tuning' is central to the foundation-model analogy, yet the manuscript supplies no description of the tokenization, embedding, padding, or projection steps that map heterogeneous inputs (SMACv2 unit vectors, GRF continuous states, POGEMA grids) into a common representation. Without this, it is unclear whether the encoder is truly shared and untuned or relies on implicit per-environment components.
minor comments (1)
  1. [Abstract] The parenthetical analogy to 'ChatGPT, Llama, Mistral etc.' in the abstract would benefit from a brief citation or clarification to avoid informal tone.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback, which has helped us identify areas where the manuscript can be strengthened. We address each major comment point by point below and have revised the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract] The assertion that MARL-GPT 'achieves competitive performance compared to specialized baselines in all tested environments' provides no quantitative metrics, baseline names, win rates, or statistical tests. This absence makes the central empirical claim impossible to evaluate and is load-bearing for the paper's contribution.

    Authors: We agree that the abstract would be improved by including concrete quantitative details to support the central claim. In the revised manuscript, we have updated the abstract to reference specific performance metrics (win rates and success rates), the names of the specialized baselines, and pointers to the full results with statistical tests in the Experiments section and tables. revision: yes

  2. Referee: [Methods] Observation encoder: The claim of a 'single transformer-based observation encoder that requires no task-specific tuning' is central to the foundation-model analogy, yet the manuscript supplies no description of the tokenization, embedding, padding, or projection steps that map heterogeneous inputs (SMACv2 unit vectors, GRF continuous states, POGEMA grids) into a common representation. Without this, it is unclear whether the encoder is truly shared and untuned or relies on implicit per-environment components.

    Authors: We acknowledge that the original manuscript lacked sufficient detail on the observation encoder. We have revised the Methods section to include a complete description of the tokenization, embedding, padding, and projection steps that map the heterogeneous observations into a shared representation space. This addition makes explicit that the encoder is a single shared module with no task-specific tuning or per-environment parameters. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical multi-task training on heterogeneous trajectories is self-contained

Full rationale

The paper's chain is: collect large expert datasets (400M SMACv2, 100M GRF, 1B POGEMA), train a single transformer observation encoder plus GPT-style policy head via offline RL, then report competitive returns versus specialized baselines. No equation defines a quantity in terms of itself, no fitted hyperparameter is relabeled as an out-of-sample prediction, and no uniqueness theorem or ansatz is imported from the authors' prior work to force the architecture. The shared-encoder claim is supported by the training procedure and test results rather than by construction or by load-bearing self-citation.
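
For concreteness, the simplest non-circular version of that chain is plain behavior cloning over pooled trajectories. The sketch below is illustrative only (the paper's actual loss, optimizer, and batching are not reproduced here): `encoder` would be a shared module like the one sketched earlier, `policy_head` a GPT-style action predictor, and the point is that the supervision target is external expert data, never a quantity the model defines for itself.

```python
import torch.nn.functional as F

def offline_update(encoder, policy_head, batch, optimizer):
    """One behavior-cloning step on pooled expert data (illustrative only).

    batch: dict with
      tokens   (B, T, TOKEN_DIM)  tokenized observations, any environment
      pad_mask (B, T)             True where a token is padding
      actions  (B,)               expert action indices
    """
    optimizer.zero_grad()
    z = encoder(batch["tokens"], batch["pad_mask"])    # shared representation
    logits = policy_head(z)                            # action logits
    loss = F.cross_entropy(logits, batch["actions"])   # imitate the expert
    loss.backward()
    optimizer.step()
    return loss.item()

# Mixing SMACv2, GRF, and POGEMA batches through this one function, with no
# per-environment branch, is what the circularity check above verifies.
```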

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the transformer architecture itself contains many standard hyperparameters whose values are not reported.

pith-pipeline@v0.9.0 · 5524 in / 1074 out tokens · 39262 ms · 2026-05-10T19:41:48.048350+00:00 · methodology

