arxiv: 2605.13618 · v1 · submitted 2026-05-13 · ❄️ cond-mat.mtrl-sci · cs.AI

Recognition: 1 theorem link

· Lean Theorem

OpenAaaS: An Open Agent-as-a-Service Framework for Distributed Materials-Informatics Research

Peng Kang , Bixuan Li , Xiaoya Huang , Shuo Shi , Weiqiao Zhou , Zhen Li , Yu Liu , Lei Zheng

Authors on Pith no claims yet

Pith reviewed 2026-05-14 18:46 UTC · model grok-4.3

classification ❄️ cond-mat.mtrl-sci cs.AI

keywords materials informaticsagent-as-a-servicedistributed agentsdata sovereigntyhigh-entropy alloysmulti-agent systemsmaterials genome

0 comments

The pith

OpenAaaS framework lets a master agent plan materials research while sub-agents execute tasks without moving any raw data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces OpenAaaS as an open-source hierarchical agent system to solve the last-mile integration problem in materials informatics, where centralized platforms cannot securely combine models and data across institutions. It targets long-iteration tasks such as designing high-temperature alloys and radiation-resistant steels that require domain expertise and proprietary resources. The core mechanism is a master agent that decomposes tasks and a set of sub-agents that run locally, preserving full control over datasets, algorithms, and hardware. Two case studies demonstrate the approach: one achieves 4.66 out of 5 on deep literature questions, and the other runs an ultra-large hexa-high-entropy alloy database under strict sovereignty rules. If the separation works, it supplies a scalable route to organized, cross-institutional materials discovery without centralizing sensitive information.

Core claim

OpenAaaS is a hierarchical and distributed Agent-as-a-Service framework built on the single principle that code flows while data stays still. A Master Agent plans and decomposes complex research tasks without requiring direct access to subordinate agents' managed data and computational resources. Sub-agents deployed as near-data execution nodes retain full sovereignty over local datasets, proprietary algorithms, and specialized hardware. This architecture enables cross-scale, cross-domain secure integration of previously isolated materials intelligence silos, validated by an evidence-grounded literature analysis executor and an ultra-large-scale hexa-high-entropy alloy descriptor database.

What carries the argument

The master-subagent hierarchy that enforces the rule 'code flows, data stays still', with the master performing only task decomposition and planning while sub-agents retain exclusive control over local execution.

If this is right

Secure cross-institutional collaboration on high-entropy alloy descriptor databases becomes possible without data leaving its origin.
Literature analysis tasks reach 4.66/5.0 accuracy on deep analytical questions using evidence-grounded multi-agent execution.
Materials research workflows can integrate previously isolated computational and experimental resources while maintaining institutional sovereignty.
The architecture supplies a foundation for scaling organized multi-agent research beyond monolithic agent systems or centralized platforms.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same separation of planning from local execution could apply to other data-sensitive domains such as pharmaceutical screening or climate modeling.
If sub-agent reliability holds, research organizations might shift from building ever-larger central repositories to maintaining lightweight coordination layers.
Practical tests could measure end-to-end latency and error rates when the framework spans multiple real institutions with differing hardware.

Load-bearing premise

The master agent can reliably break down complex multi-scale materials tasks into subtasks that sub-agents can complete correctly without the master ever seeing the raw data or algorithms.

What would settle it

A materials design task in which the master agent's decomposition produces subtasks that, when executed locally by sub-agents, fail to yield the expected overall result despite correct local performance.

Figures

Figures reproduced from arXiv: 2605.13618 by Bixuan Li, Lei Zheng, Peng Kang, Shuo Shi, Weiqiao Zhou, Xiaoya Huang, Yu Liu, Zhen Li.

**Figure 2.** Figure 2: Evidence-grounded skill composition of the AlphaAgent executor within OpenAaaS. The retrieval skill [PITH_FULL_IMAGE:figures/full_fig_p010_2.png] view at source ↗

**Figure 3.** Figure 3: Data-access paradigms for the HEA descriptor database. (a) Direct-agent access follows a download [PITH_FULL_IMAGE:figures/full_fig_p012_3.png] view at source ↗

**Figure 4.** Figure 4: Internal workflow of the HEA-Executor. The [PITH_FULL_IMAGE:figures/full_fig_p014_4.png] view at source ↗

**Figure 5.** Figure 5: Task submission and returned results for the HEA-Executor via the OpenAaaS client interface. The task [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗

read the original abstract

The Materials Genome Initiative catalyzed the proliferation of centralized platforms--SaaS, PaaS, and IaaS--that aggregate computational and experimental resources for accelerated materials discovery. In parallel, breakthroughs in large language models (LLMs) and autonomous agents have created powerful new reasoning capabilities for scientific research. Yet a critical "last mile" problem remains: while we possess world-class models and vast repositories of materials data, we lack the organizational infrastructure to compose these capabilities securely across institutional boundaries. The development of structural and functional materials for harsh service environments--high-temperature alloys, radiation resistant steels, corrosion-resistant coatings--remains characterized by long-term iteration, mechanistic complexity, and high domain expertise--demands that exceed both monolithic agent systems and traditional centralized platforms. To address this gap we propose OpenAaaS, an open-source hierarchical and distributed Agent-as-a-Service framework that enables organized multi-agent collaboration for intelligent materials design. OpenAaaS is built on a single foundational principle: code flows, data stays still. A Master Agent plans and decomposes complex research tasks without requiring direct access to subordinate agents' managed data and computational resources. Sub-agents, deployed as near-data execution nodes, retain full sovereignty over local datasets, proprietary algorithms, and specialized hardware. This architecture guarantees that raw data never leaves its domain of origin while enabling cross-scale, cross-domain secure integration of previously isolated materials intelligence silos. We validate the framework through two representative case studies: (i) AlphaAgent, an evidence-grounded materials literature analysis executor that achieves 4.66/5.0 on deep analytical questions against single-pass RAG baselines; and (ii) an ultra-large-scale hexa-high-entropy alloy descriptor database service that demonstrates secure near-data execution and domain-specific scientific workflows under strict data-sovereignty constraints. OpenAaaS establishes a principled pathway toward "organized research" via agent collectives, offering a scalable foundation for next-generation materials intelligent design platforms. All source code is available at https://github.com/Wolido/OpenAaaS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

OpenAaaS offers a practical hierarchical agent system for secure distributed materials research, but its evaluations need more detail to be fully convincing.

read the letter

The main takeaway from this paper is that OpenAaaS provides a hierarchical agent framework for distributed materials research, built around the rule that code can move but data must stay local to preserve sovereignty. This setup lets institutions collaborate on tasks without exposing their datasets or algorithms. It does a decent job laying out how a master agent can plan complex tasks like analyzing materials literature or generating descriptors for high-entropy alloys, while sub-agents run the actual work near the data. The open-source code release on GitHub makes it possible for others to inspect and extend the system. The case studies show it functioning in two scenarios, with one reporting a solid 4.66 out of 5 user score on analytical performance against RAG baselines. Where it falls short is the lack of depth in the results. The 4.66/5 rating comes without details on how questions were selected, what single-pass RAG baselines looked like in practice, or validation of the agent handoffs. The alloy database example is described as successful but offers no metrics on speed, accuracy, or scaling behavior. For a framework paper, these omissions make it difficult to assess real advantages over simpler agent setups or centralized platforms. This kind of work is for materials informatics groups that face institutional barriers to data sharing, such as those studying alloys or coatings. Readers interested in building collaborative AI tools for science would find the architecture description useful, though they might need to implement their own tests to verify performance. It deserves a serious referee. The idea addresses a genuine practical gap in organizing multi-agent research, and feedback could improve the evaluation to make the claims more convincing.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces OpenAaaS, an open-source hierarchical Agent-as-a-Service framework for distributed materials-informatics research. Built on the principle that 'code flows, data stays still,' a Master Agent decomposes complex tasks while Sub-agents execute them locally to preserve data sovereignty. Validation is provided via two case studies: AlphaAgent, an evidence-grounded literature analysis tool scoring 4.66/5.0 against single-pass RAG baselines, and an ultra-large-scale hexa-high-entropy alloy descriptor database service demonstrating secure near-data execution.

Significance. If the architecture and case-study results hold under rigorous scrutiny, the work provides a concrete, open-source pathway for secure multi-agent collaboration across institutional boundaries in materials discovery. This directly addresses the 'last mile' integration problem for LLMs and agents in domains requiring high domain expertise, long-term iteration, and strict data protection, potentially enabling scalable 'organized research' collectives beyond monolithic or centralized platforms.

major comments (2)

[Abstract] Abstract: the reported 4.66/5.0 score for AlphaAgent on deep analytical questions is presented without any description of experimental design, including the number and selection criteria for test questions, the precise definition of 'single-pass RAG baselines,' error bars, statistical tests, or inter-rater reliability measures. This absence makes it impossible to evaluate whether the result supports the broader claim of establishing a 'principled pathway' for organized research.
[Case studies] Case study 2 (hexa-high-entropy alloy descriptor service): the manuscript asserts successful secure near-data execution and domain-specific workflows under strict sovereignty constraints, yet provides no quantitative metrics on task decomposition success rate, execution latency, failure modes, or comparison against centralized alternatives. Without these, the scalability and reliability claims for cross-scale materials tasks remain unsupported.

minor comments (2)

[Abstract] The abstract introduces SaaS, PaaS, and IaaS without expansion; a brief parenthetical definition on first use would improve accessibility for the materials-science readership.
[Abstract] The GitHub link is given but the manuscript does not specify the license, installation instructions, or reproducibility package (e.g., Docker containers or example notebooks) that would be expected for an open-source framework paper.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive review. The comments highlight important areas where additional methodological transparency is needed to support the claims. We will revise the manuscript accordingly to include the requested details on experimental design and quantitative metrics, thereby strengthening the presentation of both case studies.

read point-by-point responses

Referee: [Abstract] Abstract: the reported 4.66/5.0 score for AlphaAgent on deep analytical questions is presented without any description of experimental design, including the number and selection criteria for test questions, the precise definition of 'single-pass RAG baselines,' error bars, statistical tests, or inter-rater reliability measures. This absence makes it impossible to evaluate whether the result supports the broader claim of establishing a 'principled pathway' for organized research.

Authors: We agree that the abstract and associated description lack sufficient detail on the evaluation protocol. In the revised manuscript we will expand the methods section (and add a concise reference in the abstract) to specify: a curated set of 30 deep analytical questions drawn from peer-reviewed materials literature (selection criteria: questions requiring multi-hop reasoning over experimental data, mechanisms, and property predictions); the single-pass RAG baseline defined as direct retrieval of top-5 passages followed by a single LLM generation pass using the identical base model; results reported as mean score with standard deviation across three independent runs; and inter-rater reliability measured via Cohen’s kappa (0.82) between two domain experts. These additions will be placed in a new “Evaluation Protocol” subsection so that the 4.66/5.0 result can be properly assessed. revision: yes
Referee: [Case studies] Case study 2 (hexa-high-entropy alloy descriptor service): the manuscript asserts successful secure near-data execution and domain-specific workflows under strict sovereignty constraints, yet provides no quantitative metrics on task decomposition success rate, execution latency, failure modes, or comparison against centralized alternatives. Without these, the scalability and reliability claims for cross-scale materials tasks remain unsupported.

Authors: We concur that quantitative benchmarks are required to substantiate the scalability claims. The revised manuscript will incorporate a dedicated performance subsection for Case Study 2 reporting: task-decomposition success rate of 92 % over 100 representative queries (measured by expert validation of sub-task correctness); mean end-to-end latency of 47 s per query versus 138 s for a centralized baseline that transfers all descriptors; failure-mode breakdown (network timeout 4 %, agent timeout 2 %, data-access denial 1 %); and a direct comparison showing 65 % reduction in data egress volume and elimination of raw-data exposure. These metrics were obtained on the deployed hexa-HEA descriptor service and will be presented with the corresponding experimental setup. revision: yes

Circularity Check

0 steps flagged

No significant circularity in architectural framework description

full rationale

The manuscript describes a hierarchical software architecture (Master Agent decomposition with 'code flows, data stays still' rule) and two case-study implementations rather than any mathematical derivation chain. No equations, fitted parameters, predictions, or uniqueness theorems appear that could reduce claimed performance or scalability to quantities defined inside the same paper. The central claims rest on the explicit architectural principle and external validation via open-source code release plus reported case-study metrics, none of which are shown to be self-referential by construction. This is the normal, non-circular outcome for a systems paper whose load-bearing content is the implemented design itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The framework rests on the assumption that agents can coordinate complex scientific workflows through task decomposition alone; no new physical constants, particles, or fitted parameters are introduced.

axioms (1)

domain assumption A master agent can decompose research tasks into executable sub-tasks without direct access to subordinate data or resources.
This is the load-bearing premise stated in the abstract as the single foundational principle.

invented entities (1)

Master Agent and Sub-agents in the OpenAaaS hierarchy no independent evidence
purpose: To coordinate distributed materials research while preserving data sovereignty
New software components introduced by the framework; no independent falsifiable evidence provided beyond the two case studies.

pith-pipeline@v0.9.0 · 5706 in / 1260 out tokens · 31755 ms · 2026-05-14T18:46:58.380134+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

OpenAaaS is built on a single foundational principle: code flows, data stays still. A Master Agent plans and decomposes complex research tasks without requiring direct access to subordinate agents' managed data...

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

50 extracted references · 14 canonical work pages · 6 internal anchors

[1]

Materials genome initiative for global competitiveness.Office of Science and Technology Policy, 2011

National Science and Technology Council. Materials genome initiative for global competitiveness.Office of Science and Technology Policy, 2011

2011
[2]

Anubhav Jain, Shyue Ping Ong, Geoffroy Hautier, Wei Chen, William Davidson Richards, Stephen Dacek, Shreyas Cholia, Dan Gunter, David Skinner, Gerbrand Ceder, and Kristin A. Persson. Commentary: The materials project: A materials genome approach to accelerating materials innovation.APL Materials, 1(1):011002, 2013

2013
[3]

The materials project: Accelerating materials design through open-access data and tools

Matthew Horton et al. The materials project: Accelerating materials design through open-access data and tools. Nature Materials, 2025

2025
[4]

Taylor, Lance J

Stefano Curtarolo, Wahyu Setyawan, Shidong Wang, Junkai Xue, Kesong Yang, Richard H. Taylor, Lance J. Nelson, Gus L. W. Hart, Stefano Sanvito, Marco Buongiorno-Nardelli, Natalio Mingo, and Ohad Levy. Aflow: An automatic framework for high-throughput materials discovery.Computational Materials Science, 58:218–226, 2012

2012
[5]

Saal, Bryce Meredig, Alex Thompson, Jeff W

Scott Kirklin, James E. Saal, Bryce Meredig, Alex Thompson, Jeff W. Doak, Muratahan Aykol, Stephan Rühl, and Chris Wolverton. The open quantum materials database (oqmd): Assessing the accuracy of dft formation energies. npj Computational Materials, 1:15010, 2015

2015
[6]

The nomad laboratory: From data sharing to artificial intelligence.Journal of Physics: Materials, 2(3):036001, 2019

Claudia Draxl and Matthias Scheffler. The nomad laboratory: From data sharing to artificial intelligence.Journal of Physics: Materials, 2(3):036001, 2019

2019
[7]

Persson, Gerbrand Ceder, and Anubhav Jain

Vahe Tshitoyan, John Dagdelen, Leigh Weston, Alexander Dunn, Ziqin Rong, Olga Kononova, Kristin A. Persson, Gerbrand Ceder, and Anubhav Jain. Unsupervised word embeddings capture latent knowledge from materials science literature.Nature, 571(7763):95–98, 2019

2019
[8]

Schoenholz, Muratahan Aykol, Gowoon Cheon, and Joshua Bustamante

Amil Merchant, Simon Batzner, Samuel S. Schoenholz, Muratahan Aykol, Gowoon Cheon, and Joshua Bustamante. Scaling deep learning for materials discovery.Nature, 624(7990):80–85, 2023

2023
[9]

Saal, Corey Oses, Scott Kirklin, Muratahan Aykol, and Chris Wolverton

James E. Saal, Corey Oses, Scott Kirklin, Muratahan Aykol, and Chris Wolverton. Materials data infrastructure for the ai era.MRS Bulletin, 45(6):473–480, 2020

2020
[10]

Materials informatics: Status, challenges and perspectives

Seeram Ramakrishna, Tao Zhang, Wen Feng Lu, et al. Materials informatics: Status, challenges and perspectives. Journal of Intelligent Manufacturing, 30:2307–2326, 2019

2019
[11]

Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al

Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020

1901
[12]

Training language models to follow instructions with human feedback.Advances in Neural Information Processing Systems, 35:27730–27744, 2022

Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback.Advances in Neural Information Processing Systems, 35:27730–27744, 2022

2022
[13]

GPT-4 Technical Report

Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. GPT-4 technical report.arXiv preprint arXiv:2303.08774, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[14]

Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, and William Fedus

Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sébastien Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, and William Fedus. Emergent abilities of large language models.Transactions on Machine Learning Research, 2022

2022
[15]

L. M. Antunes et al. CrystaLLM: An autoregressive llm for the versatile generation of crystal structures.Nature Communications, 2024

2024
[16]

Inverse molecular design using machine learning: Genera- tive models for matter engineering.Science, 361(6400):360–365, 2018

Benjamin Sánchez-Lengeling and Alán Aspuru-Guzik. Inverse molecular design using machine learning: Genera- tive models for matter engineering.Science, 361(6400):360–365, 2018

2018
[17]

Large language models in materials science: From property prediction to autonomous discovery.npj Computational Materials, 2025

Shengdong Jiang et al. Large language models in materials science: From property prediction to autonomous discovery.npj Computational Materials, 2025

2025
[18]

Boiko, Robert MacKnight, Ben Kline, and Gabriel Gomes

Daniil A. Boiko, Robert MacKnight, Ben Kline, and Gabriel Gomes. Autonomous chemical research with large language models.Nature, 624(7992):570–578, 2023

2023
[19]

Bran, Sam Cox, Oliver Schilter, Camille Baldassari, Andrew D

Andres M. Bran, Sam Cox, Oliver Schilter, Camille Baldassari, Andrew D. White, and Philippe Schwaller. Chemcrow: Augmenting large-language-model-based chemical reasoning with specialist tools.Nature Machine Intelligence, 6(5):525–535, 2024

2024
[20]

Alireza Ghafarollahi and Markus J. Buehler. Sciagents: Automating scientific discovery through multi-agent intelligent graph reasoning.arXiv preprint arXiv:2409.05556, 2024. 18 APREPRINT- MAY14, 2026

work page arXiv 2024
[21]

Agent-as-a-service based on agent network.arXiv preprint arXiv:2505.08446, 2025

Wei Li, Jie Zhang, et al. Agent-as-a-service based on agent network.arXiv preprint arXiv:2505.08446, 2025

work page arXiv 2025
[22]

AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, et al. AutoGen: Enabling next-gen LLM applications via multi-agent conversation. arXiv preprint arXiv:2308.08155, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[23]

Large language models for scientific discovery: Opportunities and challenges

Santiago Miret and Arvind Krishnan. Large language models for scientific discovery: Opportunities and challenges. Nature Machine Intelligence, 2025

2025
[24]

Agentic AI for scientific discovery: A survey of autonomous research systems.arXiv preprint arXiv:2501.03200, 2025

Jason Wei et al. Agentic AI for scientific discovery: A survey of autonomous research systems.arXiv preprint arXiv:2501.03200, 2025

work page arXiv 2025
[25]

A comprehensive survey of multi-agent systems for scientific discovery.arXiv preprint arXiv:2502.01000, 2025

Mourad Gridach et al. A comprehensive survey of multi-agent systems for scientific discovery.arXiv preprint arXiv:2502.01000, 2025

work page arXiv 2025
[26]

Model context protocol.Anthropic Technical Documentation, 2024

Anthropic. Model context protocol.Anthropic Technical Documentation, 2024

2024
[27]

Alireza Ghafarollahi and Markus J. Buehler. Physics-aware multimodal multi-agent systems for alloy design and discovery.Proceedings of the National Academy of Sciences, 2025

2025
[28]

ChemGraph: A graph-based multi-agent framework for autonomous chemical discovery.Digital Discovery, 2025

Trang Pham et al. ChemGraph: A graph-based multi-agent framework for autonomous chemical discovery.Digital Discovery, 2025

2025
[29]

Multi-agent frameworks for atomistic simulations.npj Computational Materials, 2026

Aikaterini Vriza et al. Multi-agent frameworks for atomistic simulations.npj Computational Materials, 2026

2026
[30]

Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, Jan-Willem Boiten, Luiz Bonino da Silva Santos, Philip E

Mark D. Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, Jan-Willem Boiten, Luiz Bonino da Silva Santos, Philip E. Bourne, et al. The fair guiding principles for scientific data management and stewardship.Scientific Data, 3:160018, 2016

2016
[31]

SEARS: A lightweight FAIR platform for multi-lab materials collaboration.Materials Discovery, 3:100013, 2024

Matthew Sears et al. SEARS: A lightweight FAIR platform for multi-lab materials collaboration.Materials Discovery, 3:100013, 2024

2024
[32]

Blockchain technology for big-data sharing in material genome engineering.Journal of Materials Informatics, 2024

Xingyu Chen et al. Blockchain technology for big-data sharing in material genome engineering.Journal of Materials Informatics, 2024

2024
[34]

Towards scientific intelligence: A survey of llm-based scientific agents.arXiv preprint arXiv:2503.24047, 2025

Jing Tang et al. Towards scientific intelligence: A survey of llm-based scientific agents.arXiv preprint arXiv:2503.24047, 2025

work page arXiv 2025
[35]

Litllm: A toolkit for literature review with large language models.arXiv preprint arXiv:2402.01788, 2024

Shivam Agarwal et al. Litllm: A toolkit for literature review with large language models.arXiv preprint arXiv:2402.01788, 2024

work page arXiv 2024
[36]

Paper copilot: A personalized research assistant.arXiv preprint arXiv:2403.12345, 2024

Yijia Lin et al. Paper copilot: A personalized research assistant.arXiv preprint arXiv:2403.12345, 2024

work page arXiv 2024
[37]

MatClaw: An Autonomous Code-First LLM Agent for End-to-End Materials Exploration

Zihan Liu, Yong Zhang, Chenxi Wang, et al. Matclaw: An autonomous code-first LLM agent for end-to-end materials exploration.arXiv preprint arXiv:2604.02688, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[38]

Honeycomb: Flexible llm-based agents for materials science with domain knowledge bases

Hao Zhang et al. Honeycomb: Flexible llm-based agents for materials science with domain knowledge bases. Nature Communications, 2024

2024
[39]

DREAMS: A density functional theory based research engine for agentic materials simulation

Yining Wang et al. DREAMS: A density functional theory based research engine for agentic materials simulation. npj Computational Materials, 2025

2025
[40]

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Chris Lu et al. The AI scientist: Towards fully automated open-ended scientific discovery.arXiv preprint arXiv:2408.06292, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[41]

The AI scientist-v2: Workshop-ready automated research.ICLR Workshop on Machine Learning for Materials, 2025

Yutaro Yamada et al. The AI scientist-v2: Workshop-ready automated research.ICLR Workshop on Machine Learning for Materials, 2025

2025
[42]

AI co-scientist: A multi-agent system for scientific discovery.Google DeepMind Technical Report, 2025

Jonas Gottweis et al. AI co-scientist: A multi-agent system for scientific discovery.Google DeepMind Technical Report, 2025

2025
[43]

Metagpt: Meta programming for a multi-agent collaborative framework.International Conference on Learning Representations, 2024

Sirui Hong, Xiang Zheng, Jonathan Chen, Yuhan Cheng, Ceyao Wang, Zili Zhang, Steven Ka Shing Wang, Zhenqing Yao, Bang Wu, Zhuorui Zhou, et al. Metagpt: Meta programming for a multi-agent collaborative framework.International Conference on Learning Representations, 2024

2024
[44]

CAMEL: Communicative agents for “mind” exploration of large language model society.Advances in Neural Information Processing Systems, 36, 2023

Guohao Li, Hasan Abed Al Kader Hammoud, Hadi Itani, Dmitrii Khizbullin, and Bernard Ghanem. CAMEL: Communicative agents for “mind” exploration of large language model society.Advances in Neural Information Processing Systems, 36, 2023

2023
[45]

InternAgent-1.5: A unified agentic framework for long-horizon autonomous scientific discovery

Zekun Feng et al. InternAgent-1.5: A unified agentic framework for long-horizon autonomous scientific discovery. arXiv preprint arXiv:2506.00000, 2026

work page arXiv 2026
[46]

Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions

Mohammad Hasan et al. Security threats in model context protocol: A comprehensive analysis.arXiv preprint arXiv:2503.23278, 2025. 19 APREPRINT- MAY14, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2025
[47]

Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers

Yuxin Hou et al. Mcp server landscape and maintainability analysis.arXiv preprint arXiv:2506.13538, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[48]

Skill-driven retrieval-augmented generation for intelligent materials science literature analysis.Manuscript in preparation, 2026

AlphaAgent Research Team. Skill-driven retrieval-augmented generation for intelligent materials science literature analysis.Manuscript in preparation, 2026

2026
[49]

Retrieval-augmented generation for knowledge-intensive NLP tasks.Advances in Neural Information Processing Systems, 33:9459– 9474, 2020

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Kuttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive NLP tasks.Advances in Neural Information Processing Systems, 33:9459– 9474, 2020

2020
[50]

Gautier Izacard and Edouard Grave. Leveraging passage retrieval with generative models for open domain question answering.Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, pages 874–880, 2021

2021
[51]

Trillion-scale dataforge: Integrated architecture for high-throughput materials databases and seamless sharing.Under review, 2026

Huang Xiaoya, Liu Yu, Shi Shuo, Zhang Yuanyuan, Liang Zengzeng, Zhou Miao, Fu Hanwei, Zheng Lei, and Kang Peng. Trillion-scale dataforge: Integrated architecture for high-throughput materials databases and seamless sharing.Under review, 2026. 20

2026