Discovering Crystal Structure Prediction Algorithms with an AI Co-Scientist

Kiyoung Seong; Nayoung Kim; Sungsoo Ahn

arxiv: 2606.22866 · v1 · pith:U4FSZZZQnew · submitted 2026-06-22 · 💻 cs.LG · cs.AI

Pith reviewed 2026-06-26 09:00 UTC · model grok-4.3

keywords: crystal structure prediction · generative models · masked generative transformer · AI co-scientist · algorithm discovery · cross-domain transfer · materials science

The pith

An AI co-scientist adapts a vision model into MaskGXT to raise crystal structure prediction accuracy on benchmarks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a Human-AI Co-discovery system (HACO) that searches generative modeling approaches across fields and uses sparse human input to adapt promising ones to new scientific tasks. For crystal structure prediction from chemical compositions, HACO selects MaskGIT from vision, reformulates it as a discrete token model of crystals, and adds symmetry tokens, stratified sampling, and coordinate refinement to create MaskGXT. On the MP-20 polymorph split this yields 79.06 percent METRe accuracy against 70.87 percent for the strongest baseline, with top results also on standard MP-20 and MPTS-52 benchmarks. A sympathetic reader would care because the work tests whether cross-domain search plus targeted guidance can accelerate algorithm discovery in settings where validation is cheap and fast.

Core claim

HACO searched across generative modeling methodologies from multiple fields and identified MaskGIT as a promising framework for crystal structure prediction. It instantiated the masked formulation as a discrete token model of crystal structure; guided by sparse high-level human objectives, it added crystallographic symmetry tokens, space group stratified sampling for polymorph coverage, and sub-bin coordinate refinement, yielding MaskGXT. On the MP-20 polymorph split MaskGXT reaches 79.06 percent METRe accuracy compared with 70.87 percent for the strongest evaluated baseline and attains the best match rate on standard MP-20 and MPTS-52 CSP benchmarks.
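A minimal sketch of what a discrete token model with sub-bin coordinate refinement could look like for a single fractional coordinate. The bin count of 64, the function names, and the choice to store the refinement as an offset inside the bin are all assumptions made for illustration; the abstract does not say how MaskGXT tokenizes or refines coordinates.

```python
def quantize(frac: float, n_bins: int = 64) -> tuple[int, float]:
    """Map a fractional coordinate to (bin token, offset inside that bin)."""
    frac = frac % 1.0                      # fractional coordinates are periodic
    if frac >= 1.0:                        # tiny negatives can round up to 1.0
        frac = 0.0
    scaled = frac * n_bins
    token = min(int(scaled), n_bins - 1)   # guard against float round-up
    return token, scaled - token


def dequantize(token: int, offset: float = 0.5, n_bins: int = 64) -> float:
    """Recover a coordinate; offset=0.5 gives the bin centre (no refinement)."""
    return ((token + offset) / n_bins) % 1.0


def periodic_error(a: float, b: float) -> float:
    """Distance between two fractional coordinates on the unit circle."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


x = 0.3137
token, offset = quantize(x)
coarse = dequantize(token)            # token only: error up to half a bin width
refined = dequantize(token, offset)   # token plus sub-bin offset
print(token, periodic_error(coarse, x), periodic_error(refined, x))
```

Without refinement the reconstruction error is bounded by half a bin width (1/128 here), which is presumably the kind of quantization error a sub-bin step is meant to remove.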

What carries the argument

The Human-AI Co-discovery system (HACO), which performs a cross-domain search of generative models and then adapts the chosen one under sparse human steering. Its output here is the Masked Generative Crystal Transformer (MaskGXT).

If this is right

MaskGXT sets the highest reported match rate on the MP-20 polymorph split and on the standard MP-20 and MPTS-52 CSP benchmarks.
Transfer of masked generative modeling principles from vision, when combined with domain-specific tokens and sampling, improves coverage of polymorphs in crystal generation.
In scientific domains that supply cheap, fast, and well-aligned validation metrics, interactive AI systems can identify transferable modeling ideas and combine them with targeted human guidance.
The results supply evidence that cross-domain search plus sparse steering can contribute to scientific algorithm discovery.
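One way to read "space group stratified sampling" is as a budget split: instead of drawing every candidate structure independently, the sampling budget is divided across space groups so that less likely symmetries still get candidates. The sketch below uses proportional allocation with largest-remainder rounding. The probabilities, the function name, and the allocation rule are assumptions for illustration, not the paper's stated procedure.

```python
def stratified_allocation(probs: dict[int, float], budget: int) -> dict[int, int]:
    """Split `budget` samples across space groups in proportion to `probs`.

    Uses largest-remainder rounding so the counts sum exactly to `budget`.
    """
    total = sum(probs.values())
    quotas = {sg: budget * p / total for sg, p in probs.items()}
    alloc = {sg: int(q) for sg, q in quotas.items()}
    leftover = budget - sum(alloc.values())
    # Hand the remaining samples to the strata with the largest fractional part.
    by_remainder = sorted(quotas, key=lambda sg: quotas[sg] - alloc[sg], reverse=True)
    for sg in by_remainder[:leftover]:
        alloc[sg] += 1
    return alloc


# Hypothetical space-group probabilities for one composition (keys are
# international space group numbers; the values are made up).
probs = {225: 0.5, 62: 0.3, 194: 0.15, 2: 0.05}
print(stratified_allocation(probs, budget=20))
```

Each space group would then receive its allotted number of generations, conditioned on the corresponding symmetry token, so a rare polymorph is not left to chance.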

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same search-and-steer loop could be tested on other discrete-structure generation tasks such as molecular conformer prediction where fast validation oracles exist.
If the performance edge persists across multiple independent re-implementations, it would strengthen the case that the co-discovery workflow itself, rather than any single added token, drives the improvement.
Extending HACO to propose and evaluate several candidate adaptations in parallel might further reduce the human steering needed per task.

Load-bearing premise

The accuracy gains come from the cross-domain search and sparse human steering rather than from routine hyperparameter tuning or standard transformer implementation choices.

What would settle it

Re-implementing the MaskGIT-to-crystal adaptation using only conventional machine-learning engineering without the HACO search process or the listed human-guided additions, then checking whether the 79.06 percent METRe accuracy on the MP-20 polymorph split is still reached.

Original abstract

We introduce Human-AI Co-discovery system (HACO) for scientific algorithm discovery through cross-domain search and sparse human steering. Starting from the goal of generating crystal structures from chemical compositions, HACO searched across generative modeling methodologies from multiple fields and identified MaskGIT, a masked generative model from vision, as a promising framework for crystal structure prediction (CSP). HACO instantiated this masked formulation as a discrete token model of crystal structure; guided by sparse high-level human objectives, it then added crystallographic symmetry tokens, space group stratified sampling for polymorph coverage, and sub-bin coordinate refinement, yielding the Masked Generative Crystal Transformer (MaskGXT). On the MP-20 polymorph split, MaskGXT reaches 79.06% match-everyone-to-reference (METRe) accuracy, compared with 70.87% for the strongest evaluated baseline. MaskGXT also attains the best match rate on standard MP-20 and MPTS-52 CSP benchmarks. These results provide evidence that, in domains offering cheap, fast, and well-aligned validation, transfer-guided interactive AI co-scientists can contribute to scientific algorithm discovery by identifying transferable modeling principles and combining them with targeted human domain guidance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

MaskGXT posts solid benchmark gains on CSP but the HACO process gets credit without ablations to back the causal claim.

The paper's core result is a new model, MaskGXT, that reaches 79.06% METRe on the MP-20 polymorph split and leads on the standard MP-20 and MPTS-52 sets. That is a clear numerical step up from the 70.87% strongest baseline they report.
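For scale, the two reported numbers can be turned into an absolute gain, a relative gain, and a relative reduction in the miss rate. The last line treats 100% minus METRe accuracy as a miss rate, which is an assumption about how the metric is read rather than something the abstract states.

```latex
\begin{align*}
\Delta_{\text{abs}} &= 79.06 - 70.87 = 8.19 \text{ points} \\
\Delta_{\text{rel}} &= \frac{8.19}{70.87} \approx 11.6\% \\
\text{miss-rate reduction} &= \frac{(100 - 70.87) - (100 - 79.06)}{100 - 70.87}
  = \frac{8.19}{29.13} \approx 28.1\%
\end{align*}
```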

They start with MaskGIT from vision, cast crystal structures as discrete tokens, then layer in symmetry tokens, space-group stratified sampling, and sub-bin refinement. Those adaptations are the concrete new pieces for this domain. The abstract lays out the pipeline and the numbers without much fluff.
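For readers who have not met MaskGIT, the decoding loop it introduced for images can be sketched on a toy token sequence: start fully masked, predict every masked slot in parallel, commit only the most confident predictions, and shrink the masked set along a cosine schedule. The predictor below is a random stand-in for a trained transformer, and the sequence length, vocabulary size, and step count are arbitrary; how MaskGXT orders or groups crystal tokens is not described in the abstract.

```python
import math
import random

MASK = -1  # sentinel for a masked position


def cosine_schedule(step: int, total_steps: int, length: int) -> int:
    """Number of positions that remain masked after `step` (1-indexed)."""
    return math.floor(length * math.cos(math.pi / 2 * step / total_steps))


def toy_predictor(tokens, vocab_size, rng):
    """Stand-in for the model: a random (token, confidence) per masked slot."""
    return {i: (rng.randrange(vocab_size), rng.random())
            for i, t in enumerate(tokens) if t == MASK}


def iterative_decode(length=12, vocab_size=8, total_steps=4, seed=0):
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(1, total_steps + 1):
        preds = toy_predictor(tokens, vocab_size, rng)
        keep_masked = cosine_schedule(step, total_steps, length)
        n_commit = len(preds) - keep_masked
        # Commit the most confident predictions; the rest stay masked.
        ranked = sorted(preds, key=lambda i: preds[i][1], reverse=True)
        for i in ranked[:n_commit]:
            tokens[i] = preds[i][0]
    return tokens


print([cosine_schedule(s, 4, 12) for s in range(1, 5)])  # masked count per step
print(iterative_decode())
```

The schedule commits few tokens early, when context is scarce, and many late. For crystals the analogous sequence would hold lattice, atom-type, coordinate, and symmetry tokens rather than image patches.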

The weak part is the attribution. The authors say HACO's cross-domain search plus sparse steering produced the model, yet the text gives no ablation that separates the effect of those specific choices from ordinary hyperparameter work on the same token setup. No error bars, no exact baseline re-implementations, and no data-split details appear in the abstract. Without those controls the performance lift stands, but the story that the co-scientist process drove it does not.

The work is aimed at groups that already track generative models for materials or that want to test AI-assisted algorithm search in domains with cheap validation. A reader who needs the full methods and any extra experiments would get something usable from it.

I would send it out for review so referees can check the implementation and request the missing ablations.

Referee Report

2 major / 2 minor

Summary. The paper introduces the Human-AI Co-discovery system (HACO) that performs cross-domain search over generative modeling methods and, with sparse human steering, adapts MaskGIT into MaskGXT for crystal structure prediction by adding symmetry tokens, space-group stratified sampling, and sub-bin refinement. It reports that MaskGXT achieves 79.06% match-everyone-to-reference (METRe) accuracy on the MP-20 polymorph split (vs. 70.87% for the strongest baseline) and the best match rates on standard MP-20 and MPTS-52 CSP benchmarks, arguing this demonstrates the value of transfer-guided interactive AI co-scientists in domains with fast validation.

Significance. If the reported gains can be causally attributed to the HACO process, the work would provide a concrete example of algorithmic discovery via cross-domain transfer plus targeted domain guidance, with potential applicability to other scientific fields offering cheap, aligned validation oracles. The explicit benchmark numbers and focus on a well-defined task (CSP) make the result falsifiable and potentially reproducible if code and splits are released.

major comments (2)

[Abstract and §4 (Results)] The central performance claim (79.06% METRe on the MP-20 polymorph split) is presented without error bars, statistical significance tests, or details on exact baseline implementations and data splits, so it is impossible to assess whether the 8.19-point lift over the 70.87% baseline is robust or could arise from standard hyperparameter search on the same discrete-token formulation.
[§3 (Method) and §4] No ablation studies isolate the contribution of the HACO-identified elements (symmetry tokens, space-group stratified sampling, sub-bin coordinate refinement) from what a conventional transformer hyperparameter sweep on the base MaskGIT discrete-token model would achieve; this ablation is load-bearing for the claim that the co-scientist process itself produced the improvement.
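The kind of check the first comment asks for can be sketched as a paired bootstrap over test compositions, which gives a percentile confidence interval for the difference in match rate between two models scored on the same items. The per-item hit vectors below are synthetic placeholders chosen only to resemble the reported rates; they are not the paper's data, and the function name and defaults are assumptions.

```python
import random


def paired_bootstrap_ci(model_hits, baseline_hits, n_resamples=500,
                        alpha=0.05, seed=0):
    """Percentile CI for the match-rate difference on the same test items."""
    assert len(model_hits) == len(baseline_hits)
    rng = random.Random(seed)
    n = len(model_hits)
    diffs = []
    for _ in range(n_resamples):
        # Resample test items with replacement, keeping the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(model_hits[i] - baseline_hits[i] for i in idx) / n)
    diffs.sort()
    lo = diffs[int(alpha / 2 * n_resamples)]
    hi = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Synthetic 0/1 outcomes for 1000 hypothetical test items.
model_hits = [1] * 790 + [0] * 210
baseline_hits = [1] * 709 + [0] * 291
lo, hi = paired_bootstrap_ci(model_hits, baseline_hits)
print(f"95% CI for the difference in match rate: [{lo:.3f}, {hi:.3f}]")
```

An interval that excludes zero would address sampling noise on the test set; variance across training seeds, which the comment also raises, needs separate runs.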

minor comments (2)

[Abstract] The acronym expansion for HACO appears only after first use; spelling it out on first mention would improve readability.
[Abstract] Notation for METRe is introduced without an explicit equation or reference to its definition in prior CSP literature.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which help clarify the presentation of our results. We address each major comment below.

Referee: [Abstract and §4 (Results)] The central performance claim (79.06% METRe on the MP-20 polymorph split) is presented without error bars, statistical significance tests, or details on exact baseline implementations and data splits, so it is impossible to assess whether the 8.19-point lift over the 70.87% baseline is robust or could arise from standard hyperparameter search on the same discrete-token formulation.

Authors: We agree that the current presentation would benefit from additional statistical details. In the revised manuscript we will report error bars from multiple runs with different random seeds, include statistical significance tests comparing MaskGXT to the baseline, and expand the experimental section with precise descriptions of baseline implementations (including hyperparameter ranges explored) and the exact train/validation/test splits for the MP-20 polymorph setting. revision: yes

Referee: [§3 (Method) and §4] No ablation studies isolate the contribution of the HACO-identified elements (symmetry tokens, space-group stratified sampling, sub-bin coordinate refinement) from what a conventional transformer hyperparameter sweep on the base MaskGIT discrete-token model would achieve; this ablation is load-bearing for the claim that the co-scientist process itself produced the improvement.

Authors: The 70.87% baseline already reflects the strongest performance obtainable from standard adaptations of MaskGIT to discrete tokens (including hyperparameter tuning) without the domain-specific modifications identified by HACO. Symmetry tokens and space-group stratified sampling are not components that arise from a conventional hyperparameter sweep on the base architecture; they were introduced only after the co-scientist process highlighted transferable principles from vision and crystallography. We will clarify this distinction in §3 and add a limited ablation table in the revision showing performance when each HACO-derived component is removed individually. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected

The paper reports an empirical discovery process (HACO identifying MaskGIT and adding domain-specific modifications) whose central claims are performance numbers on external benchmarks (MP-20 polymorph split, standard MP-20, MPTS-52). These are measured against independent baselines and datasets rather than being derived from quantities defined inside the paper. No equations, self-citations, or fitted parameters are presented as load-bearing derivations that reduce to the inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

Abstract-only review yields limited visibility into parameters or assumptions; the ledger reflects only elements explicitly named in the provided text.

axioms (1)

domain assumption: Masked generative models from vision can be instantiated as discrete token models for crystal structures
The paper states that HACO identified MaskGIT and instantiated it as a discrete token model of crystal structure.

invented entities (2)

HACO (no independent evidence)
purpose: Human-AI Co-discovery system for algorithm search
Introduced as the overall framework that performs cross-domain search and sparse steering.

MaskGXT (no independent evidence)
purpose: Masked Generative Crystal Transformer model
The final instantiated model after adding symmetry tokens and sampling strategies.

pith-pipeline@v0.9.1-grok · 5740 in / 1301 out tokens · 35002 ms · 2026-06-26T09:00:37.040071+00:00 · methodology
