Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence

Fiona Y. Wang; Markus J. Buehler

arXiv: 2606.01444 · v1 · pith:GZOTEYPNnew · submitted 2026-05-31 · 💻 cs.AI · cond-mat.mtrl-sci · cs.CL · cs.LG · math.CT

Pith reviewed 2026-06-28 16:49 UTC · model grok-4.3

classification 💻 cs.AI · cond-mat.mtrl-sci · cs.CL · cs.LG · math.CT

keywords category theory · scientific discovery · agentic AI · left Kan extension · regime transition · copresheaf · materials science · self-revising systems

The pith

Scientific discovery is a verified regime transition between schema categories, with old states transported by left Kan extension to expose residuals beyond functorial preservation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that discovery requires changing the schema category that defines how evidence and operations are typed, rather than operating inside a fixed regime. Inside one regime the state is a copresheaf on the schema, and updates are endofunctorial only when provenance-preserving refinements are specified and preserved. Crossing regimes uses a functor u whose left Kan extension carries prior artifacts forward, so that whatever the post-transition state contains beyond the transported artifacts counts as discovered content. Two concrete systems illustrate the distinction: one revises a protein-mechanics model under a minimum-description-length gate, the other builds a proof-carrying graph for fiber-network surrogates that records rejected alternatives and the accepted anisotropic stiffness law. A reader would care because the framework supplies an objective criterion that separates retrieval and search from discovery while remaining executable inside agentic AI.

Core claim

In a fixed regime b the system state is the copresheaf I_t on schema S_b with provenance given by the category of elements; discovery occurs precisely when a verified transition u: S_b → S_b' is performed, the prior state is transported by the left Kan extension Lan_u I_t, and the post-transition state is compared to isolate residual content that cannot be explained by the transported artifacts.
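For reference, the standard pointwise formula for a left Kan extension of a Set-valued functor, and the adjunction that produces a comparison map, can be written as below. The symbols I' (the post-transition state), ρ, and Res are notation assumed here for illustration; the residual set in the last line is one plausible reading of "content beyond functorial transport", not a formula quoted from the paper.

```latex
% Pointwise left Kan extension of I_t : S_b -> Set along u : S_b -> S_{b'}
(\mathrm{Lan}_u I_t)(s') \;\cong\;
  \operatorname*{colim}_{(s,\; f\colon u(s)\to s') \,\in\, (u \downarrow s')} I_t(s)

% Adjunction between Lan_u and restriction along u
\mathrm{Nat}(\mathrm{Lan}_u I_t,\; I') \;\cong\; \mathrm{Nat}(I_t,\; I' \circ u)

% Comparison map induced by \rho : I_t \Rightarrow I' \circ u, and an assumed residual
\bar\rho \colon \mathrm{Lan}_u I_t \Rightarrow I', \qquad
\mathrm{Res}(s') \;=\; I'(s') \setminus \operatorname{im}\bigl(\bar\rho_{s'}\bigr)
```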

What carries the argument

Left Kan extension Lan_u along a schema functor u, which transports copresheaf states across regime boundaries while provenance (category of elements) records what is preserved.
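The transport-then-compare step can be illustrated in the simplest possible setting. The sketch below assumes both schemas are discrete categories (types only, no non-identity operations), in which case the left Kan extension reduces to a disjoint union over the types that u sends to each new type. The type names, artifact names, and the `rho` comparison map are hypothetical toy data, not taken from the paper.

```python
from collections import defaultdict

def lan_discrete(u, state):
    """Left Kan extension along u for DISCRETE schemas (an assumed simplification).

    u:     dict mapping each old type to a new type
    state: dict mapping each old type to a set of artifacts
    Returns new_type -> set of (old_type, artifact) pairs, i.e. the disjoint
    union of state[s] over all old types s with u[s] == new_type.
    """
    out = defaultdict(set)
    for old_type, artifacts in state.items():
        for a in artifacts:
            out[u[old_type]].add((old_type, a))  # the tag keeps the union disjoint
    return dict(out)

def residual(transported, new_state, rho):
    """Artifacts of new_state that are not in the image of the comparison map."""
    res = {}
    for new_type, artifacts in new_state.items():
        image = {rho(tagged) for tagged in transported.get(new_type, set())}
        res[new_type] = set(artifacts) - image
    return res

# Hypothetical toy regime transition.
u = {"Scalar": "Tensor", "Count": "Tensor", "Plot": "Plot"}
old_state = {"Scalar": {"E_iso"}, "Count": {"n_fibers"}, "Plot": {"fig_a"}}
new_state = {
    "Tensor": {"E_iso", "n_fibers", "A_orient"},
    "Plot": {"fig_a"},
    "Gate": {"aic_record"},
}

transported = lan_discrete(u, old_state)
res = residual(transported, new_state, rho=lambda tagged: tagged[1])
print(res)
```

In this toy run the residual is non-empty only where the new state holds something the old state could not supply. With non-trivial operations in the schema the disjoint union would have to be replaced by a genuine colimit (a quotient of that union), which this sketch does not attempt.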

If this is right

In the Builder/Breaker system the accepted law is mode-conditioned compliance: within-chain flexibility expressed as all-mode elastic compliance conditioned by slow collective-mode participation.
In CategoryScienceClaw the accepted fiber-network model is an orientation-tensor anisotropic stiffness surrogate, chosen over a rejected isotropic fiber-count descriptor after an AIC gate and perturbation tests.
The same machinery separates retrieval (no regime change), search (regime-preserving queries), and discovery (verified regime transition with residual detection).
Both systems produce a proof-carrying knowledge-computation graph that records candidate models, rejected alternatives, gates, and accepted laws.
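An AIC gate of the kind named in the list above can be sketched as follows. This is a minimal illustration that assumes Gaussian least-squares fits, for which AIC equals n·ln(RSS/n) + 2k up to an additive constant. The candidate names echo the summary, but the residual sums of squares, parameter counts, and sample size are invented numbers, and the function names are hypothetical.

```python
import math

def aic_least_squares(rss, n, k):
    """AIC for a Gaussian least-squares fit, up to a model-independent constant."""
    return n * math.log(rss / n) + 2 * k

def aic_gate(candidates, n):
    """Score every candidate and accept the one with the lowest AIC.

    candidates: dict name -> (residual sum of squares, number of parameters)
    Returns (accepted name, dict of scores, sorted list of rejected names).
    """
    scores = {name: aic_least_squares(rss, n, k)
              for name, (rss, k) in candidates.items()}
    accepted = min(scores, key=scores.get)
    rejected = sorted(name for name in scores if name != accepted)
    return accepted, scores, rejected

# Invented numbers for illustration only.
candidates = {
    "isotropic_fiber_count": (12.0, 2),   # worse fit, fewer parameters
    "orientation_tensor": (4.0, 4),       # better fit, more parameters
}
accepted, scores, rejected = aic_gate(candidates, n=50)
print(accepted, rejected)
```

Returning the rejected names alongside the accepted one mirrors the idea that the graph records rejected alternatives rather than only the winner.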

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same transition-plus-residual test could be applied to non-materials domains such as chemistry or biology if the schema categories are supplied.
An AI that maintains an explicit category of elements for provenance could audit its own history of regime changes without external labels.
If the residual test is made computable, it supplies a concrete objective function for training agents to seek discovery rather than reward maximization inside a fixed language.
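To make the provenance idea in the second extension concrete, the sketch below builds the category of elements of a small finite copresheaf: one node per (type, artifact) pair and one edge per generating operation applied to an artifact. Only generating arrows are listed; identities and composites are implied. All type, operation, and artifact names are hypothetical.

```python
def category_of_elements(objects, arrows, state, action):
    """Build the generating graph of the category of elements of a copresheaf.

    objects: iterable of schema types
    arrows:  dict arrow_name -> (source_type, target_type)
    state:   dict type -> set of artifacts (the copresheaf on objects)
    action:  dict arrow_name -> dict artifact -> artifact (the copresheaf on arrows)
    Returns (nodes, edges). Nodes are (type, artifact) pairs; an edge
    (name, (s, x), (t, y)) records that operation `name` sends x to y.
    """
    nodes = {(s, x) for s in objects for x in state.get(s, set())}
    edges = set()
    for name, (src, dst) in arrows.items():
        for x in state.get(src, set()):
            y = action[name][x]
            if y not in state.get(dst, set()):
                raise ValueError(f"{name} sends {x!r} outside I({dst})")
            edges.add((name, (src, x), (dst, y)))
    return nodes, edges

# Hypothetical toy schema and state.
objects = ["Structure", "ModeSet", "Prediction"]
arrows = {
    "normal_modes": ("Structure", "ModeSet"),
    "predict": ("ModeSet", "Prediction"),
}
state = {
    "Structure": {"pdb_1", "pdb_2"},
    "ModeSet": {"modes_1", "modes_2"},
    "Prediction": {"bfac_1", "bfac_2"},
}
action = {
    "normal_modes": {"pdb_1": "modes_1", "pdb_2": "modes_2"},
    "predict": {"modes_1": "bfac_1", "modes_2": "bfac_2"},
}

nodes, edges = category_of_elements(objects, arrows, state, action)
print(len(nodes), len(edges))
```

Following edges backwards from a node such as ("Prediction", "bfac_1") recovers the artifacts and operations it was derived from, which is the sense in which this graph serves as a provenance record.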

Load-bearing premise

Category-theoretic structures such as copresheaves, categories of elements, and left Kan extensions can be instantiated directly in working AI systems as a complete model of scientific discovery.

What would settle it

A working implementation of the framework that, on a documented materials discovery task, either fails to flag a human-recognized discovery or incorrectly labels transported content as residual.

Figures

Figures reproduced from arXiv: 2606.01444 by Fiona Y. Wang, Markus J. Buehler.

**Figure 1.** Retrieval, search, and discovery are structurally different operations. Retrieval adds an already representable artifact. Search finds a new path or object inside a fixed schema. Discovery changes the regime in which artifacts and operations are typed. Yet the central question for these systems remains underformalized. Existing AI scientists are extraordinarily fluent at recombining, optimizing, and reform…

**Figure 2.** A fixed regime has a schema category S_b of types and operations. A copresheaf I_t : S_b → Set assigns actual artifacts to each type. The category of elements ∫ I_t is the realized typed artifact DAG. 2 Results and Discussion 2.1 Agentic discovery systems are typed artifact systems An agentic discovery system is best understood as a typed artifact system. Its persistent state is not a conversation transcript, …

**Figure 3.** Fixed-regime operation is the update Φ_b inside a schema S_b. A committed fixed-regime step is represented by a refinement δ_t : I_t → I_{t+1} associated with the object-level update I_{t+1} = Φ_b(I_t). The lower dashed arrow is Lan_u(δ_t) : Lan_u I_t → Lan_u I_{t+1}, not an independent new-regime dynamics. Thus the left square commutes by functoriality of Lan_u on refinement morphisms. Discovery enters through the comparison ma…

**Figure 4.** Kan-transport audit of the Builder/Breaker protein-mechanics run. (A) A verified transition transports the old artifact state by Lan_u and compares it with the accepted new state by ρ̄; residual content records what is added beyond functorial transport. (B) In the final transition, log-compliance and shifted ReLU mode participation are generator-reachable transformations of old physics-derived quantities, w…

**Figure 5.** Parsimonious scaling of the discovered world model. (A) Evolution of the world-model DAG across discovery iterations (0–3), read left to right through four stages: inputs (observables), factors (nonlinear transforms, e.g. thresholded ReLU terms), features (terms entering the linear predictor), and the target (B-factor, z-scored). Nodes are colored by stage, edges tinted by source, and nodes new to an itera…

**Figure 6.** Inner MDL-guided search within a discovery iteration (data from [10]). (A) Hill-climb frontier: total description length (bits) versus proposal step. Faint grey points are rejected proposals; the stepped teal curve is the best-so-far frontier through the accepted moves (numbered markers), which together reduce the description length by 337.6 bits over 16 accepted moves (from 2354.1 to 2016.5 bits). (B) Led…

**Figure 7.** Anatomy of the MDL gate across the discovery run (data from [10]). (A) Gate selectivity by proposal operator: the number of proposals accepted versus proposed, aggregated over all iterations. Of 388 proposals only 25 are accepted (6.4%), and the acceptance rate is strongly operator-dependent (structure-recombining moves survive most often (seed 21%, swap 11%) while bare feature additions rarely do (add 3%)…

**Figure 8.** Feature lifecycle across the discovery run, computed from the accepted moves of each iteration’s inner search. Each horizontal bar is a feature slot, with markers for its introduction (born, by an add or seed move), factor swaps, threshold tunings, and removal. Slots present at the end of an iteration are kept (teal); slots removed before then are retracted (grey), with the model commitment dropped while i…

**Figure 9.** ScienceClaw × Infinite as a distributed typed artifact system. ScienceClaw executes typed skill compositions and records immutable lineage; the ArtifactReactor and mutation layer coordinate active search; Infinite turns computational artifacts into public scientific discourse with feedback that can re-enter the discovery loop. 2.7 CategoryScienceClaw fiber-network mechanics as a typed discovery graph Categ…

**Figure 10.** CategoryScienceClaw fiber-network mechanics figure. The figure renders the typed path from a fiber-network mechanics question to typed inputs, candidate models, an accepted orientation-tensor anisotropic stiffness surrogate, a rejected isotropic fiber-count descriptor, an AIC gate, perturbation stress test, regime-transition record, and synthesized scientific report. The result supports anisotropic mechan…

Original abstract

Scientific discovery is not only answer generation but revision of the representational regime in which evidence, artifacts, operations, and verifiers are typed. We develop a category-theoretic account of agentic discovery for materials science. In a fixed regime b with schema category S_b, the system state is a copresheaf I_t: S_b -> Set, and provenance is the category of elements \int_{S_b} I_t. Fixed-regime operation is an update on such states, endofunctorial only when provenance-preserving refinements are specified and preserved. Discovery is instead a verified regime transition u: S_b -> S_b': old artifacts are preserved, transported by the left Kan extension Lan_u I_t, and compared with the post-transition state to identify residual content beyond functorial transport. This separates retrieval, search, and discovery without subjective novelty. We instantiate the framework in two systems. In Builder/Breaker, a protein-mechanics world model is revised under a Minimum Description Length gate; the accepted law expresses within-chain flexibility as all-mode elastic compliance conditioned by slow collective-mode participation, or mode-conditioned compliance. In CategoryScienceClaw, typed skills, artifacts, open needs, workflow mutation, gates, stress tests, and public discourse become a proof-carrying knowledge-computation graph. A fiber-network example records candidate models, rejected alternatives, an AIC gate, perturbation tests, and an accepted orientation-tensor anisotropic stiffness surrogate over an isotropic fiber-count descriptor. Together, the cases show how category theory can be both a mathematical language for discovery and an engineering specification for self-revising AI discovery systems.
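The Minimum Description Length gate mentioned in the abstract can be pictured as a greedy accept/reject loop. The sketch below is an assumption-laden illustration, not the authors' gate: it uses a crude two-part code length (a BIC-style approximation, up to a constant shared by all models), and the proposal names, residual sums of squares, and parameter counts are invented.

```python
import math

def description_length_bits(rss, n, k):
    """Crude two-part code length in bits, up to a model-independent constant.

    data term:  (n / 2) * log2(rss / n)  for Gaussian residuals
    model term: (k / 2) * log2(n)        for k real-valued parameters
    """
    return 0.5 * n * math.log2(rss / n) + 0.5 * k * math.log2(n)

def mdl_hill_climb(start, proposals, n):
    """Greedy gate: a proposal is accepted only if it shortens the description.

    start and each proposal are (name, rss, k) tuples.
    Returns (best name, best code length, ledger of (name, bits, accepted)).
    """
    best_name, rss, k = start
    best_bits = description_length_bits(rss, n, k)
    ledger = []
    for name, rss, k in proposals:
        bits = description_length_bits(rss, n, k)
        accepted = bits < best_bits
        ledger.append((name, bits, accepted))
        if accepted:
            best_name, best_bits = name, bits
    return best_name, best_bits, ledger

# Invented numbers for illustration only.
start = ("baseline", 180.0, 2)
proposals = [
    ("add_feature", 178.0, 3),     # tiny fit gain, one extra parameter
    ("swap_factor", 150.0, 3),     # large fit gain, one extra parameter
    ("add_many_terms", 149.5, 6),  # tiny further gain, three more parameters
]
best_name, best_bits, ledger = mdl_hill_climb(start, proposals, n=200)
print(best_name, [accepted for _, _, accepted in ledger])
```

Keeping the full ledger, including rejected proposals, is a design choice that matches the emphasis on recording rejected alternatives alongside accepted moves.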

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper frames discovery as a regime shift via left Kan extension on copresheaves to isolate residuals, but the two case studies describe outcomes without showing the actual categorical constructions or computations.

The main takeaway is a category-theoretic specification for agentic discovery systems that treats discovery as a verified transition between schema categories, using left Kan extension to transport prior state and flag residual content as the discovery step. This aims to make the distinction from retrieval and search objective rather than subjective.

What stands out is the attempt to give an engineering blueprint using copresheaves, category of elements for provenance, and Kan extensions for transport. The two instantiations—one revising a protein-mechanics model under MDL to accept mode-conditioned compliance, the other building a fiber-network model with AIC gate and orientation-tensor surrogate—illustrate how gates and stress tests fit into a larger workflow. That concrete mapping from abstract operations to materials examples is useful for readers thinking about implementable self-revising agents.

The soft spot is that the load-bearing claim never gets demonstrated in the cases. The abstract and descriptions stay at the level of natural-language results and gate decisions; they do not exhibit the schema categories S_b and S_b', the copresheaf I_t, the transition functor u, the explicit Lan_u computation, or the residual comparison. Without those steps shown, the separation of discovery from search remains asserted rather than verified. The circularity risk noted in the definition is real here: discovery is defined via the framework's own operations, so external validation is needed to confirm it tracks actual scientific progress.

This is for people working on formal methods or category theory in AI for science, especially materials discovery workflows. It deserves a serious referee because the framing is original and the engineering intent is clear, even if the current evidence is thin and the central mechanism needs explicit construction in the full text.

Referee Report

1 major / 0 minor

Summary. The paper develops a category-theoretic framework for agentic AI discovery in science, modeling a fixed regime via schema category S_b with state as copresheaf I_t: S_b → Set and provenance as the category of elements. Discovery is defined as a verified regime transition u: S_b → S_b' in which old artifacts are preserved and transported by the left Kan extension Lan_u I_t, with residual content beyond this transport identified as the discovery. The framework is instantiated in Builder/Breaker (protein-mechanics revision under an MDL gate yielding a mode-conditioned compliance law) and CategoryScienceClaw (fiber-network example with AIC gate, perturbation tests, and an accepted anisotropic stiffness surrogate).

Significance. If the central mechanism can be made explicit, the framework would supply a formal, non-subjective criterion for distinguishing discovery from retrieval and search, using copresheaves, provenance, and Kan extensions as both a mathematical language and an engineering specification for self-revising systems. The two case studies illustrate concrete scientific outcomes (accepted laws and rejected alternatives) that could serve as test cases for the approach.

major comments (1)

[Abstract and instantiations] Abstract (instantiations of Builder/Breaker and CategoryScienceClaw): the load-bearing claim is that discovery equals residual content after functorial transport of I_t by Lan_u along regime transition u. The descriptions supply only natural-language outcomes, gates (MDL, AIC), and final laws; they exhibit neither the schema categories S_b and S_b', the copresheaf I_t, the functor u, the left Kan extension Lan_u I_t, nor the explicit comparison that isolates the residual. Without these constructions the separation of discovery from search/retrieval remains an assertion rather than a demonstrated property of the framework.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive critique. The major comment correctly identifies that the abstract and case-study descriptions do not exhibit the explicit categorical data. We will revise the manuscript accordingly.

Referee: [Abstract and instantiations] Abstract (instantiations of Builder/Breaker and CategoryScienceClaw): the load-bearing claim is that discovery equals residual content after functorial transport of I_t by Lan_u along regime transition u. The descriptions supply only natural-language outcomes, gates (MDL, AIC), and final laws; they exhibit neither the schema categories S_b and S_b', the copresheaf I_t, the functor u, the left Kan extension Lan_u I_t, nor the explicit comparison that isolates the residual. Without these constructions the separation of discovery from search/retrieval remains an assertion rather than a demonstrated property of the framework.

Authors: We agree that the abstract and the natural-language summaries of the two instantiations do not display the concrete objects S_b, S_b', I_t, u, Lan_u I_t or the residual comparison. The framework section supplies the general definitions, but the case studies do not instantiate them. In the revised manuscript we will add explicit constructions for both Builder/Breaker and CategoryScienceClaw, including the schema categories, the copresheaf state, the transition functor, the computed left Kan extension, and the explicit residual identified as discovery. This will convert the separation claim into a demonstrated computation. revision: yes

Circularity Check

1 step flagged

Discovery defined as residual after Lan_u transport, making separation from search/retrieval true by construction

specific steps

self-definitional [Abstract]
"Discovery is instead a verified regime transition u: S_b -> S_b': old artifacts are preserved, transported by the left Kan extension Lan_u I_t, and compared with the post-transition state to identify residual content beyond functorial transport. This separates retrieval, search, and discovery without subjective novelty."

The separation of discovery from retrieval/search 'without subjective novelty' is asserted as a consequence of identifying residual content beyond Lan_u transport, but this separation holds exactly by the paper's definition of discovery as that residual; the result is equivalent to the definitional premise rather than a derived property.

full rationale

The paper proposes a categorical framework in which discovery is explicitly defined using regime transitions, copresheaves, and left Kan extensions. The central claim that this 'separates retrieval, search, and discovery without subjective novelty' reduces directly to that definitional choice rather than an independent derivation or external benchmark. The two case studies describe outcomes at the level of natural-language laws and gates but do not exhibit the required schema categories, copresheaf I_t, functor u, or explicit Lan_u computation, leaving the separation unverified beyond the framework's own terms. No fitted-input predictions, self-citation chains, or imported uniqueness theorems appear in the provided text, so circularity is confined to the self-definitional core. The framework remains a coherent modeling proposal but does not derive its key separation property from anything external to the chosen categorical operations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The framework rests on standard category theory plus domain assumptions about its applicability to discovery; no free parameters or new physical entities are introduced in the abstract.

axioms (1)

domain assumption: Category theory supplies a suitable formal language for representing and revising scientific regimes, evidence, and operations.
The entire construction (schema categories, copresheaves, left Kan extensions for transport) is built on this premise.

invented entities (1)

verified regime transition via left Kan extension (no independent evidence)
purpose: To model discovery as preservation of artifacts plus detection of residual content beyond functorial transport.
Introduced as the core mechanism separating discovery from retrieval.

pith-pipeline@v0.9.1-grok · 5840 in / 1485 out tokens · 63762 ms · 2026-06-28T16:49:52.561039+00:00 · methodology

Reference graph

Works this paper leans on

72 extracted references · 10 canonical work pages

[1]

Buehler, M. J. Generative Retrieval-Augmented Ontologic Graph and Multiagent Strategies for Inter- pretive Large Language Model-Based Materials Design.ACS Engineering Au4, 241–277 (2024). URL https://doi.org/10.1021/acsengineeringau.3c00058

work page doi:10.1021/acsengineeringau.3c00058 2024
[2]

& Buehler, M

Ni, B. & Buehler, M. J. MechAgents: Large language model multi-agent collaborations can solve mechanics problems, generate new data, and integrate knowledge.Extreme Mechanics Letters67, 102131 (2024)

2024
[3]

& Buehler, M

Ghafarollahi, A. & Buehler, M. J. ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning.Digital Discovery3, 1389–1409 (2024)

2024
[4]

& Buehler, M

Ghafarollahi, A. & Buehler, M. J. Sparks: Multi-agent artificial intelligence model discovers protein design principles.arXiv preprint(2025). ArXiv:2504.19017

arXiv 2025
[5]

& Buehler, M

Ghafarollahi, A. & Buehler, M. J. SciAgents: Automating Scientific Discovery Through Bioinspired Multi-Agent Intelligent Graph Reasoning.Advanced Materials37, 2413523 (2025). URL https: //doi.org/10.1002/adma.202413523

work page doi:10.1002/adma.202413523 2025
[6]

ArXiv:2408.06292

Lu, C.et al.The AI scientist: Towards fully automated open-ended scientific discovery.arXiv preprint (2024). ArXiv:2408.06292

Pith/arXiv arXiv 2024
[7]

URL https://arxiv.org/abs/2504.08066

Yamada, Y.et al.The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search.arXiv preprint arXiv:2504.08066(2025). URL https://arxiv.org/abs/2504.08066. 2504.08066

Pith/arXiv arXiv 2025
[8]

InAdvances in Neural Information Processing Systems (NeurIPS 2025)(2025)

Agarwal, D.et al.AutoDiscovery: Open-ended scientific discovery via bayesian surprise. InAdvances in Neural Information Processing Systems (NeurIPS 2025)(2025). URLhttps://arxiv.org/abs/2507. 00310.2507.00310

arXiv 2025
[9]

Y.et al.Autonomous agents coordinating distributed discovery through emergent artifact exchange.arXiv preprint(2026)

Wang, F. Y.et al.Autonomous agents coordinating distributed discovery through emergent artifact exchange.arXiv preprint(2026). ArXiv:2603.14312

arXiv 2026
[10]

Buehler, M. J. Why We Must Break the World.Integrating Materials and Manufacturing Innovation (in press)(2026)

2026
[11]

Buehler, M. J. MeLM, a generative pretrained language modeling framework that solves forward and inverse mechanics problems.Journal of the Mechanics and Physics of Solids181, 105454 (2023)

2023
[12]

Buehler, M. J. PRefLexOR: Preference-Based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking.npj Artificial Intelligence1(2025). URL https://doi.org/10. 1038/s44387-025-00003-z

2025
[13]

R.The Logic of Scientific Discovery(Hutchinson, London, 1959)

Popper, K. R.The Logic of Scientific Discovery(Hutchinson, London, 1959). English translation of Logik der Forschung

1959
[14]

S.The Structure of Scientific Revolutions(University of Chicago Press, Chicago, 1962)

Kuhn, T. S.The Structure of Scientific Revolutions(University of Chicago Press, Chicago, 1962). 28 Self-Revising Discovery Systems for Science

1962
[15]

Falsification and the methodology of scientific research programmes

Lakatos, I. Falsification and the methodology of scientific research programmes. In Lakatos, I. & Musgrave, A. (eds.)Criticism and the Growth of Knowledge, 91–195 (Cambridge University Press, Cambridge, 1970)

1970
[16]

Mac Lane, S.Categories for the Working Mathematician(Springer, 1971)

1971
[17]

Awodey, S.Category Theory(Oxford University Press, 2010), 2nd edn

2010
[18]

I.Category Theory for the Sciences(MIT Press, 2014)

Spivak, D. I.Category Theory for the Sciences(MIT Press, 2014)

2014
[19]

& Spivak, D

Fong, B. & Spivak, D. I.An Invitation to Applied Category Theory: Seven Sketches in Compositionality (Cambridge University Press, 2019)

2019
[20]

Spivak, D. I. Functorial data migration.Information and Computation217, 31–51 (2012)

2012
[21]

Spivak, D. I. Poly: An Abundant Categorical Setting for Mode-Dependent Dynamics (2020). URL https://arxiv.org/abs/2005.01894.2005.01894

arXiv 2020
[22]

Spivak, D. I. Learners’ languages.arXiv preprint arXiv:2103.01189(2021)

arXiv 2021
[23]

I., Giesa, T., Wood, E

Spivak, D. I., Giesa, T., Wood, E. & Buehler, M. J. Category theoretic analysis of hierarchical protein materials and social networks.PLoS ONE6, e23911 (2011)

2011
[24]

Giesa, T., Spivak, D. I. & Buehler, M. J. Reoccurring patterns in hierarchical protein materials and music: The power of analogies.BioNanoScience1, 153–161 (2011)

2011
[25]

Giesa, T., Spivak, D. I. & Buehler, M. J. Category theory based solution for the building block replacement problem in materials design.Advanced Engineering Materials14, 810–817 (2012)

2012
[26]

Buehler, M. J. FieldPerceiver: Domain agnostic transformer model to predict multiscale physical fields and nonlinear material properties through neural ologs.Materials Today57, 9–25 (2022)

2022
[27]

Buehler, M. J. From Atoms to Swarms: The Categorical Spine of Multiscale Materials Modeling and Autonomous Discovery (2026). URLhttps://doi.org/10.26434/chemrxiv.15002850/v1. Preprint, version 1

work page doi:10.26434/chemrxiv.15002850/v1 2026
[28]

Goethe, J. W. v.Versuch die Metamorphose der Pflanzen zu erklaeren(Carl Wilhelm Ettinger, Gotha, 1790). English title: The Metamorphosis of Plants
[29]

Cranford, S. W. & Buehler, M. J. Materiomics: Biological Protein Materials, from Nano to Macro. Nanotechnology, Science and Applications3, 127–148 (2010). URLhttps://doi.org/10.2147/NSA. S9037

work page doi:10.2147/nsa 2010
[30]

Sampling-Based Risk-Aware Path Planning Around Dynamic Engagement Zones,

Lee, N. A., Shen, S. C. & Buehler, M. J. An Automated Biomateriomics Platform for Sustainable Programmable Materials Discovery.Matter5, 3597–3613 (2022). URLhttps://doi.org/10.1016/j. matt.2022.10.003

work page doi:10.1016/j 2022
[31]

Fish, J., Wagner, G. J. & Keten, S. Mesoscopic and Multiscale Modelling in Materials.Nature Materials 20, 774–786 (2021). URLhttps://doi.org/10.1038/s41563-020-00913-0

work page doi:10.1038/s41563-020-00913-0 2021
[32]

Buehler, M. J. & Genin, G. M. Integrated multiscale biomaterials experiment and modelling: a perspective. Interface Focus6, 20150098 (2016). URLhttp://rsfs.royalsocietypublishing.org/lookup/doi/ 10.1098/rsfs.2015.0098

work page doi:10.1098/rsfs.2015.0098 2016
[33]

E., Webb, M

Jackson, N. E., Webb, M. A. & de Pablo, J. J. Recent Advances in Machine Learning Towards Multiscale Soft Materials Design.Current Opinion in Chemical Engineering23, 106–114 (2019). URL https://doi.org/10.1016/j.coche.2019.03.005

work page doi:10.1016/j.coche.2019.03.005 2019
[34]

& Govindjee, S.Continuum Mechanics of Solids

Anand, L. & Govindjee, S.Continuum Mechanics of Solids. Oxford Graduate Texts (Oxford University Press, 2020)

2020
[35]

Shi, M., Jiao, Q., Yin, T., Vlassak, J. J. & Suo, Z. Hydrolysis Embrittles Poly(lactic Acid).MRS Bulletin 48, 45–55 (2023). URLhttps://doi.org/10.1557/s43577-022-00368-5

work page doi:10.1557/s43577-022-00368-5 2023
[36]

Tang, H., Buehler, M. J. & Moran, B. A constitutive model of soft tissue: from nanoscale collagen to tissue continuum.Annals of Biomedical Engineering37, 1117–1130 (2009)

2009
[37]

Notions of computation and monads.Information and Computation93, 55–92 (1991)

Moggi, E. Notions of computation and monads.Information and Computation93, 55–92 (1991)

1991
[38]

Tirion, M. M. Large amplitude elastic motions in proteins from a single-parameter, atomic analysis. Physical Review Letters77, 1905–1908 (1996). 29 Self-Revising Discovery Systems for Science

1905
[39]

Bahar, I., Atilgan, A. R. & Erman, B. Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential.Folding and Design2, 173–181 (1997)

1997
[40]

& Erman, B

Haliloglu, T., Bahar, I. & Erman, B. Gaussian dynamics of folded proteins.Physical Review Letters79, 3090–3093 (1997)

1997
[41]

A., Hage, T

Stewart, I. A., Hage, T. P., Hsu, Y.-C. & Buehler, M. J. GraphAgents: Knowledge Graph-Guided Agentic AI for Cross-Domain Materials Design.arXiv preprint arXiv:2602.07491(2026). URLhttps: //arxiv.org/abs/2602.07491.2602.07491

arXiv 2026
[42]

P.et al.Mars: Hierarchical multi-agent reasoning systems enable knowledge-grounded material substitution (2026)

Hage, T. P.et al.Mars: Hierarchical multi-agent reasoning systems enable knowledge-grounded material substitution (2026)

2026
[43]

Bacon, F.Novum Organum(Apud Joannem Billium, London, 1620)
[44]

Parker, London, 1840)

Whewell, W.The Philosophy of the Inductive Sciences, Founded upon Their History(John W. Parker, London, 1840)
[45]

Peirce, C. S. The fixation of belief.Popular Science Monthly12, 1–15 (1877)
[46]

N.Science and the Modern World(The Macmillan Company, New York, 1925)

Whitehead, A. N.Science and the Modern World(The Macmillan Company, New York, 1925)

1925
[47]

Polanyi, M.Personal Knowledge: Towards a Post-Critical Philosophy(University of Chicago Press, Chicago, 1958)

1958
[48]

Hacking, I.Representing and Intervening: Introductory Topics in the Philosophy of Natural Science (Cambridge University Press, Cambridge, 1983)

1983
[49]

W3C Recommendation, World Wide Web Consortium (W3C) (2013).https://www.w3.org/TR/2013/REC-prov-o-20130430/

Lebo, T.et al.PROV-O: The PROV ontology. W3C Recommendation, World Wide Web Consortium (W3C) (2013).https://www.w3.org/TR/2013/REC-prov-o-20130430/

2013
[50]

Bechhofer, S.et al.Why linked data is not enough for scientists.Future Generation Computer Systems 29, 599–611 (2013)

2013
[51]

Davidson, S. B. & Freire, J. Provenance and scientific workflows: challenges and opportunities. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, 1345–1350 (Association for Computing Machinery, 2008)

2008
[52]

Y.et al.Open research knowledge graph: Next generation infrastructure for semantic scholarly knowledge

Jaradeh, M. Y.et al.Open research knowledge graph: Next generation infrastructure for semantic scholarly knowledge. InProceedings of the 10th International Conference on Knowledge Capture (K-CAP ’19), 243–246 (Association for Computing Machinery, 2019)

2019
[53]

Fong, B., Spivak, D. I. & Tuyéras, R. Backprop as functor: A compositional perspective on supervised learning. InProceedings of the 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 1–13 (2019)

2019
[54]


[1] Buehler, M. J. Generative Retrieval-Augmented Ontologic Graph and Multiagent Strategies for Interpretive Large Language Model-Based Materials Design. ACS Engineering Au 4, 241–277 (2024). URL https://doi.org/10.1021/acsengineeringau.3c00058

[2] Ni, B. & Buehler, M. J. MechAgents: Large language model multi-agent collaborations can solve mechanics problems, generate new data, and integrate knowledge. Extreme Mechanics Letters 67, 102131 (2024)

[3] Ghafarollahi, A. & Buehler, M. J. ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning. Digital Discovery 3, 1389–1409 (2024)

[4] Ghafarollahi, A. & Buehler, M. J. Sparks: Multi-agent artificial intelligence model discovers protein design principles. arXiv preprint (2025). arXiv:2504.19017

[5] Ghafarollahi, A. & Buehler, M. J. SciAgents: Automating Scientific Discovery Through Bioinspired Multi-Agent Intelligent Graph Reasoning. Advanced Materials 37, 2413523 (2025). URL https://doi.org/10.1002/adma.202413523

[6] Lu, C. et al. The AI scientist: Towards fully automated open-ended scientific discovery. arXiv preprint (2024). arXiv:2408.06292

[7] Yamada, Y. et al. The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search. arXiv preprint arXiv:2504.08066 (2025). URL https://arxiv.org/abs/2504.08066

[8] Agarwal, D. et al. AutoDiscovery: Open-ended scientific discovery via Bayesian surprise. In Advances in Neural Information Processing Systems (NeurIPS 2025) (2025). URL https://arxiv.org/abs/2507.00310

[9] Wang, F. Y. et al. Autonomous agents coordinating distributed discovery through emergent artifact exchange. arXiv preprint (2026). arXiv:2603.14312

[10] Buehler, M. J. Why We Must Break the World. Integrating Materials and Manufacturing Innovation (in press) (2026)

[11] Buehler, M. J. MeLM, a generative pretrained language modeling framework that solves forward and inverse mechanics problems. Journal of the Mechanics and Physics of Solids 181, 105454 (2023)

[12] Buehler, M. J. PRefLexOR: Preference-Based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking. npj Artificial Intelligence 1 (2025). URL https://doi.org/10.1038/s44387-025-00003-z

[13] Popper, K. R. The Logic of Scientific Discovery (Hutchinson, London, 1959). English translation of Logik der Forschung

[14] Kuhn, T. S. The Structure of Scientific Revolutions (University of Chicago Press, Chicago, 1962)

[15] Lakatos, I. Falsification and the methodology of scientific research programmes. In Lakatos, I. & Musgrave, A. (eds.) Criticism and the Growth of Knowledge, 91–195 (Cambridge University Press, Cambridge, 1970)

[16] Mac Lane, S. Categories for the Working Mathematician (Springer, 1971)

[17] Awodey, S. Category Theory (Oxford University Press, 2010), 2nd edn

[18] Spivak, D. I. Category Theory for the Sciences (MIT Press, 2014)

[19] Fong, B. & Spivak, D. I. An Invitation to Applied Category Theory: Seven Sketches in Compositionality (Cambridge University Press, 2019)

[20] Spivak, D. I. Functorial data migration. Information and Computation 217, 31–51 (2012)

[21] Spivak, D. I. Poly: An Abundant Categorical Setting for Mode-Dependent Dynamics (2020). URL https://arxiv.org/abs/2005.01894

[22] Spivak, D. I. Learners’ languages. arXiv preprint arXiv:2103.01189 (2021)

[23] Spivak, D. I., Giesa, T., Wood, E. & Buehler, M. J. Category theoretic analysis of hierarchical protein materials and social networks. PLoS ONE 6, e23911 (2011)

[24] Giesa, T., Spivak, D. I. & Buehler, M. J. Reoccurring patterns in hierarchical protein materials and music: The power of analogies. BioNanoScience 1, 153–161 (2011)

[25] Giesa, T., Spivak, D. I. & Buehler, M. J. Category theory based solution for the building block replacement problem in materials design. Advanced Engineering Materials 14, 810–817 (2012)

[26] Buehler, M. J. FieldPerceiver: Domain agnostic transformer model to predict multiscale physical fields and nonlinear material properties through neural ologs. Materials Today 57, 9–25 (2022)

[27] Buehler, M. J. From Atoms to Swarms: The Categorical Spine of Multiscale Materials Modeling and Autonomous Discovery (2026). URL https://doi.org/10.26434/chemrxiv.15002850/v1. Preprint, version 1

[28] Goethe, J. W. v. Versuch die Metamorphose der Pflanzen zu erklaeren (Carl Wilhelm Ettinger, Gotha, 1790). English title: The Metamorphosis of Plants

[29] Cranford, S. W. & Buehler, M. J. Materiomics: Biological Protein Materials, from Nano to Macro. Nanotechnology, Science and Applications 3, 127–148 (2010). URL https://doi.org/10.2147/NSA.S9037

[30] Lee, N. A., Shen, S. C. & Buehler, M. J. An Automated Biomateriomics Platform for Sustainable Programmable Materials Discovery. Matter 5, 3597–3613 (2022). URL https://doi.org/10.1016/j.matt.2022.10.003

[31] Fish, J., Wagner, G. J. & Keten, S. Mesoscopic and Multiscale Modelling in Materials. Nature Materials 20, 774–786 (2021). URL https://doi.org/10.1038/s41563-020-00913-0

[32] Buehler, M. J. & Genin, G. M. Integrated multiscale biomaterials experiment and modelling: a perspective. Interface Focus 6, 20150098 (2016). URL http://rsfs.royalsocietypublishing.org/lookup/doi/10.1098/rsfs.2015.0098

[33] Jackson, N. E., Webb, M. A. & de Pablo, J. J. Recent Advances in Machine Learning Towards Multiscale Soft Materials Design. Current Opinion in Chemical Engineering 23, 106–114 (2019). URL https://doi.org/10.1016/j.coche.2019.03.005

[34] Anand, L. & Govindjee, S. Continuum Mechanics of Solids. Oxford Graduate Texts (Oxford University Press, 2020)

[35] Shi, M., Jiao, Q., Yin, T., Vlassak, J. J. & Suo, Z. Hydrolysis Embrittles Poly(lactic Acid). MRS Bulletin 48, 45–55 (2023). URL https://doi.org/10.1557/s43577-022-00368-5

[36] Tang, H., Buehler, M. J. & Moran, B. A constitutive model of soft tissue: from nanoscale collagen to tissue continuum. Annals of Biomedical Engineering 37, 1117–1130 (2009)

[37] Moggi, E. Notions of computation and monads. Information and Computation 93, 55–92 (1991)

[38] Tirion, M. M. Large amplitude elastic motions in proteins from a single-parameter, atomic analysis. Physical Review Letters 77, 1905–1908 (1996)

[39] Bahar, I., Atilgan, A. R. & Erman, B. Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential. Folding and Design 2, 173–181 (1997)

[40] Haliloglu, T., Bahar, I. & Erman, B. Gaussian dynamics of folded proteins. Physical Review Letters 79, 3090–3093 (1997)

[41] Stewart, I. A., Hage, T. P., Hsu, Y.-C. & Buehler, M. J. GraphAgents: Knowledge Graph-Guided Agentic AI for Cross-Domain Materials Design. arXiv preprint arXiv:2602.07491 (2026). URL https://arxiv.org/abs/2602.07491

[42] Hage, T. P. et al. Mars: Hierarchical multi-agent reasoning systems enable knowledge-grounded material substitution (2026)

[43] Bacon, F. Novum Organum (Apud Joannem Billium, London, 1620)

[44] Whewell, W. The Philosophy of the Inductive Sciences, Founded upon Their History (John W. Parker, London, 1840)

[45] Peirce, C. S. The fixation of belief. Popular Science Monthly 12, 1–15 (1877)

[46] Whitehead, A. N. Science and the Modern World (The Macmillan Company, New York, 1925)

[47] Polanyi, M. Personal Knowledge: Towards a Post-Critical Philosophy (University of Chicago Press, Chicago, 1958)

[48] Hacking, I. Representing and Intervening: Introductory Topics in the Philosophy of Natural Science (Cambridge University Press, Cambridge, 1983)

[49] Lebo, T. et al. PROV-O: The PROV ontology. W3C Recommendation, World Wide Web Consortium (W3C) (2013). https://www.w3.org/TR/2013/REC-prov-o-20130430/

[50] Bechhofer, S. et al. Why linked data is not enough for scientists. Future Generation Computer Systems 29, 599–611 (2013)

[51] Davidson, S. B. & Freire, J. Provenance and scientific workflows: challenges and opportunities. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, 1345–1350 (Association for Computing Machinery, 2008)

[52] Jaradeh, M. Y. et al. Open research knowledge graph: Next generation infrastructure for semantic scholarly knowledge. In Proceedings of the 10th International Conference on Knowledge Capture (K-CAP ’19), 243–246 (Association for Computing Machinery, 2019)

[53] Fong, B., Spivak, D. I. & Tuyéras, R. Backprop as functor: A compositional perspective on supervised learning. In Proceedings of the 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 1–13 (2019)

[54] Gavranović, B. et al. Position: Categorical deep learning is an algebraic theory of all architectures. In Proceedings of the 41st International Conference on Machine Learning (ICML), 15209–15241 (2024)

[55] Cruttwell, G. S. H., Gavranović, B., Ghani, N., Wilson, P. W. & Zanasi, F. Deep learning with parametric lenses. arXiv preprint (2024). arXiv:2404.00408

[56] Crescenzi, F. R. Towards a categorical foundation of deep learning: A survey. arXiv preprint (2024). arXiv:2410.05353

[57] Aczel, P. & Mendler, N. A final coalgebra theorem. In Category Theory and Computer Science, 357–365 (1989)

[58] Rutten, J. J. M. M. Universal coalgebra: A theory of systems. Theoretical Computer Science 249, 3–80 (2000)

[59] Rissanen, J. Modeling by shortest data description. Automatica 14, 465–471 (1978)

[60] Grünwald, P. D. The Minimum Description Length Principle (MIT Press, 2007)

[61] Solomonoff, R. J. A formal theory of inductive inference, parts I and II. Information and Control 7, 1–22, 224–254 (1964)

[62] Hutter, M. Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability (Springer, 2005)

[63] Schwarz, G. Estimating the dimension of a model. Annals of Statistics 6, 461–464 (1978)

[64] Stanley, K. O., Lehman, J. & Soros, L. Open-endedness: The last grand challenge you’ve never heard of. O’Reilly Online (2017)

[65] Wang, R., Lehman, J., Clune, J. & Stanley, K. O. Paired open-ended trailblazer (POET): Endlessly generating increasingly complex and diverse learning environments and their solutions. arXiv preprint (2019). arXiv:1901.01753

[66] Gu, G. X., Chen, C.-T. & Buehler, M. J. De novo composite design based on machine learning algorithm. Extreme Mech. Lett. 18, 19–28 (2018)

[67] Karniadakis, G. E. et al. Physics-informed machine learning. Nature Reviews Physics 3, 422–440 (2021)

[68] Lu, L., Jin, P., Pang, G., Zhang, Z. & Karniadakis, G. E. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence 3, 218–229 (2021)

[69] Kevrekidis, I. G. et al. Equation-free, coarse-grained multiscale computation: Enabling microscopic simulators to perform system-level analysis. Communications in Mathematical Sciences 1, 715–762 (2003)

[70] Brunton, S. L., Proctor, J. L. & Kutz, J. N. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences 113, 3932–3937 (2016). URL https://www.pnas.org/doi/abs/10.1073/pnas.1517384113

[71] Cranmer, M. et al. Discovering symbolic models from deep learning with inductive biases (2020). URL https://arxiv.org/abs/2006.11287

[72] Leinster, T. Higher Operads, Higher Categories (Cambridge University Press, 2004)

Supplementary Information
Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence
Fiona Y. Wang1,2 Markus J. Buehler2,3,4* 1Laboratory for Atomistic and Molecular Mechanics, MIT 2D...