{"total":25,"items":[{"citing_arxiv_id":"2605.28655","ref_index":4,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation","primary_cat":"cs.AI","submitted_at":"2026-05-27T15:56:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Decentralized AI agent teams self-organize around hypotheses, critique proposals, and share knowledge to outperform single-agent baselines on biomedical ML, language-model optimization, and protein fitness tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.27571","ref_index":19,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Discovery Agents for Real-Time Analytics: Toward Proactive Insight Systems","primary_cat":"cs.AI","submitted_at":"2026-05-26T18:43:25+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Describes a multi-agent system for proactive insight discovery in real-time streaming data via LLMs and streaming platforms.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.23204","ref_index":89,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery","primary_cat":"cs.AI","submitted_at":"2026-05-22T03:40:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A survey organizing AI-powered research automation into five workflow stages, defining AutoResearch and Vibe Research, and proposing five evaluation dimensions while noting domain-conditioned limits on autonomy.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"alongside early work onAutoresearcher[58]. In 2025,Idea2Paper[85],Agent Laboratory[22],AlphaEvolve[86], DeepScientist[87],CodeScientist[52], andOmniScientist[51] further expanded this pipeline view through paper generation, coding, experiment management, multi-agent ecosystems, or longer-horizon research production.AI Scientist-v2[10],AI-Researcher[45],InternAgent[88], andKosmos[89] strengthened this frontier through agentic search, experiment management, persistent research state, and tighter coupling between literature, hypothesis gen- eration, data analysis, and scientific reporting. By 2026, open infrastructures such asNanoResearch[25],Research- Claw[90],ScienceClaw[91],AutoResearchClaw[92],ARIS[24], andEvoScientist[93] further shifted the field from"},{"citing_arxiv_id":"2605.22681","ref_index":42,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Forecasting Scientific Progress with Artificial Intelligence","primary_cat":"cs.AI","submitted_at":"2026-05-21T16:23:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Introduces the CUSP benchmark across 4760 events and finds frontier AI models can pick plausible directions but fail to predict whether or when scientific advances will occur, with performance varying by domain and insensitive to training cutoffs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.22343","ref_index":25,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Sibyl-AutoResearch: Autonomous Research Needs Self-Evolving Trial-and-Error Harnesses, Not Paper Generators","primary_cat":"cs.MA","submitted_at":"2026-05-21T11:29:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Sibyl-AutoResearch introduces self-evolving trial-and-error harnesses with auditable conversion units that link trial signals to updated research behaviors and harness repairs in autonomous systems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.21825","ref_index":23,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Toward AI VIS Co-Scientists: A General and End-to-End Agent Harness for Solving Complex Data Visualization Tasks","primary_cat":"cs.AI","submitted_at":"2026-05-20T23:49:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A multi-agent harness autonomously generates functional single-page VIS apps with linked views for scientific data tasks using coordinated skills for analysis, planning, implementation, and evaluation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18661","ref_index":133,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"AI for Auto-Research: Roadmap & User Guide","primary_cat":"cs.AI","submitted_at":"2026-05-18T17:08:26+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The paper delivers a stage-by-stage roadmap for AI in research, showing reliable assistance in retrieval and tool tasks but fragility in novelty and judgment, advocating human-governed collaboration.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Sequential systems connect research stages in a mostly linear order, typically moving from idea generation to experiment execution and manuscript drafting. The AI Scientist [122] established this paradigm by demonstrating that hypothesis generation, code execution, experimental analysis, and paper writing can be as- sembled into a single automated workflow. Agent Laboratory [171], AI-Researcher [199], CycleResearcher [220], Kosmos [133], Dolphin [244], CodeScientist [77], and InternAgent [248] instantiate related pipeline designs with different choices of base models, task scopes, and evaluation targets. The advantage of sequential architectures is operational simplicity. Each stage produces an artifact that becomes the input to the next stage, making the workflow interpretable and relatively easy to implement."},{"citing_arxiv_id":"2605.18831","ref_index":23,"ref_count":2,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Towards Discovery of Polymers for Insulin Delivery via Physics-Grounded Agentic Workflows","primary_cat":"q-bio.QM","submitted_at":"2026-05-12T22:16:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"An LLM-orchestrated physics simulation search identifies polymers with strong insulin interactions, outperforming standard optimization methods by significant margins.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.11258","ref_index":29,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Unlocking LLM Creativity in Science through Analogical Reasoning","primary_cat":"cs.AI","submitted_at":"2026-05-11T21:35:44+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Analogical reasoning increases LLM solution diversity by 90-173% and novelty rate to over 50%, delivering up to 13-fold gains on biomedical tasks including perturbation prediction and cell communication.","context_count":1,"top_context_role":"other","top_context_polarity":"unclear","context_text":"Journal of Memory and Language, 145:104676, December 2025. ISSN 0749-596X. doi: 10.1016/j.jml.2025.104676. URL http://dx.doi.org/10.1016/j.jml. 2025.104676. [28] Minh Nhat Nguyen, Andrew Baker, Clement Neo, Allen Roush, Andreas Kirsch, and Ravid Shwartz-Ziv. Turning up the heat: Min-p sampling for creative and coherent llm outputs, 2025. URLhttps://arxiv.org/abs/2407.01082. [29] Phanish Puranam, Prothit Sen, and Maciej Workiewicz. Can llms help improve analogical reasoning for strategic decisions? experimental evidence from humans and gpt-4, 2025. URL https://arxiv.org/abs/2505.00603. [30] Chengwei Qin, Wenhan Xia, Tan Wang, Fangkai Jiao, Yuchen Hu, Bosheng Ding, Ruirui Chen, and Shafiq Joty. Relevant or random: Can llms truly perform analogical reasoning?"},{"citing_arxiv_id":"2605.07022","ref_index":47,"ref_count":6,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Self-Driving Datasets: From 20 Million Papers to Nuanced Biomedical Knowledge at Scale","primary_cat":"cs.LG","submitted_at":"2026-05-07T23:08:18+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":2,"top_context_role":"background","top_context_polarity":"background","context_text":"URL https://doi.org/10.1038/ s41597-024-03793-0. [47] Motasem S Obeidat, Md Sultan Al Nahian, and Ramakanth Kavuluru. Do LLMs Surpass Encoders for Biomedical NER?Proceedings. IEEE International Conference on Healthcare Informatics, 2025:352-358, June 2025. ISSN 2575-2626. doi: 10.1109/ICHI64645.2025.00048. URLhttps://pmc.ncbi.nlm.nih.gov/articles/PMC12335919/. [48] Guillaume Ollitrault, Marco Marzo, Alessandra Roncaglioni, Emilio Benfenati, Olivier Taboureau, and Enrico Mombelli. Qsar models for predicting oral bioavailability and vol- ume of distribution and their application in mapping the tk space of endocrine disrup- tors.Journal of Xenobiotics, 15(5):166, 2025. doi: 10.3390/jox15050166. URL https: //doi."},{"citing_arxiv_id":"2605.06651","ref_index":46,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"AI co-mathematician: Accelerating mathematicians with agentic AI","primary_cat":"cs.AI","submitted_at":"2026-05-07T17:56:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"An interactive AI workbench for mathematicians achieves 48% on FrontierMath Tier 4 and helped solve open problems in early tests.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.05921","ref_index":26,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Intentmaking and Sensemaking: Human Interaction with AI-Guided Mathematical Discovery","primary_cat":"cs.AI","submitted_at":"2026-05-07T09:30:25+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Expert mathematicians using an AI coding agent for discovery engage in repeated cycles of intentmaking to define goals and sensemaking to interpret outputs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.05851","ref_index":5,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Hypothesis generation and updating in large language models","primary_cat":"cs.LG","submitted_at":"2026-05-07T08:24:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LLMs exhibit Bayesian-like hypothesis updating with strong-sampling bias and an evaluation-generation gap but generalize poorly outside observed data.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.01489","ref_index":26,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"SciResearcher: Scaling Deep Research Agents for Frontier Scientific Reasoning","primary_cat":"cs.AI","submitted_at":"2026-05-02T15:26:45+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Roberts, Sladjana Zagorac, Timothy C. Orr, Miranda E. Orr, Kevin J. Zwezdaryk, Ali E. Ghareeb, Laurie McCoy, Bruna Gomes, Euan A. Ashley, Karen E. Duff, Tonio Buonassisi, Tom Rainforth, Randall J. Bateman, Michael Skarlinski, Samuel G. Rodriques, Michaela M. Hinks, and Andrew D. White. Kosmos: An ai scientist for autonomous discovery, 2025. URL https: //arxiv.org/abs/2511.02824. [26] Nayantara Mudur, Hao Cui, Subhashini Venugopalan, Paul Raccuglia, Michael P. Brenner, and Peter Norgaard. Feabench: Evaluating language models on multiphysics reasoning ability, 2025. URL https: //arxiv.org/abs/2504.06260. [27] OpenAI. Introducing deep research. https://openai.com/index/introducing-deep-research/, February 2025. [28] Perplexity AI."},{"citing_arxiv_id":"2604.25610","ref_index":26,"ref_count":2,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Optimizing ground state preparation protocols with autoresearch","primary_cat":"quant-ph","submitted_at":"2026-04-28T13:18:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AI coding agents evolve simple ground-state protocols into improved versions for VQE, DMRG, and AFQMC on spin models and molecules by using executable energy scores under fixed compute budgets.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"arXiv:2412.07978, 2024.doi:10.48550/arXiv.2412.07978. [25] Ludovico Mitchener, Angela Yiu, Benjamin Chang, Mathieu Bourdenx, Tyler Nadolski, Arvis Sulovari, Eric C Landsness, Daniel L Barabasi, Siddharth Narayanan, Nicky Evans, et al. Kosmos: An ai scientist for autonomous discovery.arXiv preprint arXiv:2511.02824, 2025.doi:10.48550/arXiv.2511.02824. [26] Sören Arlt, Xuemei Gu, and Mario Krenn. Towards autonomous quantum physics research using llm agents with access to intelligent tools.arXiv preprint arXiv:2511.11752, 2025.doi:10.48550/arXiv.2511.11752. [27] Mario Krenn, Yehonathan Drori, and Rana X Adhikari. Digital discovery of interferometric gravitational wave detectors.Physical Review X, 15(2):021012, 2025."},{"citing_arxiv_id":"2604.18805","ref_index":20,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"AI scientists produce results without reasoning scientifically","primary_cat":"cs.AI","submitted_at":"2026-04-20T20:23:42+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LLM agents execute scientific tasks but fail to follow core scientific reasoning norms such as evidence consideration and belief revision based on refutations.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"arately for ReAct and tool-calling agents. Rows correspond to interventions with traces from successful baseline runs (top two) and traces from failed baseline runs (bottom two), with columns representing each environment. Unlike Pass@k, Pass∧k requires allk trials to succeed. 28 spectra / level_1 / trial 20 Evidence non-uptake E J informs × Obs. obtain_isomers_from_molecu- lar_formula: [ ... Agent Let me recalculate the DBE for C18H14O2: DBE = (2×18 + 2 - 14)/2 = (36 + 2 - 14)/2 = 24/2 = 12 A DBE of 12 is very hi... simulate_spectra(...) spectra / level_1 / trial 22 Untested claim H T tests × Agent Great! The proton NMR gives me valuable information: - 2.23 ppm (s, 3H) - singlet with 3 protons, likely a CH3 group ... mass_spectrometry_spectra(...) Obs. mass_spectrometry_spectra: m/z 224.12011513525002 (intensity 100), 225.1235025219052 (intensity 17.52727899573804)... md / level_2 / trial 33 Fixed belief trace H1 Hn updates_to × Agent The '/potentials/' directory con- tains a file named 'SW', which likely corresponds to the Stillinger-W eber potential. ... write_file(...) ... Agent The error \"Lost atoms: original 8 current 0\" indicates that during the simulation, atoms have moved out- side the simul... write_file(...) spectra / level_2 / trial 45 Contradiction without repair J H contradicts no updates_to Agent The simulated spectra for CC(C)OC(=O)c1ccccc1N(C)C shows: - H-NMR: 1.46 ppm (d, 6H) - This is for isopropyl CH3 group... get_formula_from_smiles(...) Agent The molecular formula for CC(C)OC(=O)c1ccccc1N(C)C is C12H17NO2. Looking at the carbon NMR, I count 11 distinct signa... Extended Data Fig. 4: Autonomous agents repeatedly fail to correct flawed rea- soning: they ignore accessible evidence, leave hypotheses untested, cling to false beliefs, and accept contradictions.Each panel lists model, environment, scope, and trial index. Evidence non-uptake (top left): Agent retrieves 20 isomers (including the correct one) but never consults the list, guessi"},{"citing_arxiv_id":"2604.15411","ref_index":5,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"PRL-Bench: A Comprehensive Benchmark Evaluating LLMs' Capabilities in Frontier Physics Research","primary_cat":"cs.LG","submitted_at":"2026-04-16T16:22:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"PRL-Bench evaluates frontier LLMs on 100 real physics research tasks and finds the best models score below 50, exposing a gap to autonomous discovery.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16294","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Are Researchers Being Replaced by Artificial Intelligence?","primary_cat":"cs.CY","submitted_at":"2026-04-14T19:07:38+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"AI is shifting researchers from creators to curators of generated content, risking loss of intellectual ownership and genuine understanding of science.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Data pollution introduces technical failure modes that are easily missed by humans. End-to-end pipelines can produce polished outputs while masking invalid evaluation caused by benchmark selection, data leakage, metric misuse, and post-hoc selection bias. However, the stronger replacement claim depends on the capacity fornovelty and skill-acquisition[ 2], specifically whether AI can reliably propose hypotheses beyond mere recombination, and subject them to valid evaluation. A longer-term technical risk is synthetic self-reinforcement. Increasing reliance on AI-generated manuscripts, code, and summaries as training material may narrow the diversity of future model outputs. In the extreme, recursive training on"},{"citing_arxiv_id":"2603.09970","ref_index":3,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"CREATE: Testing LLMs for Associative Creativity","primary_cat":"cs.CL","submitted_at":"2026-03-10T17:58:44+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"CREATE is a benchmark that scores LLMs on their ability to produce many specific and diverse associative paths between concepts drawn from parametric knowledge.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.09590","ref_index":3,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"DeepReviewer 2.0: A Traceable Agentic System for Auditable Scientific Peer Review","primary_cat":"cs.AI","submitted_at":"2026-03-03T09:02:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"An agentic system produces traceable review packages and an un-finetuned 196B model using it covers more major issues than Gemini-3.1-Pro on 134 ICLR 2025 submissions while winning most blind comparisons to human committees.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2603.03295","ref_index":12,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Language Model Goal Selection Differs from Humans' in a Self-Directed Learning Task","primary_cat":"cs.CL","submitted_at":"2026-02-06T15:39:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LLMs diverge from human goal selection in self-directed learning by exploiting single solutions with low variability across instances.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2602.04850","ref_index":37,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"El Agente Quntur: A research collaborator agent for quantum chemistry","primary_cat":"physics.chem-ph","submitted_at":"2026-02-04T18:38:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"El Agente Quntur is a new multi-agent system that uses reasoning over literature and software documentation to autonomously handle the full workflow of quantum chemistry experiments in ORCA.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"org/abs/2501.06590 (2025). [35] Yang, Z.et al.MOOSE-Chem: Large language models for rediscovering unseen chemistry scientific hypotheses. Preprint at https://arxiv.org/abs/2410.07076 (2025). [36] Yamada, Y.et al.The AI scientist-v2: Workshop-level automated scientific discovery via agentic tree search. Preprint at https://arxiv.org/abs/2504.08066 (2025). [37] Mitchener, L.et al.Kosmos: An AI scientist for autonomous discovery Preprint at https://arxiv.org/abs/2511.02824 (2025). [38] Liu, W.et al.MOOSE-Chem3: Toward experiment-guided hypothesis ranking via simulated experimental feedback. Preprint at https://arxiv.org/abs/2505.17873 (2025). [39] Li, Z.et al.ChemHAS: Hierarchical agent stacking for enhancing chemistry tools."},{"citing_arxiv_id":"2512.15930","ref_index":45,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Scalable Agentic Reasoning for Designing Biologics Targeting Intrinsically Disordered Proteins","primary_cat":"q-bio.QM","submitted_at":"2025-12-17T19:55:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"StructBioReasoner is a scalable multi-agent system that designs IDP-targeting biologics, with over 50% of 787 candidates for Der f 21 showing better binding free energy than human-designed references.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2512.15567","ref_index":33,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Evaluating Large Language Models in Scientific Discovery","primary_cat":"cs.AI","submitted_at":"2025-12-17T16:20:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"The SDE benchmark shows LLMs lag on scientific discovery tasks relative to general science tests, with diminishing scaling returns and shared weaknesses across models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2512.01089","ref_index":16,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"CodeDistiller: Automatically Generating Code Libraries for Scientific Coding Agents","primary_cat":"cs.AI","submitted_at":"2025-11-30T21:19:10+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CodeDistiller distills 250 materials-science GitHub repositories into vetted code libraries that improve the accuracy and scientific soundness of experiments generated by ASD agents.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}