{"paper":{"title":"ToolRosella: Translating Code Repositories into Standardized Tools for Scientific Agents","license":"http://creativecommons.org/licenses/by/4.0/","headline":"ToolRosella converts scientific code repositories into standardized, agent-invocable tools with 61.5 percent success after repair.","cross_cats":["cs.CE","cs.MA"],"primary_cat":"cs.SE","authors_text":"Chaoqian Ouyang, Hanghui Guo, Jian Yin, Jia Zhu, Libin Zheng, Ling Yue, Min-Ling Zhang, Shaowu Pan, Shimin Di, Xujie Yuan, Yong Rui, Yongxu Liu, Zhangze Chen","submitted_at":"2026-03-10T07:19:43Z","abstract_excerpt":"Large Language Model (LLM)-based agent systems are increasingly used for scientific tasks, yet their practical capability remains constrained by the narrow scope of manually curated tools they can invoke. Much scientific computational functionality already exists in open-source code repositories, but these resources remain difficult to standardize, operationalize, and invoke reliably for agent use. Here we present ToolRosella, a framework that automatically transforms heterogeneous scientific code repositories into standardized, agent-invocable tools. ToolRosella combines repository analysis, "},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"ToolRosella reaches a 61.5% repository conversion success rate after iterative repair, with a 4.4 speedup over human engineers. The resulting 1,580 callable tools support a downstream task success rate of 84.0% and improve performance when integrated into other agent frameworks.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That automatic repository analysis, interface construction, execution testing, and iterative repair can reliably standardize heterogeneous scientific code without substantial loss of functionality or introduction of new errors across diverse domains.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"ToolRosella converts 122 scientific code repositories into 1,580 standardized tools at 61.5% success rate with 4.4x human speedup and 84% downstream agent task success.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"ToolRosella converts scientific code repositories into standardized, agent-invocable tools with 61.5 percent success after repair.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"1de5af3f2c240e61e12b207947c0dc0fcc4c3f063411cda43d506d0025440bf8"},"source":{"id":"2603.09290","kind":"arxiv","version":4},"verdict":{"id":"ad1be6da-c61b-4420-bb72-335e8d5c6b59","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T13:52:15.104433Z","strongest_claim":"ToolRosella reaches a 61.5% repository conversion success rate after iterative repair, with a 4.4 speedup over human engineers. The resulting 1,580 callable tools support a downstream task success rate of 84.0% and improve performance when integrated into other agent frameworks.","one_line_summary":"ToolRosella converts 122 scientific code repositories into 1,580 standardized tools at 61.5% success rate with 4.4x human speedup and 84% downstream agent task success.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That automatic repository analysis, interface construction, execution testing, and iterative repair can reliably standardize heterogeneous scientific code without substantial loss of functionality or introduction of new errors across diverse domains.","pith_extraction_headline":"ToolRosella converts scientific code repositories into standardized, agent-invocable tools with 61.5 percent success after repair."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2603.09290/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}