Toolformer: Language models can teach themselves to use tools
4 Pith papers cite this work.
citing papers explorer
- StepFly: Agentic Troubleshooting Guide Automation for Incident Diagnosis
  StepFly automates TSG execution via TSG Mentor, LLM-based DAG extraction with QPPs, and a DAG-guided parallel scheduler, reaching 94% success on GPT-4.1 with 32.9-70.4% time savings on parallelizable guides.
- CURE-Med: Curriculum-Informed Reinforcement Learning for Multilingual Medical Reasoning
  CURE-MED pairs a new 13-language medical reasoning benchmark with curriculum RL to raise logical correctness to 70% and language consistency to 95% at 32B scale while outperforming baselines.
- Large Language Model Agent for User-friendly Chemical Process Simulations
  An LLM agent integrated with AVEVA Process Simulation via MCP enables natural language driven flowsheet analysis, optimization, and construction for chemical separation processes.
- NaviAgent: Bilevel Planning on Tool Navigation Graph for Large-Scale Orchestration
  NaviAgent decouples task planning from tool execution via a Tool World Navigation Model graph to improve scalability and success rates in LLM agents handling large tool ecosystems.