Asynchronous LLM function calling
4 Pith papers cite this work.
citing papers explorer
- IdleSpec: Exploiting Idle Time via Speculative Planning for LLM Agents
  IdleSpec improves LLM agent accuracy by generating and aggregating speculative plans during idle time between tool calls and observations using complementary drafting strategies.
- HarnessAPI: A Skill-First Framework for Unified Streaming APIs and MCP Tools
  HarnessAPI derives streaming HTTP endpoints, OpenAPI UI, and MCP tools from a single handler.py plus Pydantic schemas, cutting framework boilerplate by 74%.
- Concurrency without Model Changes: Future-based Asynchronous Function Calling for LLMs
  AsyncFC decouples LLM decoding from function execution via symbolic futures, enabling overlap and parallelism to reduce end-to-end latency on function-calling benchmarks while preserving accuracy.
- Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live