Asynchronous LLM function calling

2024 · arXiv 2412.07017

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

IdleSpec: Exploiting Idle Time via Speculative Planning for LLM Agents

cs.AI · 2026-05-21 · conditional · novelty 7.0

IdleSpec improves LLM agent accuracy by generating and aggregating speculative plans during idle time between tool calls and observations using complementary drafting strategies.

HarnessAPI: A Skill-First Framework for Unified Streaming APIs and MCP Tools

cs.AI · 2026-05-21 · unverdicted · novelty 5.0

HarnessAPI derives streaming HTTP endpoints, OpenAPI UI, and MCP tools from a single handler.py plus Pydantic schemas, cutting framework boilerplate by 74%.

Concurrency without Model Changes: Future-based Asynchronous Function Calling for LLMs

cs.CL · 2026-05-14 · unverdicted · novelty 5.0

AsyncFC decouples LLM decoding from function execution via symbolic futures, enabling overlap and parallelism to reduce end-to-end latency on function-calling benchmarks while preserving accuracy.

Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live

cs.OS · 2025-11-04

citing papers explorer

IdleSpec: Exploiting Idle Time via Speculative Planning for LLM Agents · cs.AI · 2026-05-21 · conditional · none · ref 13
HarnessAPI: A Skill-First Framework for Unified Streaming APIs and MCP Tools · cs.AI · 2026-05-21 · unverdicted · none · ref 13
Concurrency without Model Changes: Future-based Asynchronous Function Calling for LLMs · cs.CL · 2026-05-14 · unverdicted · none · ref 6
Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live · cs.OS · 2025-11-04 · unreviewed · ref 24
