Dooly reduces LLM inference profiling costs by 56.4% via configuration-agnostic taint-based labeling and selective database reuse, delivering simulation accuracy within 5% MAPE for TTFT and 8% for TPOT across 12 models.
Gulavani, Alexey Tumanov, and Ramachandran Ramjee
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4verdicts
UNVERDICTED 4representative citing papers
Agentic Witnessing enables privacy-preserving auditing of semantic properties in private data by running an LLM auditor in a TEE that answers binary queries and produces cryptographic transcripts of its reasoning.
Hive is a multi-agent infrastructure with a logits cache for reducing cross-path redundancy in sampling and agent-aware scheduling for better compute and KV-cache allocation, shown to deliver 1.11x-1.76x speedups and 33%-51% lower hotspot miss rates.
Agentic AI systems should be designed as marginal token allocators that balance benefit against cost, latency, and risk across their layers rather than as unit-priced text generators.
citing papers explorer
-
Dooly: Configuration-Agnostic, Redundancy-Aware Profiling for LLM Inference Simulation
Dooly reduces LLM inference profiling costs by 56.4% via configuration-agnostic taint-based labeling and selective database reuse, delivering simulation accuracy within 5% MAPE for TTFT and 8% for TPOT across 12 models.
-
Agentic Witnessing: Pragmatic and Scalable TEE-Enabled Privacy-Preserving Auditing
Agentic Witnessing enables privacy-preserving auditing of semantic properties in private data by running an LLM auditor in a TEE that answers binary queries and produces cryptographic transcripts of its reasoning.
-
Hive: A Multi-Agent Infrastructure for Algorithm- and Task-Level Scaling
Hive is a multi-agent infrastructure with a logits cache for reducing cross-path redundancy in sampling and agent-aware scheduling for better compute and KV-cache allocation, shown to deliver 1.11x-1.76x speedups and 33%-51% lower hotspot miss rates.
-
Agentic AI Systems Should Be Designed as Marginal Token Allocators
Agentic AI systems should be designed as marginal token allocators that balance benefit against cost, latency, and risk across their layers rather than as unit-priced text generators.