Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems
6 Pith papers cite this work.
citing papers explorer
- TIDAL: Recovering Temporal Phase for Cloud Block Storage Placement from LLM-Derived Semantics
  TIDAL recovers temporal phase signals from LLM-derived semantics of provisioning metadata to enable complementary CVD placement, reducing overload frequency by 79.1% on production traces.
- Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading
  Privatar uses horizontal frequency partitioning and distribution-aware minimal perturbation to enable private offloading of VR avatar reconstruction, supporting 2.37x more users with modest overhead.
- Leveraging ASIC AI Chips for Homomorphic Encryption
  The CROSS compiler maps HE workloads to the TPU architecture via basis-aligned and memory-aligned transformations, reporting higher throughput-per-watt than prior GPU and ASIC libraries on NTT and HE operators.
- Keyed Nonlinear Transform: Lightweight Privacy-Enhancing Feature Sharing for Medical Image Analysis
  KNT applies key-conditioned nonlinear obfuscation to split-inference features, cutting re-identification AUC from 0.635 to 0.586 with 0.15 ms overhead and under 1 pp accuracy loss.
- Blink: CPU-Free LLM Inference by Delegating the Serving Stack to GPU and SmartNIC
  Blink enables CPU-free LLM inference via SmartNIC offload and a persistent GPU kernel, delivering up to 8.47x lower P99 TTFT, 3.4x lower P99 TPOT, 2.1x higher decode throughput, and 48.6% lower energy per token while remaining stable under CPU interference.
- Reconsidering "Reconsidering Custom Memory Allocation"
  Modern benchmarks confirm that region-based custom allocators retain locality advantages over state-of-the-art general-purpose allocators, extending the original 2000 conclusions with new applications and fragmentation analysis.