Domain-specialized small language models enable deterministic atomic-resolution scanning probe microscopy control with 99.3% command accuracy, lower computational cost, and better domain performance than larger general models.
Towards understanding systems trade-offs in retrieval-augmented generation model inference.arXiv preprint arXiv:2412.11854, 2024
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4representative citing papers
uGen is the first retrieval-augmented multi-agent LLM framework for generating functionally correct microarchitectural attack PoCs, reporting up to 100% success on Spectre-v1 and 80% on Prime+Probe at low cost.
CacheClip accelerates RAG prefill by up to 3.33x via auxiliary-model-guided selective KV recomputation while retaining 85-91% of full-attention quality on NIAH and LongBench.
GaiaFlow combines semantic-guided diffusion tuning with early-exit and quantization methods to lower carbon emissions in neural information retrieval while maintaining competitive effectiveness.
citing papers explorer
-
Integrating Domain-Specialized Language Models with AI Measurement Tools for Deterministic Atomic-Resolution Experimentation
Domain-specialized small language models enable deterministic atomic-resolution scanning probe microscopy control with 99.3% command accuracy, lower computational cost, and better domain performance than larger general models.
-
uGen: An Agentic Framework for Generating Microarchitectural Attack PoCs
uGen is the first retrieval-augmented multi-agent LLM framework for generating functionally correct microarchitectural attack PoCs, reporting up to 100% success on Spectre-v1 and 80% on Prime+Probe at low cost.
-
CacheClip: Accelerating RAG with Effective KV Cache Reuse
CacheClip accelerates RAG prefill by up to 3.33x via auxiliary-model-guided selective KV recomputation while retaining 85-91% of full-attention quality on NIAH and LongBench.
-
GaiaFlow: Semantic-Guided Diffusion Tuning for Carbon-Frugal Search
GaiaFlow combines semantic-guided diffusion tuning with early-exit and quantization methods to lower carbon emissions in neural information retrieval while maintaining competitive effectiveness.