Barbarians at the gate: How AI is upending systems research

Audrey Cheng, Shu Liu, Melissa Pan, Zhifei Li, Bowen Wang, Alex Krentsel, Tian Xia, Mert Cemri, Jongseok Park, Shuo Yang, Jeff Chen, Lakshya Agrawal, Aditya Desai, Jiarong Xing, Koushik Sen, Matei Zaharia, Ion Stoica · 2025 · arXiv 2510.06189

Canonical reference. 75% of classified citations from Pith papers cite this work as background.

13 Pith papers citing it

Background 75% of classified citations

citation-role summary

background 7 · baseline 1

citation-polarity summary

background 6 · baseline 1 · support 1

representative citing papers

What Do Evolutionary Coding Agents Evolve?

cs.NE · 2026-05-19 · unverdicted · novelty 7.0

Evolutionary coding agents achieve most benchmark gains through a small subset of edit types and by cycling previously deleted code lines rather than developing new algorithmic structures.

SemaTune: Semantic-Aware Online OS Tuning with Large Language Models

cs.OS · 2026-05-14 · unverdicted · novelty 7.0

SemaTune uses LLM guidance with semantic context to tune up to 41 Linux OS parameters, delivering 72.5% performance gains over defaults and 153.3% over non-LLM baselines on 13 workloads while avoiding degraded states.

CCL-Bench 1.0: A Trace-Based Benchmark for LLM Infrastructure

cs.DC · 2026-05-07 · unverdicted · novelty 7.0

CCL-Bench packages traces and metadata to compute detailed compute, memory, and communication efficiency metrics, surfacing performance insights unavailable from end-to-end benchmarks.

Autopoiesis: A Self-Evolving System Paradigm for LLM Serving Under Runtime Dynamics

cs.DC · 2026-04-08 · unverdicted · novelty 7.0

Autopoiesis uses LLM-driven program synthesis to evolve serving policies online during deployment, delivering up to 53% and average 34% gains over prior LLM serving systems under runtime dynamics.

DWDP: Distributed Weight Data Parallelism for High-Performance LLM Inference on NVL72

cs.DC · 2026-04-02 · unverdicted · novelty 7.0

DWDP distributes MoE weights across GPUs for independent execution without collective synchronization, improving output TPS/GPU by 8.8% on GB200 NVL72 for DeepSeek-R1 under 8K input and 1K output lengths.

Both Ends Count! Just How Good are LLM Agents at "Text-to-Big SQL"?

cs.DB · 2026-02-25 · unverdicted · novelty 7.0

New Text-to-Big SQL metrics show that LLM agents must balance accuracy with cost and speed at scale, where GPT-4o trades some accuracy for up to 12x speedup and GPT-5.2 proves more cost-effective than Gemini 3 Pro on large inputs.

optimize_anything: A Universal API for Optimizing any Text Parameter

cs.CL · 2026-05-19 · unverdicted · novelty 6.0

A universal LLM optimizer for text artifacts achieves SOTA results on six tasks including tripling ARC-AGI accuracy and cutting cloud costs by 40% via cross-task transfer and side information.

Agentic Architect: An Agentic AI Framework for Architecture Design Exploration and Optimization

cs.AI · 2026-04-28 · accept · novelty 6.0

An LLM-driven agentic system evolves microarchitectural policies for cache replacement, data prefetching, and branch prediction, producing designs that match or exceed prior state-of-the-art in IPC on standard benchmarks.

Record-Remix-Replay: Hierarchical GPU Kernel Optimization using Evolutionary Search

cs.DC · 2026-04-13 · unverdicted · novelty 6.0

R^3 optimizes full scientific applications on GPUs better than tuning kernel parameters or compiler flags alone while running nearly an order of magnitude faster than modern evolutionary search methods.

AI-Driven Research for Databases

cs.DB · 2026-04-08 · unverdicted · novelty 6.0

Co-evolving LLM-generated solutions with their evaluators enables discovery of novel database algorithms that outperform state-of-the-art baselines, including a query rewrite policy with up to 6.8x lower latency.

Computer Architecture's AlphaZero Moment: Automated Discovery in an Encircled World

cs.AR · 2026-03-31 · conditional · novelty 6.0

Automated architectural discovery engines can outperform human design teams by exploring massive design spaces and compressing development cycles from months to weeks.

PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents

cs.LG · 2026-05-07 · unverdicted · novelty 5.0

PACEvolve++ uses a phase-adaptive reinforcement learning advisor to decouple hypothesis selection from execution in LLM-driven evolutionary search, delivering faster convergence than prior frameworks on load balancing, recommendation, and protein tasks.

Declarative Data Services: Structured Agentic Discovery for Composing Data Systems

cs.AI · 2026-05-20

citing papers explorer

Showing 13 of 13 citing papers.

What Do Evolutionary Coding Agents Evolve? · cs.NE · 2026-05-19 · unverdicted · none · ref 36
SemaTune: Semantic-Aware Online OS Tuning with Large Language Models · cs.OS · 2026-05-14 · unverdicted · none · ref 18
CCL-Bench 1.0: A Trace-Based Benchmark for LLM Infrastructure · cs.DC · 2026-05-07 · unverdicted · none · ref 12
Autopoiesis: A Self-Evolving System Paradigm for LLM Serving Under Runtime Dynamics · cs.DC · 2026-04-08 · unverdicted · none · ref 25
DWDP: Distributed Weight Data Parallelism for High-Performance LLM Inference on NVL72 · cs.DC · 2026-04-02 · unverdicted · none · ref 4
Both Ends Count! Just How Good are LLM Agents at "Text-to-Big SQL"? · cs.DB · 2026-02-25 · unverdicted · none · ref 11
optimize_anything: A Universal API for Optimizing any Text Parameter · cs.CL · 2026-05-19 · unverdicted · none · ref 7
Agentic Architect: An Agentic AI Framework for Architecture Design Exploration and Optimization · cs.AI · 2026-04-28 · accept · none · ref 12
Record-Remix-Replay: Hierarchical GPU Kernel Optimization using Evolutionary Search · cs.DC · 2026-04-13 · unverdicted · none · ref 13
AI-Driven Research for Databases · cs.DB · 2026-04-08 · unverdicted · none · ref 12
Computer Architecture's AlphaZero Moment: Automated Discovery in an Encircled World · cs.AR · 2026-03-31 · conditional · none · ref 9
PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents · cs.LG · 2026-05-07 · unverdicted · none · ref 8
Declarative Data Services: Structured Agentic Discovery for Composing Data Systems · cs.AI · 2026-05-20 · unreviewed · ref 56
