Ion Stoica
Papers (53)
- Uncovering Intra-expert Activation Sparsity for Efficient Mixture-of-Expert Model Execution cs.LG · 2026 · author #4
- Unleashing Scalable Context Parallelism for Foundation Models Pre-Training via FCP cs.DC · 2026 · author #9
- ClawEnvKit: Automatic Environment Generation for Claw-Like Agents cs.AI · 2026 · author #3
- UCCL-Zip: Lossless Compression Supercharged GPU Communication cs.DC · 2026 · author #9
- Foundry: Template-Based CUDA Graph Context Materialization for Fast LLM Serving Cold Start cs.DC · 2026 · author #5
- AI-Driven Research for Databases cs.DB · 2026 · author #8
- Combee: Scaling Prompt Learning for Self-Improving Language Model Agents cs.AI · 2026 · author #13
- M$^2$RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling cs.LG · 2026 · author #3
- Flash-KMeans: Fast and Memory-Efficient Exact K-Means cs.DC · 2026 · author #13
- Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization cs.LG · 2026 · author #14
- Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live cs.OS · 2025 · author #10
- RLBoost: Harvesting Preemptible Resources for Cost-Efficient Reinforcement Learning on LLMs cs.DC · 2025 · author #8
- GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning cs.CL · 2025 · author #14
- Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation cs.CV · 2025 · author #13
- Why Do Multi-Agent LLM Systems Fail? cs.AI · 2025 · author #13
- RouteLLM: Learning to Route LLMs with Preference Data cs.LG · 2024 · author #8
- LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code cs.SE · 2024 · author #10
- Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference cs.AI · 2024 · author #11
- SGLang: Efficient Execution of Structured Language Model Programs cs.AI · 2023 · author #9
- MemGPT: Towards LLMs as Operating Systems cs.AI · 2023 · author #6
- Efficient Memory Management for Large Language Model Serving with PagedAttention cs.LG · 2023 · author #9
- Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena cs.CL · 2023 · author #13
- Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules cs.CV · 2019 · author #3
- Harmonia: Near-Linear Scalability for Replicated Storage with In-Network Conflict Detection cs.DC · 2019 · author #6
- Neural Packet Classification cs.NI · 2019 · author #4
- Cloud Programming Simplified: A Berkeley View on Serverless Computing cs.OS · 2019 · author #13
- The OoO VLIW JIT Compiler for GPU Inference cs.DC · 2019 · author #6
- DistCache: Provable Load Balancing for Large-Scale Storage Systems with Distributed Caching cs.DC · 2019 · author #8
- AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning cs.PL · 2019 · author #5
- Dynamic Space-Time Scheduling for GPU Inference cs.DC · 2018 · author #8
- numpywren: serverless linear algebra cs.DC · 2018 · author #6
- Learning to Optimize Join Queries With Deep Reinforcement Learning cs.DB · 2018 · author #5
- Tune: A Research Platform for Distributed Model Selection and Training cs.LG · 2018 · author #6
- Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning cs.LG · 2018 · author #3
- NetChain: Scale-Free Sub-RTT Coordination (Extended Version) cs.DC · 2018 · author #8
- RLlib: Abstractions for Distributed Reinforcement Learning cs.AI · 2017 · author #9
- Ray: A Distributed Framework for Emerging AI Applications cs.DC · 2017 · author #11
- A Berkeley View of Systems Challenges for AI cs.AI · 2017 · author #1
- DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations cs.RO · 2017 · author #3
- Multi-Level Discovery of Deep Options cs.LG · 2017 · author #3
- Real-Time Machine Learning: The Missing Pieces cs.DC · 2017 · author #10
- Occupy the Cloud: Distributed Computing for the 99% cs.DC · 2017 · author #4
- Clipper: A Low-Latency Online Prediction Serving System cs.DC · 2016 · author #6
- Fast and Accurate Performance Analysis of LTE Radio Access Networks cs.DC · 2016 · author #2
- SparkNet: Training Deep Networks in Spark stat.ML · 2015 · author #3
- Asynchronous Complex Analytics in a Distributed Dataflow Architecture cs.DB · 2015 · author #7
- GraphX: Unifying Data-Parallel and Graph-Parallel Analytics cs.DB · 2014 · author #6
- Coordination Avoidance in Database Systems (Extended Version) cs.DB · 2014 · author #6
- Highly Available Transactions: Virtues and Limitations (Extended Version) cs.DB · 2013 · author #6
- Shark: SQL and Rich Analytics at Scale cs.DB · 2012 · author #6
- Probabilistically Bounded Staleness for Practical Partial Quorums cs.DB · 2012 · author #5
- BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data cs.DB · 2012 · author #5
- Faster and More Accurate Sequence Alignment with SNAP cs.DS · 2011 · author #7
Frequent Coauthors
- Joseph E. Gonzalez 17 shared papers
- Michael I. Jordan 6 shared papers
- Michael J. Franklin 6 shared papers
- Eric Liang 5 shared papers
- Joseph M. Hellerstein 5 shared papers
- Ken Goldberg 5 shared papers
- Matei Zaharia 5 shared papers
- Peter Bailis 5 shared papers
- Philipp Moritz 5 shared papers
- Robert Nishihara 5 shared papers
- Alexey Tumanov 4 shared papers
- Ali Ghodsi 4 shared papers
- Kurt Keutzer 4 shared papers
- Lianmin Zheng 4 shared papers
- Richard Liaw 4 shared papers
- Xin Jin 4 shared papers
- Yilong Zhao 4 shared papers
- Ying Sheng 4 shared papers
- Alvin Cheung 3 shared papers
- Chenfeng Xu 3 shared papers