In Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1 (Rotterdam, Netherlands) (ASPLOS ’25)

Siddharth Jayashankar, Edward Chen, Tom Tang, Wenting Zheng, Dimitrios Skarlatos · 2025 · arXiv 9940.370726

14 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Latency Prediction for LLM Inference on NPU Systems

cs.DC · 2026-06-16 · unverdicted · novelty 7.0

LENS predicts NPU LLM inference latency with 2.15% mean error by profiling each bucket with two E2E measurements and composing results to capture bucketing non-linearity.

HexAGenT: Efficient Agentic LLM Serving via Workflow- and Heterogeneity-Aware Scheduling

cs.DC · 2026-05-15 · unverdicted · novelty 7.0

HexAGenT reduces the SLO scale required for timely agentic LLM workflow completion by an average of 20.1% at 95% attainment and 33.0% at 99% attainment on heterogeneous A100/H100/H200 clusters.

ReaLB: Real-Time Load Balancing for Multimodal MoE Inference

cs.DC · 2026-04-21 · unverdicted · novelty 7.0

ReaLB balances multimodal MoE inference loads by switching vision-heavy experts to lower FP4 precision per device rank, hiding the change in the dispatch phase to deliver 1.10-1.32x speedup with <1% accuracy degradation.

HybridGen: Efficient LLM Generative Inference via CPU-GPU Hybrid Computing

cs.PF · 2026-04-20 · unverdicted · novelty 7.0

HybridGen achieves 1.41x-3.2x average speedups over six prior KV cache methods for LLM inference by using attention logit parallelism, a feedback-driven scheduler, and semantic-aware KV cache mapping.

Optimism in Equality Saturation

cs.PL · 2025-11-25 · unverdicted · novelty 7.0

A new abstract interpretation algorithm enables sound optimistic analysis of e-graphs during equality saturation, unifying it with non-destructive rewriting and improving precision on cyclic SSA programs.

WHET: Welding Homomorphic Encryption to Accelerator Architectures

cs.CR · 2026-06-10 · unverdicted · novelty 6.0

WHET applies fine-grained coefficient-to-slot transforms, plaintext compression, and modulus raising plus lightweight hardware tweaks to FHE accelerators, delivering 1.38-8.74x per-area gains and sub-millisecond CKKS bootstrapping.

ACALSim: A Scalable Parallel Simulation Framework for High-Performance System Design Space Exploration

cs.AR · 2026-05-21 · unverdicted · novelty 6.0

ACALSim is a new simulation framework with customizable threading, event-driven execution, and shared-memory model that reports over 14x speedup versus SST and enables simulation of large LLaMA models that SST cannot complete.

NasZip: Software and Hardware Co-Design to Accelerate Approximate Nearest Neighbor Search with DIMM-Based Near-Data Processing

cs.AR · 2026-05-21 · conditional · novelty 6.0

NasZip delivers up to 8.4x speedup over CPU baselines and 1.69x over prior NDP accelerators for ANNS by combining near-data processing with statistics-based PCA early exiting, dynamic-float encoding, and data-aware neighbor mapping.

Rewrite System Showdown: Stochastic Search vs. EqSat

cs.PL · 2026-05-18 · unverdicted · novelty 6.0 · 2 refs

Empirical comparison of equality saturation versus stochastic search on five benchmarks to evaluate if e-graphs are superior for rewrite-based optimization.

ELMoE-3D: Leveraging Intrinsic Elasticity of MoE for Hybrid-Bonding-Enabled Self-Speculative Decoding in On-Premises Serving

cs.LG · 2026-04-16 · unverdicted · novelty 6.0

ELMoE-3D achieves 6.6x average speedup and 4.4x energy efficiency gain for MoE serving on 3D hardware by scaling expert and bit elasticity for elastic self-speculative decoding.

AEGIS: Scaling Long-Sequence Homomorphic Encrypted Transformer Inference via Hybrid Parallelism on Multi-GPU Systems

cs.CR · 2026-04-03 · unverdicted · novelty 6.0

AEGIS reduces inter-GPU communication by up to 81.3% in self-attention and reaches 96.62% scaling efficiency with 3.86x speedup on four GPUs for 2048-token encrypted Transformer inference.

CCCL: Node-Spanning GPU Collectives with CXL Memory Pooling

cs.DC · 2026-02-25 · unverdicted · novelty 6.0

CCCL delivers 1.34-1.94x faster cross-node GPU collectives via CXL memory pooling than 200 Gbps InfiniBand RDMA, with 1.11x LLM training speedup and 2.75x hardware cost reduction.

EnergAIzer: Fast and Accurate GPU Power Estimation Framework for AI Workloads

cs.AR · 2026-04-22 · unverdicted · novelty 5.0

EnergAIzer predicts module-level GPU utilization from structured kernel patterns and feeds it into a power model to estimate dynamic power with 8% error on Ampere GPUs and 7% on H100 forecasts.

Aquas: Enhancing Domain Specialization through Holistic Hardware-Software Co-Optimization based on MLIR

cs.AR · 2025-11-27 · unverdicted · novelty 5.0

Aquas delivers a holistic hardware-software co-optimization framework on MLIR that models memory interfaces with cache effects and uses an e-graph retargetable compiler, achieving up to 15.61x speedup with 14.5% area overhead across four domains.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Optimism in Equality Saturation

cs.PL · 2025-11-25 · unverdicted · none · ref 4

A new abstract interpretation algorithm enables sound optimistic analysis of e-graphs during equality saturation, unifying it with non-destructive rewriting and improving precision on cyclic SSA programs.

Rewrite System Showdown: Stochastic Search vs. EqSat

cs.PL · 2026-05-18 · unverdicted · none · ref 4 · 2 links

Empirical comparison of equality saturation versus stochastic search on five benchmarks to evaluate if e-graphs are superior for rewrite-based optimization.
