CCL-Bench packages traces and metadata to compute detailed compute, memory, and communication efficiency metrics, surfacing performance insights unavailable from end-to-end benchmarks.
LLMCompass: Enabling efficient hardware design for large language model inference
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
ELMoE-3D achieves 6.6x average speedup and 4.4x energy efficiency gain for MoE serving on 3D hardware by scaling expert and bit elasticity for elastic self-speculative decoding.
citing papers explorer
-
CCL-Bench 1.0: A Trace-Based Benchmark for LLM Infrastructure
CCL-Bench packages traces and metadata to compute detailed compute, memory, and communication efficiency metrics, surfacing performance insights unavailable from end-to-end benchmarks.
-
ELMoE-3D: Leveraging Intrinsic Elasticity of MoE for Hybrid-Bonding-Enabled Self-Speculative Decoding in On-Premises Serving
ELMoE-3D achieves 6.6x average speedup and 4.4x energy efficiency gain for MoE serving on 3D hardware by scaling expert and bit elasticity for elastic self-speculative decoding.