DeepFlow: A cross-stack pathfinding framework for distributed AI systems
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Evaluating Cross-Architecture Performance Modeling of Distributed ML Workloads Using StableHLO
  StableHLO serves as a viable unified representation for cross-architecture performance modeling of distributed ML workloads, preserving relative trends while exposing fidelity trade-offs.
- Technology solutions targeting the performance of gen-AI inference in resource constrained platforms
  A roofline-based model is used to assess bandwidth and latency needs for High Bandwidth Storage in 13B-parameter models with long contexts and the utility of bonded memory chiplets for 1B-parameter models to ease capacity and bandwidth constraints in on-device gen-AI inference.