The authors derive the first bit-accurate arithmetic models for matrix multiply-accumulate operations on ten GPU architectures spanning NVIDIA Volta to Blackwell and AMD CDNA1 to CDNA3.
6 Pith papers cite this work.
Citing papers:
- Bit-Accurate Modeling of GPU Matrix Multiply-Accumulate Units: Demystifying Numerical Discrepancy and Accuracy
  The authors derive the first bit-accurate arithmetic models for matrix multiply-accumulate operations on ten GPU architectures spanning NVIDIA Volta to Blackwell and AMD CDNA1 to CDNA3.
- Unsupervised Learning of Local Updates for Maximum Independent Set in Dynamic Graphs
  Unsupervised GNN model learns local updates for approximate MaxIS on dynamic graphs, achieving competitive ratios on 200-1000 node instances and 1.00-1.18x larger solutions than other unsupervised models when generalizing to 100x larger graphs.
- IntentTester: Intent-Driven Multi-agent Framework for Cross-Library Test Migration
  IntentTester migrates tests across libraries using TDL abstraction and multi-agent LLM synthesis, achieving 85% correctness and 74% effectiveness versus 51% and 43% for baselines on nine projects in JSON, HTML, and Time domains.
- EnergAIzer: Fast and Accurate GPU Power Estimation Framework for AI Workloads
  EnergAIzer predicts module-level GPU utilization from structured kernel patterns and feeds it into a power model to estimate dynamic power with 8% error on Ampere GPUs and 7% on H100 forecasts.
- Security and Privacy in Virtual and Robotic Assistive Systems: A Comparative Framework
  A unified comparative threat-modeling framework is developed to analyze security and privacy risks across virtual and robotic assistive systems.
- Ten-Four: An Open-Source Fused Dot Product Unit for Mixed-Precision GPGPU Tensor Cores