C"linkage. •task.yaml — metadata including anti-cheat blocked patterns (e.g., #include

Jiace Zhu, Wentao Chen, Qi Fan, Zhixing Ren, Junying Wu, Xing Zhe Chai, Chotiwit Rungrueangwutthinon, Yehan Ma, An Zou · 2026 · arXiv 2603.02236

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

representative citing papers

CUDAHercules: Benchmarking Hardware-Aware Expert-level CUDA Optimization for LLMs

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

CUDAHercules benchmark demonstrates that leading LLMs generate functional CUDA code but fail to recover expert-level optimization strategies needed for peak performance on Ampere, Hopper, and Blackwell GPUs.

CUDABeaver: Benchmarking LLM-Based Automated CUDA Debugging

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

CUDABeaver shows LLM CUDA debuggers often degenerate code for test-passing at the cost of speed, with protocol-aware metrics shifting success rates by up to 40 percentage points.

Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization

cs.CL · 2026-03-30 · unverdicted · novelty 6.0

Kernel-Smith combines evolutionary search with RL post-training to generate optimized GPU kernels, achieving SOTA speedups on KernelBench that beat Gemini-3.0-pro and Claude-4.6-opus on NVIDIA Triton and generalize to MetaX MACA.

citing papers explorer

Showing 3 of 3 citing papers.

CUDAHercules: Benchmarking Hardware-Aware Expert-level CUDA Optimization for LLMs cs.LG · 2026-05-08 · unverdicted · none · ref 30
CUDAHercules benchmark demonstrates that leading LLMs generate functional CUDA code but fail to recover expert-level optimization strategies needed for peak performance on Ampere, Hopper, and Blackwell GPUs.
CUDABeaver: Benchmarking LLM-Based Automated CUDA Debugging cs.LG · 2026-05-08 · unverdicted · none · ref 39
CUDABeaver shows LLM CUDA debuggers often degenerate code for test-passing at the cost of speed, with protocol-aware metrics shifting success rates by up to 40 percentage points.
Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization cs.CL · 2026-03-30 · unverdicted · none · ref 35
Kernel-Smith combines evolutionary search with RL post-training to generate optimized GPU kernels, achieving SOTA speedups on KernelBench that beat Gemini-3.0-pro and Claude-4.6-opus on NVIDIA Triton and generalize to MetaX MACA.

C"linkage. •task.yaml — metadata including anti-cheat blocked patterns (e.g., #include

fields

years

verdicts

representative citing papers

citing papers explorer