This improves performance by reducing loop overhead and improving data locality, since each iteration processes a larger chunk of data.
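As a rough illustration of the idea, the sketch below processes a buffer in tiles and counts loop iterations for two tile sizes. It is plain Python rather than accelerator kernel code, and the names `scale_tiled`, `tile`, and `factor` are hypothetical, not taken from the cited work.

```python
def scale_tiled(data, tile, factor=2.0):
    """Scale `data` by `factor`, handling `tile` elements per loop iteration.

    Hypothetical helper for illustration only.
    """
    out = [0.0] * len(data)
    iterations = 0
    # The loop advances by `tile`, so load/store offsets move in tile-sized steps.
    for start in range(0, len(data), tile):
        stop = min(start + tile, len(data))  # clamp the final, possibly partial tile
        out[start:stop] = [v * factor for v in data[start:stop]]
        iterations += 1
    return out, iterations


data = [float(i) for i in range(1024)]
small_out, small_iters = scale_tiled(data, tile=128)
large_out, large_iters = scale_tiled(data, tile=512)
```

Both calls produce the same result, but the larger tile needs fewer iterations. On real hardware the achievable tile size is bounded by on-chip memory, which this sketch does not model.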

Stride calculations in load/store operations are adjusted accordingly. Original code: ‘‘‘python v6 = nl

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

AccelOpt: A Self-Improving LLM Agentic System for AI Accelerator Kernel Optimization

cs.LG · 2025-11-19 · conditional · novelty 6.0

A self-improving LLM agent with optimization memory raises average kernel throughput from 45-49% to 59-61% of peak on Trainium accelerators and matches proprietary models at 26x lower cost.
