Quanbench: Benchmarking quantum code generation with large language models
5 Pith papers cite this work.
Citation summary: years: 2026 (5); roles: background (1); polarities: background (1)
citing papers explorer
- StabilizerBench: A Benchmark for AI-Assisted Quantum Error Correction Circuit Synthesis
  StabilizerBench is a new benchmark for evaluating AI agents on generating, optimizing, and making fault-tolerant stabilizer circuits for quantum error correction, with efficient verification and multi-tier scoring.
- Qiskit Code Migration with LLMs
  A taxonomy-guided RAG system with LLMs reduces hallucinations and improves migration suggestions for Qiskit code compared to unconstrained retrieval.
- Qiskit QuantumKatas: Adapting Microsoft's Quantum Computing exercises for LLM evaluation
  Adapts QuantumKatas to Qiskit yielding a 350-task benchmark across 26 categories and evaluates 16 LLMs in 39,200 runs, reporting performance gaps and prompting effects.
- PennySynth: RAG-Driven Data Synthesis for Automated Quantum Code Generation
  PennySynth raises pass@5 success on QHack quantum coding challenges by 25-28 points over a base LLM by retrieving from a curated PennyLane dataset using code-aware embeddings.
- Can LLMs Solve Science or Just Write Code? Evaluating Quantum Solver Generation
  Iterative refinement boosts LLM success in generating quantum solvers that match classical results, but more advanced models shift from execution errors to hard-to-detect numerical inaccuracies.