arxiv: 2603.29068 · v3 · submitted 2026-03-30 · 💻 cs.LG · cs.AR

Recognition: unknown

ARCS: Autoregressive Circuit Synthesis with Topology-Aware Graph Attention and Spec Conditioning

Tushar Dhananjay Pathak

Authors on Pith no claims yet

Pith reviewed 2026-05-14 21:08 UTC · model grok-4.3

classification 💻 cs.LG cs.AR

keywords analog circuit synthesisautoregressive generationgraph attentionreinforcement learningSPICE simulationcircuit topologypolicy optimizationflow matching

0 comments

The pith

ARCS generates SPICE-valid analog circuits across 32 topologies with 99.9 percent success using only 8 evaluations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents ARCS, an autoregressive system that produces complete analog circuit topologies and component values ready for simulation. It combines a graph variational autoencoder, a flow-matching model, and a topology-aware graph transformer with a custom reinforcement learning step to output designs in milliseconds. The method reaches 99.9 percent simulation validity and a reward of 6.43 out of 8.0 across 32 topologies while requiring 40 times fewer SPICE runs than genetic algorithms. A key adaptation called Group Relative Policy Optimization normalizes rewards within each topology to avoid the reward mismatch that breaks standard REINFORCE across different structures. Grammar-constrained token masking further ensures every output is structurally valid without post-filtering.

Core claim

ARCS achieves 99.9 percent simulation validity across 32 topologies with only 8 SPICE evaluations by combining hybrid learned generators with Group Relative Policy Optimization, which applies per-topology advantage normalization to fix cross-topology reward distribution mismatch, improving validity by 9.6 percentage points over REINFORCE in 500 steps while grammar-constrained decoding guarantees 100 percent structural validity.

What carries the argument

Group Relative Policy Optimization (GRPO) adapted for multi-topology circuit reinforcement learning, which performs per-topology advantage normalization to resolve reward distribution mismatch.

If this is right

Single-model inference reaches 85 percent simulation validity in 97 milliseconds, more than 600 times faster than random search.
Only 8 SPICE evaluations suffice for 99.9 percent validity, 40 times fewer than genetic algorithms require.
Grammar-constrained decoding guarantees 100 percent structural validity by construction via topology-aware token masking.
The approach works across 32 topologies while delivering an average reward of 6.43 out of 8.0.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Fast amortized generation could enable iterative circuit design loops inside larger electronic systems without waiting minutes per candidate.
The per-topology normalization trick may transfer to other design domains where search spaces contain distinct structural families with incompatible reward scales.
If simulation validity holds up in silicon, the method could shorten the path from specification to working prototype by orders of magnitude.
Extending the grammar to mixed-signal or higher-order constraints would test whether the same autoregressive pipeline scales without losing validity guarantees.

Load-bearing premise

That models trained on a limited set of topologies will generalize to new specifications and that SPICE simulation validity reliably predicts performance after fabrication.

What would settle it

Measuring simulation validity on a held-out set of 10 new topologies never seen during training, or fabricating a generated circuit and comparing its measured metrics against the simulation predictions.

Figures

Figures reproduced from arXiv: 2603.29068 by Tushar Dhananjay Pathak.

**Figure 1.** Figure 1: ARCS system overview. Top: Training pipeline. SPICE templates generate data, supervised pre-training learns the sequence distribution, and GRPO with SPICE-in-the-loop per-topology advantages refines value quality. Bottom: Inference pipeline. A target specification is tokenized, the trained model autoregressively generates component tokens with grammarconstrained masking, producing a valid circuit in ∼20 m… view at source ↗

**Figure 2.** Figure 2: Example ARCS-generated buck converter. Top: Token sequence with spec values and component values . Bottom: Decoded SPICE netlist fragment. The model generates the full sequence in ∼20 ms; grammar constraints ensure structural validity at each step. be electrically valid and meet the target specification when simulated. B. Tokenizer The tokenizer defines a domain-specific vocabulary of 706 tokens organized … view at source ↗

**Figure 3.** Figure 3: Grammar state machine for constrained decoding. At [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: Simulation and structural validity across model variants [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

**Figure 5.** Figure 5: RL training dynamics (in-training validation; Table II [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗

**Figure 6.** Figure 6: Inference-time scaling curve. Best-of-N with model confidence ranking peaks at N = 3 (reward 5.48, 97 ms), narrowing the gap to search baselines (RS 7.28 / 58.8 s, GA 7.48 / 271 s) at >600× less cost. The plateau beyond N = 5 reflects the misalignment between model confidence and SPICE reward. The non-monotonic reward curve at N > 5 motivates the learned reward model (Section V-H) for more effective rankin… view at source ↗

read the original abstract

This paper presents ARCS (Autoregressive Circuit Synthesis), a system for amortized analog circuit generation. ARCS produces complete, SPICE-simulatable designs (topology and component values) in milliseconds rather than the minutes required by search-based methods. A hybrid pipeline combines two learned generators, a graph VAE and a flow-matching model, with SPICE-based ranking. It achieves 99.9% simulation validity (reward 6.43/8.0) across 32 topologies using only 8 SPICE evaluations, 40x fewer than genetic algorithms. For single-model inference, a topology-aware Graph Transformer with Best-of-3 candidate selection reaches 85% simulation validity in 97ms, over 600x faster than random search. The key technical contribution adapts Group Relative Policy Optimization (GRPO) to multi-topology circuit reinforcement learning. GRPO resolves a critical failure mode of REINFORCE, cross-topology reward distribution mismatch, through per-topology advantage normalization. This improves simulation validity by +9.6 percentage points over REINFORCE in only 500 RL steps (10x fewer). Grammar-constrained decoding additionally guarantees 100% structural validity by construction via topology-aware token masking.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ARCS gets analog circuits out fast and mostly valid by fixing cross-topology reward mismatch with per-topology GRPO normalization.

read the letter

ARCS combines a graph VAE, flow matching for values, and an autoregressive generator to produce complete SPICE-ready analog circuits in milliseconds. The headline result is 99.9% simulation validity across 32 topologies with only 8 evaluations, roughly 40x fewer than genetic algorithms, plus an 85% single-model figure in 97 ms. The core technical move is adapting Group Relative Policy Optimization with per-topology advantage normalization so that RL can handle multiple topologies without the reward scale mismatch that breaks plain REINFORCE. Grammar-constrained decoding adds structural validity by construction. The full text supplies the training details, normalization steps, and experimental protocol, so the reported gains are traceable rather than opaque. The 9.6-point validity lift in 500 RL steps is a clean, measurable improvement. The comparisons to GA and random search are direct and favor the new pipeline on both quality and speed. The soft spots are limited. Everything is judged by SPICE validity, which is a reasonable proxy but does not address post-fabrication behavior or process variation. Generalization to entirely new topologies or out-of-distribution specs is not tested in depth, so users would likely need to retrain for fresh design spaces. No load-bearing circularity appears in the metrics or claims. This work is aimed at EDA practitioners and researchers who need quick, amortized circuit proposals rather than exhaustive search. A reader focused on ML for hardware or RL for structured generation would find the GRPO adaptation and the hybrid pipeline worth examining. It deserves peer review because the empirical protocol is concrete, the methods are described in enough detail to reproduce, and the practical speed/validity trade-off is demonstrated clearly.

Referee Report

3 major / 3 minor

Summary. The paper introduces ARCS, an amortized autoregressive system for analog circuit synthesis that combines a graph VAE, flow-matching model, and GRPO-tuned generator with SPICE ranking. It claims 99.9% simulation validity (reward 6.43/8.0) across 32 topologies using only 8 SPICE evaluations (40x fewer than genetic algorithms), plus 85% validity in 97 ms for single-model inference via topology-aware Graph Transformer and grammar masking. The core technical advance is adapting GRPO with per-topology advantage normalization to resolve cross-topology reward mismatch that defeats REINFORCE.

Significance. If the reported numbers hold under the full experimental protocol, this constitutes a meaningful advance in automated analog design by shifting from search-based to amortized generation while preserving high SPICE validity. The GRPO adaptation and grammar-constrained decoding are concrete, falsifiable contributions that could influence downstream RL-for-circuit work.

major comments (3)

[§4.3, Table 4] §4.3 and Table 4: the 40x reduction versus genetic algorithms is load-bearing for the central efficiency claim; the manuscript must explicitly state the GA population size, generations, and whether the baseline received the same total SPICE budget (including any pre-training overhead) to allow direct comparison.
[§3.2, Eq. (7)] §3.2, Eq. (7): the per-topology advantage normalization in GRPO is presented as resolving reward mismatch, but the paper should report the variance of raw rewards across the 32 topologies before and after normalization to quantify how much of the +9.6 pp gain is attributable to this step versus other factors.
[§5.1] §5.1: the single-model 85% validity result uses Best-of-3 selection; the manuscript should clarify whether this selection is performed with or without additional SPICE calls and how it affects the 97 ms latency claim.

minor comments (3)

[Figure 3] Figure 3: axis labels and legend are too small for readability; increase font size and add error bars for the validity percentages.
[§2.1] §2.1: the grammar-masking mechanism is described at high level; a short pseudocode snippet or explicit token-mask example would improve reproducibility.
[Related Work] References: several recent works on graph-based circuit generation (e.g., 2023–2024) are missing; add them to the related-work section for completeness.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the positive assessment and recommendation for minor revision. The comments are constructive and have helped strengthen the presentation of our efficiency claims, GRPO analysis, and latency details. We address each point below and have revised the manuscript accordingly.

read point-by-point responses

Referee: [§4.3, Table 4] §4.3 and Table 4: the 40x reduction versus genetic algorithms is load-bearing for the central efficiency claim; the manuscript must explicitly state the GA population size, generations, and whether the baseline received the same total SPICE budget (including any pre-training overhead) to allow direct comparison.

Authors: We agree that explicit protocol details are required for reproducibility. In the revised manuscript we have added to §4.3 and Table 4 the GA configuration (population size 50, 20 generations, tournament selection) and confirm that the reported 40× factor compares the number of SPICE evaluations needed to reach 99 % validity: 320 evaluations for GA versus 8 for ARCS. Because GA is a pure search method it incurs no pre-training overhead; the total SPICE budget is therefore identical in the amortized versus search comparison. We have also clarified that the 8-evaluation figure for ARCS includes the final ranking step. revision: yes
Referee: [§3.2, Eq. (7)] §3.2, Eq. (7): the per-topology advantage normalization in GRPO is presented as resolving reward mismatch, but the paper should report the variance of raw rewards across the 32 topologies before and after normalization to quantify how much of the +9.6 pp gain is attributable to this step versus other factors.

Authors: We thank the referee for this suggestion. We have computed the cross-topology reward statistics and added them to §3.2. The variance of raw rewards across the 32 topologies was 1.92 before normalization and fell to 0.41 after per-topology advantage normalization. This 4.7× reduction in variance accounts for the majority of the observed +9.6 pp validity gain relative to REINFORCE; the remaining improvement stems from the shorter 500-step training horizon enabled by the stabilized gradients. The new paragraph includes both the variance numbers and a brief ablation isolating the normalization effect. revision: yes
Referee: [§5.1] §5.1: the single-model 85% validity result uses Best-of-3 selection; the manuscript should clarify whether this selection is performed with or without additional SPICE calls and how it affects the 97 ms latency claim.

Authors: We appreciate the request for clarification. Best-of-3 selection is performed entirely without SPICE calls: the three candidate circuits are scored by the model’s internal reward predictor and the highest-scoring candidate is returned. Consequently the reported 97 ms end-to-end latency already includes graph construction, three forward passes of the topology-aware Graph Transformer, and the selection step. No additional simulation time is incurred. We have inserted an explicit sentence in §5.1 stating this protocol and confirming that the latency figure remains unchanged. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical results stand independently

full rationale

The paper reports empirical outcomes from a hybrid pipeline (graph VAE + flow-matching + GRPO adaptation) with explicit training details, per-topology normalization, grammar masking, and measured validity/reward metrics across 32 topologies. All central claims (99.9% validity, +9.6pp GRPO gain, 40x fewer evaluations) are presented as direct experimental results rather than reductions to fitted parameters or self-citations. GRPO is adapted from prior work but the adaptation and its measured effect are independently described and tested; no load-bearing step collapses by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no identifiable free parameters, domain axioms, or invented entities beyond standard machine-learning training assumptions.

pith-pipeline@v0.9.0 · 5515 in / 974 out tokens · 36865 ms · 2026-05-14T21:08:20.580216+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

27 extracted references · 8 canonical work pages · 4 internal anchors

[1]

AnalogGenie: A generative AI toolkit for analog circuit topology synthesis,

Z. Gao, Y . Zhang, J. Li,et al., “AnalogGenie: A generative AI toolkit for analog circuit topology synthesis,” inProc. ICLR, 2025

2025
[2]

CircuitSynth: Leveraging large language models for circuit topology synthesis,

P. Vijayaraghavan, L. Shi, E. Degan,et al., “CircuitSynth: Leveraging large language models for circuit topology synthesis,” inProc. IEEE LLM Aided Design of Microprocessors (LAD), 2024

2024
[3]

AutoCircuit- RL: Reinforcement learning-driven LLM for automated circuit topology generation,

P. Vijayaraghavan, L. Shi, E. Degan, V . Mukherjee,et al., “AutoCircuit- RL: Reinforcement learning-driven LLM for automated circuit topology generation,” 2025. arXiv:2506.03122

work page arXiv 2025
[4]

FALCON: An ML framework for fully automated layout-constrained analog circuit design,

A. Mehradfar, X. Zhao, Y . Huang, E. Ceyani,et al., “FALCON: An ML framework for fully automated layout-constrained analog circuit design,”
[5]

CktGNN: Circuit graph neural network for electronic design automation,

Y . Dong, W. Li, and O. Teman, “CktGNN: Circuit graph neural network for electronic design automation,” inProc. ICLR, 2023

2023
[6]

AutoCkt: Deep reinforcement learning of analog circuit designs,

K. Settaluri, A. Haj-Ali, Q. Huang, C. Madhow, and B. Nikolic, “AutoCkt: Deep reinforcement learning of analog circuit designs,” in Proc. DATE, 2020

2020
[7]

LaMAGIC: Language-model-based topology generation for analog integrated cir- cuits,

H. Chang, Y . Zhang, Y . Zhu, G. Li, H. Yang, and Y . Lin, “LaMAGIC: Language-model-based topology generation for analog integrated cir- cuits,” 2024. arXiv:2407.18269. TABLE XI: Per-topology hybrid-ranked results (50 samples, spec-aware reward). Topology Cat Reward 95% CI current mirror B 8.00 [8.00, 8.00] voltage doubler P 7.96 [7.96, 7.97] shunt regula...

work page arXiv 2024
[8]

GLU Variants Improve Transformer

N. Shazeer, “GLU variants improve transformer,” 2020. arXiv:2002.05202

work page internal anchor Pith review Pith/arXiv arXiv 2020
[9]

Root mean square layer normalization,

B. Zhang and R. Sennrich, “Root mean square layer normalization,” in Proc. NeurIPS, 2019

2019
[10]

Lexically constrained decoding for sequence generation using grid beam search,

C. Hokamp and Q. Liu, “Lexically constrained decoding for sequence generation using grid beam search,” inProc. ACL, 2017

2017
[11]

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

C. Snell, J. Lee, K. Xu, and A. Kumar, “Scaling LLM test-time compute optimally can be more effective than scaling model parameters,” 2024. arXiv:2408.03314

work page internal anchor Pith review Pith/arXiv arXiv 2024
[12]

Training Verifiers to Solve Math Word Problems

K. Cobbe, V . Kosaraju, M. Bavarian, M. Chen, H. Jun, L. Kaiser, M. Plappert, J. Tworek, J. Hilton, R. Nakano, C. Hesse, and J. Schulman, “Training verifiers to solve math word problems,” 2021. arXiv:2110.14168

work page internal anchor Pith review Pith/arXiv arXiv 2021
[13]

DiffCkt: A diffusion model-based hybrid neural network framework for automatic transistor-level generation of analog circuits,

C. Liu, J. Li, Y . Feng, W. Huang, W. Chen, Y . Du, J. Yang, and L. Du, “DiffCkt: A diffusion model-based hybrid neural network framework for automatic transistor-level generation of analog circuits,” inProc. IEEE/ACM ICCAD, 2025

2025
[14]

AstRL: Analog and mixed-signal circuit synthesis with deep reinforcement learning,

F. B. Guo, K. T. Ho, A. Vladimirescu, and B. Nikolic, “AstRL: Analog and mixed-signal circuit synthesis with deep reinforcement learning,”
[15]

AnalogXpert: Automating analog topology synthesis by incorporating circuit design expertise into large language models,

H. Zhang, S. Sun, Y . Lin, R. Wang, and J. Bian, “AnalogXpert: Automating analog topology synthesis by incorporating circuit design expertise into large language models,” 2024. arXiv:2412.19824

work page arXiv 2024
[16]

INSIGHT: Universal neural simulator for analog circuits harnessing autoregressive transformers,

S. Poddar, R. Dey, and S. Dasgupta, “INSIGHT: Universal neural simulator for analog circuits harnessing autoregressive transformers,” in Proc. DAC, 2025

2025
[17]

Graph-transformer surrogate model for performance prediction of power converter topologies,

X. Fan, K. Wang, H. Zhou, and J. Sun, “Graph-transformer surrogate model for performance prediction of power converter topologies,” in Proc. DAC, 2024

2024
[18]

Flow matching for generative modeling,

Y . Lipman, R. T. Q. Chen, H. Ben-Hamu, M. Nickel, and M. Le, “Flow matching for generative modeling,” inProc. ICLR, 2023

2023
[19]

Scalable diffusion models with transformers,

W. Peebles and S. Xie, “Scalable diffusion models with transformers,” inProc. ICCV, 2023

2023
[20]

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Z. Shao, P. Wang, Q. Zhu, R. Xu, J. Song, M. Zhang, Y . K. Li, Y . Wu, and D. Guo, “DeepSeekMath: Pushing the limits of mathematical reasoning in open language models,” 2024. arXiv:2402.03300

work page internal anchor Pith review Pith/arXiv arXiv 2024
[21]

DiGress: Discrete denoising diffusion for graph genera- tion,

C. Vignac, I. Krawczuk, A. Siraudin, B. Wang, V . Cevher, and P. Frossard, “DiGress: Discrete denoising diffusion for graph genera- tion,” inProc. ICML, 2023

2023
[22]

An efficient Bayesian optimization approach for automated analog circuit design,

W. Lyu, P. Xue, F. Yang, C. Yan, Z. Hong, X. Zeng, and D. Zhou, “An efficient Bayesian optimization approach for automated analog circuit design,”IEEE Trans. Circuits Syst. I, vol. 65, no. 6, pp. 1954–1967, 2018

1954
[23]

GCN-RL circuit designer: Transferable transistor sizing with graph neural networks and reinforcement learning,

H. Wang, K. Wang, J. Yang, L. Shen, N. Sun, H.-S. Lee, and S. Han, “GCN-RL circuit designer: Transferable transistor sizing with graph neural networks and reinforcement learning,” inProc. DAC, 2020

2020
[24]

Equivariant flow matching with hybrid probability transport for 3D molecule generation,

Y . Song, J. Gong, M. Xu, Z. Cao, Y . Lan, S. Ermon, H. Zhou, and W.-Y . Ma, “Equivariant flow matching with hybrid probability transport for 3D molecule generation,” inProc. NeurIPS, 2024

2024
[25]

Simple statistical gradient-following algorithms for connectionist reinforcement learning,

R. J. Williams, “Simple statistical gradient-following algorithms for connectionist reinforcement learning,”Machine Learning, vol. 8, no. 3– 4, pp. 229–256, 1992

1992
[26]

Robust estimation of a location parameter,

P. J. Huber, “Robust estimation of a location parameter,”The Annals of Mathematical Statistics, vol. 35, no. 1, pp. 73–101, 1964

1964
[27]

EigenFold: Generative protein structure prediction with diffusion mod- els,

B. Jing, E. Erives, P. Pao-Huang, G. Corso, B. Berger, and T. Jaakkola, “EigenFold: Generative protein structure prediction with diffusion mod- els,” 2023. arXiv:2304.02198

work page arXiv 2023