Clarify Before You Draw: Proactive Agents for Robust Text-to-CAD Generation

Bin Hu; Bo Yuan; Petr Molodyk; Yongxin Chen; Zelin Zhao

arxiv: 2602.03045 · v2 · pith:TDPHGDNHnew · submitted 2026-02-03 · 💻 cs.LG

keywords: agent, proactive, clarification, models, specification, agentic, ambiguous, before

Large language models have recently enabled text-to-CAD systems that synthesize parametric CAD programs (e.g., CadQuery) from natural-language prompts. In practice, however, geometric descriptions can be under-specified or internally inconsistent: critical dimensions may be missing and constraints may conflict. Existing fine-tuned models tend to follow user instructions reactively and hallucinate dimensions when the text is ambiguous. To address this, we propose ProCAD, a proactive agentic framework for text-to-CadQuery generation that resolves specification issues before code synthesis. The framework pairs a proactive clarifying agent, which audits the prompt and asks targeted clarification questions only when necessary to produce a self-consistent specification, with a CAD coding agent that translates the specification into an executable CadQuery program. We fine-tune the coding agent on a curated, high-quality text-to-CadQuery dataset and train the clarifying agent via agentic SFT on clarification trajectories. Experiments show that proactive clarification significantly improves robustness to ambiguous prompts while keeping interaction overhead low. ProCAD outperforms frontier closed-source models, including Claude Sonnet 4.5, reducing the mean Chamfer distance by 79.9% and lowering the invalidity ratio from 4.8% to 0.9%. Our code and datasets are publicly available at https://github.com/BoYuanVisionary/Pro-CAD.
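The clarify-then-code control flow described in the abstract can be illustrated with a minimal sketch. This is not the ProCAD implementation: the learned clarifying and coding agents are replaced here by simple rule-based stand-ins, and the names `REQUIRED_DIMS`, `audit_spec`, `clarify`, and `generate_cadquery` are hypothetical. The sketch assumes a single box-shaped part whose specification is a dictionary of dimensions; the only real API referenced is CadQuery's `Workplane("XY").box(length, width, height)`, which appears in the emitted program text and is not executed.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical required fields for a simple box part (an assumption, not from the paper).
REQUIRED_DIMS = ("length", "width", "height")


def audit_spec(spec: Dict[str, Optional[float]]) -> List[str]:
    """Return the required dimensions that are missing or non-positive."""
    return [k for k in REQUIRED_DIMS if spec.get(k) is None or spec[k] <= 0]


def clarify(
    spec: Dict[str, Optional[float]], ask: Callable[[str], float]
) -> Tuple[Dict[str, Optional[float]], int]:
    """Ask one targeted question per problematic dimension.

    Returns the resolved spec and the number of questions asked; a complete
    spec triggers no questions at all.
    """
    resolved = dict(spec)
    questions = 0
    for name in audit_spec(spec):
        resolved[name] = ask(f"What is the {name} of the box (mm)?")
        questions += 1
    return resolved, questions


def generate_cadquery(spec: Dict[str, Optional[float]]) -> str:
    """Stand-in for the coding agent: emit a CadQuery program as text."""
    missing = audit_spec(spec)
    if missing:
        # Refuse to guess dimensions; the spec must be clarified first.
        raise ValueError(f"unresolved dimensions: {missing}")
    length, width, height = (spec[k] for k in REQUIRED_DIMS)
    return (
        "import cadquery as cq\n"
        f'result = cq.Workplane("XY").box({length}, {width}, {height})\n'
    )


if __name__ == "__main__":
    # The height is missing, so exactly one question is asked.
    answers = iter([5.0])
    spec, n_questions = clarify(
        {"length": 20.0, "width": 10.0, "height": None},
        lambda question: next(answers),
    )
    print(n_questions)
    print(generate_cadquery(spec))
```

The design point the sketch tries to capture is that code generation refuses to run on an unresolved specification rather than inventing a value, and that the number of questions scales with the number of detected issues rather than being fixed.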

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

IterCAD: An Iterative Multimodal Agent for Visually-Grounded CAD Generation and Editing
cs.AI · 2026-06 · unverdicted · novelty 7.0

IterCAD introduces a closed-loop multimodal agent for CAD generation and editing, trained via progressive SFT and geometry-aware RL with viable-prefix masking, and evaluated on IterCAD-Bench using a new CD-TR curve an...

P3D-Bench: Benchmarking MLLMs for Parametric 3D Generation and Structural Reasoning
cs.CV · 2026-06 · unverdicted · novelty 7.0

P3D-Bench is a benchmark with three task families that scores MLLMs on generating executable parametric 3D programs, finding failures in precise geometry and part assembly.