SPoC: Search-based Pseudocode to Code

Alex Aiken; Kartik Chandra; Mina Lee; Oded Padon; Panupong Pasupat; Percy Liang; Sumith Kulal

arxiv: 1906.04908 · v1 · pith:WPL7IZL7new · submitted 2019-06-12 · 💻 cs.LG · cs.CL · cs.PL · stat.ML

Sumith Kulal, Panupong Pasupat, Kartik Chandra, Mina Lee, Oded Padon, Alex Aiken, Percy Liang

classification 💻 cs.LG · cs.CL · cs.PL · stat.ML

keywords pseudocode · program · programs · search · assignment · cases · code · credit

We consider the task of mapping pseudocode to long programs that are functionally correct. Given test cases as a mechanism to validate programs, we search over the space of possible translations of the pseudocode to find a program that passes the validation. However, without proper credit assignment to localize the sources of program failures, it is difficult to guide search toward more promising programs. We propose to perform credit assignment based on signals from compilation errors, which constitute 88.7% of program failures. Concretely, we treat the translation of each pseudocode line as a discrete portion of the program, and whenever a synthesized program fails to compile, an error localization method tries to identify the portion of the program responsible for the failure. We then focus search over alternative translations of the pseudocode for those portions. For evaluation, we collected the SPoC dataset (Search-based Pseudocode to Code) containing 18,356 programs with human-authored pseudocode and test cases. Under a budget of 100 program compilations, performing search improves the synthesis success rate over using the top-one translation of the pseudocode from 25.6% to 44.7%.
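
The search procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `search_translations`, the callback `check`, and the toy data are all hypothetical names introduced here. The sketch assumes each pseudocode line comes with a list of scored candidate translations, that `check` stands in for compiling and testing and reports the index of the line it blames (or `None`), and that a blamed candidate is simply excluded from later programs. Programs are enumerated best-first by total score, and only programs actually passed to `check` count against the budget.

```python
import heapq
from typing import Callable, List, Optional, Sequence, Set, Tuple

Candidate = Tuple[str, float]  # (code line, log-probability); assumed input format
CheckResult = Tuple[bool, Optional[int]]  # (passed, index of blamed line or None)


def search_translations(
    candidates: Sequence[Sequence[Candidate]],
    check: Callable[[List[str]], CheckResult],
    budget: int,
) -> Tuple[Optional[List[str]], int]:
    """Best-first search over per-line candidates with simple credit assignment.

    Returns (program or None, number of checks spent).
    """
    # Sort each line's candidates from most to least likely.
    cands = [sorted(line, key=lambda c: -c[1]) for line in candidates]
    n = len(cands)

    def score(choice: Tuple[int, ...]) -> float:
        return sum(cands[i][c][1] for i, c in enumerate(choice))

    start = tuple([0] * n)
    heap = [(-score(start), start)]      # min-heap on negated score
    seen: Set[Tuple[int, ...]] = {start}
    banned: Set[Tuple[int, int]] = set()  # (line index, candidate index) blamed so far
    used = 0

    while heap and used < budget:
        _, choice = heapq.heappop(heap)

        # Push successors: move one line to its next-best candidate.
        for i in range(n):
            if choice[i] + 1 < len(cands[i]):
                nxt = choice[:i] + (choice[i] + 1,) + choice[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-score(nxt), nxt))

        # Skip programs containing a blamed candidate without spending budget.
        if any((i, c) in banned for i, c in enumerate(choice)):
            continue

        program = [cands[i][c][0] for i, c in enumerate(choice)]
        used += 1
        ok, bad_line = check(program)
        if ok:
            return program, used
        if bad_line is not None:
            banned.add((bad_line, choice[bad_line]))

    return None, used


# Toy usage: a fake "compiler" that accepts only two specific lines.
VALID = {"int x = 0;", "cout << x;"}


def toy_check(program: List[str]) -> CheckResult:
    for i, line in enumerate(program):
        if line not in VALID:
            return False, i  # blame the first invalid line
    return True, None


toy_candidates = [
    [("x = 0;", -0.1), ("int x = 0;", -0.5)],
    [("print x;", -0.2), ("cout << x;", -0.9)],
]
result = search_translations(toy_candidates, toy_check, budget=100)
print(result)
```

In the toy run, excluding blamed candidates lets the search skip one of the four possible programs without checking it, which is the kind of saving that matters when the number of compilations is capped.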

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Guidelines for Empirical Studies in Software Engineering involving Large Language Models
cs.SE 2025-08 accept novelty 7.0

The paper delivers a taxonomy of seven LLM study types in software engineering along with eight guidelines that separate mandatory requirements from recommended practices to address reproducibility challenges.

Cross-Task Generalization via Natural Language Crowdsourcing Instructions
cs.CL 2021-04 conditional novelty 7.0

Presents the NATURAL INSTRUCTIONS meta-dataset and shows generative pre-trained language models achieve 19% better generalization to unseen tasks when using task instructions.

Don't Pass@k: A Bayesian Framework for Large Language Model Evaluation
cs.AI 2025-10 unverdicted novelty 6.0

A Dirichlet-prior Bayesian estimator for model success probability replaces Pass@k, delivering faster-converging and more stable rankings with credible intervals on math benchmarks.

Guidelines for Empirical Studies in Software Engineering involving Large Language Models
cs.SE 2025-08 accept novelty 6.0

A group of 22 researchers proposes seven study types and eight guidelines for empirical software engineering studies involving LLMs to enhance reproducibility and replicability.

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
cs.LG 2024-07 unverdicted novelty 6.0

Repeated sampling scales problem coverage log-linearly with sample count, improving SWE-bench Lite performance from 15.9% to 56% using 250 samples.