arxiv: 2606.27122 · v1 · pith:JBONOHUXnew · submitted 2026-06-25 · 💻 cs.PL · cs.MA · cs.SE

Mostly Automatic Translation of Language Interpreters from C to Safe Rust

Pith reviewed 2026-06-26 01:28 UTC · model grok-4.3

classification 💻 cs.PL · cs.MA · cs.SE
keywords C to Rust translation · interpreter translation · safe Rust · feature reduction · multi-agent translation · memory safety · automatic program translation · vulnerability elimination

The pith

Reboot translates C interpreters to safe Rust by breaking the task into testable feature milestones validated automatically, needing only 1-11 brief human fixes per program.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Reboot to convert real C interpreters into memory-safe Rust versions. It decomposes each program by features so the process moves through complete, runnable milestones that can be checked before adding the next layer of complexity. A multi-agent system then coordinates coding steps with automated tests and feedback to stay on track. The result is shown on six interpreters up to 23k lines long, all passing their original test suites fully and most of a separate validation suite, while removing the original memory safety problems. Readers would care if the method scales because interpreters often process untrusted input and are frequent sources of exploitable bugs.

Core claim

Reboot uses feature reduction to split translation into a chain of milestones, each a working program that is tested before the next feature is restored, together with a multi-agent architecture that routes unreliable coding agents through repeated validation and correction loops. This combination produces safe Rust translations of six C interpreters (6k-23k lines) after 1-11 short interventions, all passing the supplied test suites at 100 percent and 62-92 percent on unseen validation tests, with a case study confirming removal of heap buffer overflows and use-after-free bugs.

What carries the argument

Feature reduction: the translation is decomposed into a sequence of complete, testable program milestones that start simple and incrementally restore the original features, with each step validated before the next begins.
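The milestone chain can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `Milestone` type, the `run_chain` driver, and the toy `demo_translate` step are hypothetical names and are not taken from the paper, which drives this process with coding agents rather than a fixed function.

```rust
/// One feature level: a complete program variant plus the tests that apply to it.
/// Hypothetical type for illustration; not taken from the paper.
struct Milestone {
    name: &'static str,
    tests: Vec<(&'static str, &'static str)>, // (input, expected output)
}

/// Walk the milestones from simplest to most complete and validate each one
/// before moving on. `translate` is a stand-in for the translation step: it
/// returns the candidate program for a milestone as a function from input to
/// output. Returns the number of validated milestones, or the name of the
/// first milestone whose tests fail.
fn run_chain<F>(milestones: &[Milestone], translate: F) -> Result<usize, &'static str>
where
    F: Fn(&Milestone) -> Box<dyn Fn(&str) -> String>,
{
    for m in milestones {
        let program = translate(m);
        let all_pass = m
            .tests
            .iter()
            .all(|&(input, expected)| program(input) == expected);
        if !all_pass {
            return Err(m.name);
        }
    }
    Ok(milestones.len())
}

/// Two toy milestones: plain echo, then echo plus a "!" prefix that upper-cases.
fn demo_milestones() -> Vec<Milestone> {
    vec![
        Milestone { name: "echo", tests: vec![("a", "a")] },
        Milestone { name: "echo+upper", tests: vec![("a", "a"), ("!b", "B")] },
    ]
}

/// Toy translation step: enables the upper-case feature only for milestones
/// whose name mentions it.
fn demo_translate(m: &Milestone) -> Box<dyn Fn(&str) -> String> {
    let upper = m.name.contains("upper");
    Box::new(move |input: &str| match input.strip_prefix('!') {
        Some(rest) if upper => rest.to_uppercase(),
        _ => input.to_string(),
    })
}

fn main() {
    match run_chain(&demo_milestones(), demo_translate) {
        Ok(n) => println!("validated {n} milestones"),
        Err(name) => println!("stopped at milestone {name}"),
    }
}
```

The point of the design is that a failure is always attributed to the most recently restored feature, because every earlier milestone has already passed its own tests.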

If this is right

  • Translating an interpreter requires only 1-11 brief interventions rather than a full manual rewrite.
  • Memory vulnerabilities present in the C source are absent from the resulting safe Rust code.
  • Feature reduction raises validation-suite pass rates by 6-20 percent compared with multi-agent translation alone.
  • All six translated interpreters pass every test in the original suites.
  • The same workflow applies across interpreters whose sizes range from 6k to 23k lines.
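To illustrate why an out-of-bounds access that corrupts memory in C cannot do so in safe Rust, here is a small sketch of an interpreter operand stack. The `Stack` type and its methods are hypothetical and are not code from the paper or from mujs; the sketch only shows the general mechanism of checked indexing.

```rust
/// Toy interpreter operand stack. Hypothetical; not code from the paper or mujs.
struct Stack {
    slots: Vec<f64>,
}

impl Stack {
    fn new() -> Self {
        Stack { slots: Vec::new() }
    }

    fn push(&mut self, value: f64) {
        self.slots.push(value);
    }

    /// Read the slot `depth` positions below the top of the stack.
    /// With raw pointer arithmetic in C, an attacker-chosen `depth` could reach
    /// outside the allocation. Here every out-of-range depth yields `None`.
    fn peek(&self, depth: usize) -> Option<f64> {
        let offset = depth.checked_add(1)?;
        let idx = self.slots.len().checked_sub(offset)?;
        self.slots.get(idx).copied()
    }
}

fn main() {
    let mut stack = Stack::new();
    stack.push(1.0);
    stack.push(2.0);
    println!("top = {:?}", stack.peek(0));
    println!("far below the stack = {:?}", stack.peek(5));
}
```

Plain indexing (`self.slots[idx]`) would also be memory-safe, since it panics rather than reading out of bounds; returning `Option` simply lets the interpreter report the error instead of aborting.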

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The milestone approach could be tried on other classes of C programs that are not interpreters, provided similar feature orderings can be identified.
  • The multi-agent validation loop might reduce human effort in other large-scale code-generation tasks that currently rely on unreliable models.
  • If validation tests remain hidden from the system, their pass rates give a practical measure of how well the translation generalizes beyond the original test suites.
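A pass rate on held-out tests is a simple ratio, sketched below under stated assumptions: the `Case` type, `pass_rate`, and `toy_program` are hypothetical names, and the expected outputs are assumed to have been recorded from the original C build. The paper's actual test harness is not described here.

```rust
/// A held-out test case: a program input and the output recorded from the
/// original implementation. Hypothetical type for illustration.
struct Case {
    input: &'static str,
    expected: &'static str,
}

/// Percentage of cases on which `program` reproduces the expected output.
/// Returns `None` for an empty suite, where a rate is undefined.
fn pass_rate<F: Fn(&str) -> String>(program: F, cases: &[Case]) -> Option<f64> {
    if cases.is_empty() {
        return None;
    }
    let passed = cases
        .iter()
        .filter(|c| program(c.input) == c.expected)
        .count();
    Some(100.0 * passed as f64 / cases.len() as f64)
}

/// Stand-in for a translated interpreter: upper-cases its input.
fn toy_program(input: &str) -> String {
    input.to_uppercase()
}

fn main() {
    let cases = [
        Case { input: "a", expected: "A" },
        Case { input: "b", expected: "B" },
        Case { input: "c", expected: "C" },
        Case { input: "d", expected: "d" }, // deliberately mismatched
    ];
    if let Some(rate) = pass_rate(toy_program, &cases) {
        println!("{rate:.1}% of held-out cases pass");
    }
}
```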

Load-bearing premise

The supplied test suites together with the separately written validation tests are enough to guarantee that the translated code behaves correctly and contains no memory safety errors on all inputs that matter.

What would settle it

An input that makes the Rust version produce a different observable result from the original C version, or that triggers a memory error in the C version but is still accepted by the Rust version.
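Searching for such an input is a differential test. The sketch below is a toy version that assumes both implementations can be called as functions from input to output; in practice one would run the two binaries. The names `first_divergence`, `eval_wrapping`, and `eval_widening` are hypothetical, and the two evaluators are stand-ins that differ only in integer width, not models of any interpreter in the paper.

```rust
/// Return the first input on which the two implementations disagree, if any.
fn first_divergence<'a, R, C>(reference: R, candidate: C, inputs: &[&'a str]) -> Option<&'a str>
where
    R: Fn(&str) -> String,
    C: Fn(&str) -> String,
{
    inputs
        .iter()
        .copied()
        .find(|&input| reference(input) != candidate(input))
}

/// Parse "a+b" into two 32-bit operands.
fn parse_operands(src: &str) -> Option<(i32, i32)> {
    let (a, b) = src.split_once('+')?;
    Some((a.trim().parse::<i32>().ok()?, b.trim().parse::<i32>().ok()?))
}

/// Stand-in reference: adds with 32-bit wrap-around.
fn eval_wrapping(src: &str) -> String {
    match parse_operands(src) {
        Some((a, b)) => a.wrapping_add(b).to_string(),
        None => "error".to_string(),
    }
}

/// Stand-in candidate: widens to 64 bits before adding.
fn eval_widening(src: &str) -> String {
    match parse_operands(src) {
        Some((a, b)) => (i64::from(a) + i64::from(b)).to_string(),
        None => "error".to_string(),
    }
}

fn main() {
    let inputs = ["1+2", "2147483647+1", "x+1"];
    match first_divergence(eval_wrapping, eval_widening, &inputs) {
        Some(input) => println!("implementations diverge on {input:?}"),
        None => println!("no divergence on these inputs"),
    }
}
```

A finite input list can only refute equivalence, never establish it, which is the same limitation the load-bearing premise points at.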

Figures

Figures reproduced from arXiv: 2606.27122 by Bo Wang, Brandon Paulsen, Daniel Kroening, Joey Dodds, Prateek Saxena, Umang Mathur.

Figure 1: The workflow of the translation process using …
Figure 2: An example of changes in source code as well as the test suite across feature levels during translation.
Figure 3: Comparison of the regex compilation state in C ( …
Figure 4: Condensed system logs (simplified from real trajectories) showing …
Figure 5: Branch-level workflow control for the Simplification phase.
Figure 6: Branch-level workflow control for the Translation phase.
Figure 7: Three-level implementation architecture of …
Figure 8: Code size comparison across feature levels. The charts show lines of code (LoC) growth as features …
Figure 9: Cost breakdown by activity across six interpreter programs. The pie charts show the distribution of …
Figure 10: Performance evaluation results comparing C baseline with Rust (Reboot) implementations. All …
Figure 11: C code test coverage trend across feature levels. The charts show the percentage of C code covered …
Original abstract

Translating C programs to safe Rust is challenging owing to significant differences in typing constraints, ownership, and borrowing rules. Interpreter programs are particularly important targets for such translation, as they often handle untrusted inputs and suffer from memory-related vulnerabilities. We present Reboot, a mostly-automatic technique that translates real-world interpreter programs from C to safe Rust. Using Reboot, we have translated six interpreters ranging from 6k to 23k lines of C code to safe Rust, with each translation requiring only 1 to 11 brief user interventions. All translations pass 100% of the provided test suites, and achieve 62%–92% pass rates on separately created validation tests that were never exposed to the system. A security case study on mujs shows that memory vulnerabilities such as heap buffer overflows and use-after-free present in C are eliminated in the safe Rust translation. Two ideas underpin Reboot. First, feature reduction decomposes the translation by program features, creating a sequence of milestones where each is a complete, testable program; the translation starts from the simplest version and incrementally restores features, with each milestone validated before proceeding. Second, a multi-agent architecture orchestrates inherently unreliable coding agents through automated validation and feedback, keeping long-running translation workflows on track with minimal human involvement. An ablation study confirms that feature reduction improves translation correctness compared to using multi-agent translation alone, with 6%–20% improvements in pass rates on validation test suites.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper presents Reboot, a mostly-automatic technique for translating C language interpreters to safe Rust. It relies on feature reduction (decomposing the program into a sequence of complete, testable milestones) and a multi-agent architecture (orchestrating coding agents with automated validation and feedback). The authors report translating six real-world interpreters (6k–23k LOC) with only 1–11 brief user interventions each; all translations pass 100% of the supplied test suites and 62%–92% of separately created validation tests, with an ablation confirming the value of feature reduction and a mujs security case study showing elimination of specific memory vulnerabilities such as heap buffer overflows and use-after-free.

Significance. If the reported translations preserve functional behavior on all relevant inputs, the work would be a notable engineering contribution to automated migration of security-critical C code to memory-safe languages. The scale of the evaluated programs, the low intervention counts, the concrete ablation results (6%–20% validation-pass-rate gains), and the use of held-out tests are strengths that distinguish it from purely manual or fully automatic approaches. The multi-agent feedback loop for keeping unreliable agents on track is a practical idea worth further exploration in the PL and software-engineering communities.
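The feedback loop mentioned above can be sketched as a bounded validate-and-retry driver that hands control to a human once the retry budget is spent. This is a minimal sketch under assumptions: `Outcome`, `drive`, `demo_agent`, and `demo_validate` are hypothetical names, the agent and validator are toy functions, and the paper's actual orchestration is more elaborate than a single loop.

```rust
/// Result of driving one task through validation and retries.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// A candidate passed validation on the given attempt (1-based).
    Accepted { attempts: u32 },
    /// The retry budget ran out; a human should look at the last feedback.
    Escalated { last_feedback: String },
}

/// `attempt` stands in for an unreliable coding agent: it receives the feedback
/// from the previous failed validation (if any) and returns a new candidate.
/// `validate` stands in for the automated checks: `Ok(())` or an error message.
fn drive<A, V>(mut attempt: A, validate: V, max_attempts: u32) -> Outcome
where
    A: FnMut(Option<&str>) -> String,
    V: Fn(&str) -> Result<(), String>,
{
    let mut feedback: Option<String> = None;
    for n in 1..=max_attempts {
        let candidate = attempt(feedback.as_deref());
        match validate(&candidate) {
            Ok(()) => return Outcome::Accepted { attempts: n },
            Err(msg) => feedback = Some(msg),
        }
    }
    Outcome::Escalated { last_feedback: feedback.unwrap_or_default() }
}

/// Toy agent: gets it wrong first, then fixes the output once it sees feedback.
fn demo_agent(feedback: Option<&str>) -> String {
    match feedback {
        None => "print(1".to_string(),
        Some(_) => "print(1)".to_string(),
    }
}

/// Toy validator: accepts candidates whose parentheses are balanced in count.
fn demo_validate(candidate: &str) -> Result<(), String> {
    let opens = candidate.matches('(').count();
    let closes = candidate.matches(')').count();
    if opens == closes {
        Ok(())
    } else {
        Err(format!("unbalanced parentheses: {opens} open, {closes} close"))
    }
}

fn main() {
    println!("{:?}", drive(demo_agent, demo_validate, 3));
    println!("{:?}", drive(demo_agent, demo_validate, 1));
}
```

In this framing, the "1–11 brief user interventions" reported for each translation correspond to the escalation branch, although the paper's own criteria for escalating are not reproduced here.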

major comments (1)
  1. [Evaluation] Evaluation section (and abstract claim): the central empirical claim rests on 100% pass rates for supplied suites plus 62%–92% on held-out validation tests. The manuscript does not describe how the validation tests were constructed, what coverage they achieve relative to the interpreters’ input space, or the nature of the failing cases. Because interpreters process untrusted inputs and the validation rates already show divergence, this gap directly affects whether the results establish functional equivalence or merely that the translations match the original on the tested subset.
minor comments (1)
  1. [Security case study] The security case study demonstrates removal of the specific reported vulnerabilities but does not discuss whether the Rust version could introduce new classes of issues (e.g., logic errors that become exploitable only after the memory-safety layer is removed).

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for highlighting the need for greater transparency in our evaluation methodology. We agree that the current description of the validation tests is insufficient and will revise the manuscript accordingly.

  1. Referee: [Evaluation] Evaluation section (and abstract claim): the central empirical claim rests on 100% pass rates for supplied suites plus 62%–92% on held-out validation tests. The manuscript does not describe how the validation tests were constructed, what coverage they achieve relative to the interpreters’ input space, or the nature of the failing cases. Because interpreters process untrusted inputs and the validation rates already show divergence, this gap directly affects whether the results establish functional equivalence or merely that the translations match the original on the tested subset.

    Authors: We agree that the manuscript lacks sufficient detail on the validation tests. In the revised version we will add a new subsection (Evaluation, §5.3) that: (1) explains the construction process: validation suites were manually authored by the authors using each language’s specification, public test corpora, and targeted edge cases deliberately omitted from the original test suites; (2) reports available coverage metrics (statement and branch coverage measured on the original C sources via gcov where feasible, ranging from 41% to 67%); and (3) provides a categorized breakdown of the failing cases (e.g., differences in floating-point rounding, unsupported Unicode edge cases, and minor semantic divergences in error reporting). We will also explicitly state the inherent limits of any finite test suite for interpreters handling untrusted input. These additions will make clear that the reported 62–92% rates reflect partial but non-trivial coverage rather than full functional equivalence.

    revision: yes

Circularity Check

0 steps flagged

No circularity: results are empirical test outcomes, not derived by construction

The paper presents an empirical technique (Reboot) evaluated by applying it to six real C interpreters, running the resulting Rust code against supplied test suites (100% pass) and held-out validation tests (62-92% pass), plus a security case study. No equations, fitted parameters, or derivations are used; the reported pass rates are direct execution results on external inputs. No self-citation chains, ansatzes, or uniqueness theorems underpin the central claims. The method and evaluation are self-contained against the test oracles provided.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the empirical success of the Reboot workflow; no free parameters, mathematical axioms, or new postulated entities are invoked in the abstract.
