pith. machine review for the scientific record.

arxiv: 2604.02852 · v1 · submitted 2026-04-03 · 💻 cs.SE

Recognition: 2 theorem links


Dependency-Guided Repository-Level C-to-Rust Translation with Reinforcement Alignment

Chaozheng Wang, Cuiyun Gao, Feng Luo, Ge Li, Jia Feng, Kui Liu, Wenjie Gan, Xin Xia


Pith reviewed 2026-05-13 20:08 UTC · model grok-4.3

classification 💻 cs.SE
keywords C-to-Rust translation · repository-level migration · reinforcement learning for code · dependency-guided refinement · LLM code generation · software security · cross-file dependencies · code translation benchmark

The pith

DepTrans translates entire C repositories to Rust, using dependency guidance and reinforcement learning to reach a 60.7 percent compilation success rate.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents DepTrans as a way to automate conversion of complete C codebases into Rust while respecting cross-file dependencies. It improves on standard LLM approaches by first applying reinforcement-aligned training that combines multi-task fine-tuning with feedback signals, then running dependency-guided iterative refinement on the generated code. These steps produce Rust output that compiles and runs correctly more often than prior methods on a new benchmark of 145 repositories. The work matters because successful large-scale migration could reduce security risks in legacy C software without requiring full manual rewrites. Results include 43.5 percent computational accuracy and the ability to build 7 of 15 real industrial projects.

Core claim

DepTrans achieves a 60.7 percent compilation success rate and 43.5 percent computational accuracy on repository-level C-to-Rust translation by combining Reinforcement-Aligned Syntax Training (multi-task fine-tuning plus feedback-driven reinforcement learning) with Dependency-Guided Iterative Refinement, which captures fine-grained cross-file dependencies and iteratively refines the output code.

What carries the argument

Dependency-Guided Iterative Refinement, which extracts fine-grained cross-file dependencies from the C source and uses them to guide successive rounds of code correction, paired with Reinforcement-Aligned Syntax Training, which applies multi-task fine-tuning plus reinforcement-learning signals.
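The dependency-extraction step this rests on can be sketched in miniature. The snippet below is an editorial illustration, not the paper's implementation: it uses regular expressions where the authors presumably use a real parser (e.g. tree-sitter), and the function names are invented.

```python
import re
from collections import defaultdict

def extract_dependencies(files):
    """Map each C file to the files it depends on, where file A depends
    on file B when A calls a function defined in B. A toy stand-in for
    the paper's fine-grained cross-file dependency extraction."""
    # Locate function definitions: "return-type name(args) {".
    def_pat = re.compile(r'^\s*\w[\w\s\*]*?\b(\w+)\s*\([^;{}]*\)\s*\{', re.M)
    defined_in = {}
    for fname, src in files.items():
        for m in def_pat.finditer(src):
            defined_in[m.group(1)] = fname

    # Record calls that resolve to a definition in a different file.
    call_pat = re.compile(r'\b(\w+)\s*\(')
    deps = defaultdict(set)
    for fname, src in files.items():
        for m in call_pat.finditer(src):
            home = defined_in.get(m.group(1))
            if home is not None and home != fname:
                deps[fname].add(home)
    return dict(deps)
```

A map like this fixes a translation order (dependencies first) and tells the model which externally defined symbols each file's Rust rendering must respect.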

If this is right

  • Repository-level translation becomes practical for a larger fraction of existing C projects, as evidenced by successful builds on 7 of 15 industrial cases.
  • LLM-based code migration can avoid the accuracy loss that comes from either ignoring dependencies or loading entire files as context.
  • Structured inference steps make it feasible to verify syntactic correctness and functional equivalence at the scale of full repositories rather than single functions.
  • Performance improves measurably when reinforcement signals are added to standard fine-tuning on parallel C-Rust data.
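The translate-then-refine loop implied by these points can be written down generically. This is a hedged skeleton: `compile_fn` and `repair_fn` stand in for the real compiler invocation and model call, neither of which is specified in the review above.

```python
def refine_iteratively(code, compile_fn, repair_fn, max_rounds=3):
    """Iterative refinement skeleton: compile the candidate Rust,
    feed the diagnostics back to a repair step, and repeat.

    compile_fn(code) -> list of error strings ([] means it compiles);
    repair_fn(code, errors) -> revised code.
    Returns the final code and whether it compiles.
    """
    for _ in range(max_rounds):
        errors = compile_fn(code)
        if not errors:
            return code, True
        code = repair_fn(code, errors)
    return code, not compile_fn(code)
```

In a real pipeline, `compile_fn` would shell out to `cargo build` and `repair_fn` would prompt the fine-tuned model with the error text plus the dependency context for the file being fixed.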

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same dependency-guided loop could be applied to migration tasks between other language pairs that share complex module structures.
  • Further iterations of the refinement process might be extended to enforce Rust-specific safety properties such as ownership rules beyond basic compilation.
  • The approach could be tested on repositories that contain heavy use of macros or pointer arithmetic to check robustness on harder C idioms.
  • Combining the current reinforcement signal with additional static-analysis oracles might produce even tighter equivalence guarantees.

Load-bearing premise

The constructed 85k training samples and 145-repository benchmark are representative enough of real-world cross-file dependencies that the measured gains from reinforcement and iterative refinement will hold on unseen projects.

What would settle it

Evaluating DepTrans on a fresh collection of 20 industrial C repositories drawn from a different domain and checking whether the compilation success rate falls below 45 percent.

Figures

Figures reproduced from arXiv: 2604.02852 by Chaozheng Wang, Cuiyun Gao, Feng Luo, Ge Li, Jia Feng, Kui Liu, Wenjie Gan, Xin Xia.

Figure 1
Figure 1: Overview of DepTrans. …remains a bottleneck. LLMs frequently struggle with cross-file structural dependencies and lack a specialized "translation-detection-repair" synergy. To bridge this gap, we propose Reinforcement-Aligned Syntax Training (RAST), a two-stage paradigm designed to bolster the model's fundamental awareness of global repository logic and its adaptive capacity for the migration task, as show… view at source ↗
Figure 2
Figure 2: Case study of authenticate_stream translation gen… view at source ↗
Figure 3
Figure 3: Impact of compilation repair and consistency repair. view at source ↗
read the original abstract

Automating C-to-Rust migration is critical for improving software security without sacrificing performance. Traditional rule-based methods struggle with diverse C idioms, often producing rigid and unidiomatic Rust code. Large Language Models (LLMs), trained on massive code corpora, offer a promising alternative by leveraging cross-language generalization to generate more idiomatic and maintainable Rust code. However, several challenges remain. First, existing LLM-based approaches fail to handle cross-file dependencies effectively, either ignoring them or including entire files as context, which limits accurate dependency modeling. Second, complex dependencies and structured inputs and outputs make it difficult to verify syntactic correctness and functional equivalence at the repository level. Third, the lack of large-scale C-Rust parallel data constrains model performance. We propose DepTrans, a framework that combines model capability enhancement with structured inference. DepTrans introduces Reinforcement-Aligned Syntax Training to improve generation quality through multi-task fine-tuning and feedback-driven reinforcement learning. It further applies Dependency-Guided Iterative Refinement to capture fine-grained cross-file dependencies and iteratively refine generated Rust code. We construct a dataset of 85k training samples and a benchmark of 145 repository-level instances. Experiments show that DepTrans achieves a 60.7 percent compilation success rate and 43.5 percent computational accuracy, outperforming the strongest baseline by 22.8 and 17.3 percentage points. It also successfully builds 7 of 15 industrial C projects, demonstrating its practical potential.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces DepTrans, a framework for repository-level C-to-Rust translation. It combines Reinforcement-Aligned Syntax Training (multi-task fine-tuning plus feedback-driven RL) with Dependency-Guided Iterative Refinement to model cross-file dependencies. The authors build an 85k-sample training set and a 145-repository benchmark, reporting 60.7% compilation success and 43.5% computational accuracy (outperforming the strongest baseline by 22.8 and 17.3 points) plus successful builds on 7 of 15 industrial C projects.

Significance. If the evaluation holds, the work advances automated migration by explicitly addressing cross-file dependencies at repository scale rather than treating files in isolation. The combination of RL syntax alignment and iterative refinement is a concrete methodological contribution, and the industrial-project results add practical relevance beyond synthetic benchmarks.

major comments (2)
  1. [§4] §4 (Experimental Setup): The 145-repository benchmark is described as derived from the 85k training samples, yet no selection criteria, dependency-graph statistics (e.g., average number of cross-file calls or include depth), or distribution analysis are provided. This is load-bearing for the central claim because the reported 60.7% and 43.5% figures cannot be interpreted as evidence of improved dependency handling without evidence that the benchmark reflects representative real-world cross-file structures.
  2. [§5] §5 (Results and Ablations): No ablation isolates the contribution of Dependency-Guided Iterative Refinement or the reinforcement-learning stage from the base LLM. The 22.8-point gain over the strongest baseline is therefore unattributed, leaving open the possibility that the improvement stems from data scale or prompting rather than the proposed dependency and RL mechanisms.
minor comments (1)
  1. [Abstract] The abstract and §3.1 would benefit from a brief statement of how the 85k parallel samples were filtered or deduplicated to avoid test-set leakage into the training distribution.
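One minimal form the requested filtering could take, offered as an illustration rather than the authors' procedure: drop any training sample whose whitespace-normalized code collides with a benchmark item. Embedding-based semantic deduplication (as in SemDeDup) would catch paraphrased leakage that this exact-match pass misses.

```python
import hashlib

def _norm_hash(code):
    """Hash code with whitespace collapsed, so trivially reformatted
    copies collide."""
    return hashlib.sha256(" ".join(code.split()).encode()).hexdigest()

def dedup_against_benchmark(train_samples, benchmark_samples):
    """Drop training samples whose normalized code appears in the
    benchmark. A sketch of leakage filtering, not the paper's
    (unreported) procedure."""
    bench = {_norm_hash(s) for s in benchmark_samples}
    return [s for s in train_samples if _norm_hash(s) not in bench]
```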

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of experimental rigor that we will address through targeted revisions. Below we respond point by point to the major comments.

read point-by-point responses
  1. Referee: §4 (Experimental Setup): The 145-repository benchmark is described as derived from the 85k training samples, yet no selection criteria, dependency-graph statistics (e.g., average number of cross-file calls or include depth), or distribution analysis are provided. This is load-bearing for the central claim because the reported 60.7% and 43.5% figures cannot be interpreted as evidence of improved dependency handling without evidence that the benchmark reflects representative real-world cross-file structures.

    Authors: We agree that the manuscript currently lacks sufficient detail on benchmark construction. In the revised version we will add a dedicated subsection in §4 that specifies the selection criteria (repositories must contain at least three cross-file function calls and an include depth of two or greater), reports dependency-graph statistics (average cross-file calls per repository, average include depth, and distribution of repository sizes), and includes an analysis of how the 145 instances were sampled from the 85k pool to ensure representativeness of real-world dependency structures. revision: yes

  2. Referee: §5 (Results and Ablations): No ablation isolates the contribution of Dependency-Guided Iterative Refinement or the reinforcement-learning stage from the base LLM. The 22.8-point gain over the strongest baseline is therefore unattributed, leaving open the possibility that the improvement stems from data scale or prompting rather than the proposed dependency and RL mechanisms.

    Authors: We acknowledge that the current results section does not contain component-wise ablations for the reinforcement-learning stage or the Dependency-Guided Iterative Refinement. We will add these ablations to §5 in the revision, comparing the full DepTrans model against (i) the base LLM fine-tuned only on the 85k samples and (ii) the model with RL alignment but without iterative refinement. The new tables will quantify the incremental contribution of each proposed mechanism. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper describes an empirical ML system (DepTrans) for repository-level C-to-Rust translation that combines fine-tuning, reinforcement learning, and iterative refinement. All load-bearing claims are experimental performance numbers (compilation success rate, computational accuracy) measured against external outcomes on a held-out benchmark of 145 repositories plus 15 industrial projects. These metrics are not defined by the model's fitted parameters, nor do any equations or steps reduce the reported results to self-referential inputs by construction. No self-definitional relations, fitted-input predictions, or load-bearing self-citation chains appear in the provided text. The evaluation is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach rests on the standard assumption that large language models trained on code corpora can be further aligned via reinforcement learning to produce compilable output, plus the domain assumption that cross-file call graphs extracted from C can be reliably mapped to Rust contexts. No new physical entities or ad-hoc constants are introduced.

axioms (2)
  • domain assumption Large language models trained on code can be improved for compilation and functional correctness via reinforcement learning signals derived from compiler feedback.
    Invoked in the description of Reinforcement-Aligned Syntax Training.
  • domain assumption Fine-grained cross-file dependencies can be extracted from C source and used as additional context to improve translation accuracy.
    Central to Dependency-Guided Iterative Refinement.
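To make the first axiom concrete, a compiler-feedback reward might look like the following. The shaping and weights are editorial assumptions; the paper does not publish its reward function in the text excerpted here.

```python
def compile_reward(compiled, tests_passed, tests_total):
    """Toy scalar reward for feedback-driven RL on generated Rust:
    penalize uncompilable output, give partial credit for compiling,
    and scale the rest by the fraction of passing tests. Weights are
    illustrative, not the paper's."""
    if not compiled:
        return -1.0
    if tests_total == 0:
        return 0.5          # compiles, but no functional signal
    return 0.5 + 0.5 * (tests_passed / tests_total)
```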

pith-pipeline@v0.9.0 · 5578 in / 1584 out tokens · 36342 ms · 2026-05-13T20:08:04.489804+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages · 6 internal anchors

  1. [1]

    [n. d.]. Rust (programming language). https://en.wikipedia.org/wiki/Rust_(programming_language)

  2. [2]

    [n. d.]. tree-sitter. 2023. https://github.com/tree-sitter/tree-sitter

  3. [3]

    Amro Abbas, Kushal Tirumala, Dániel Simig, Surya Ganguli, and Ari S Morcos. 2023. SemDeDup: Data-efficient learning at web-scale through semantic deduplication. arXiv preprint arXiv:2303.09540 (2023)

  4. [4]

    Yannick Assogba and Donghao Ren. [n. d.]. Evaluating Long Range Dependency Handling in Code Generation LLMs. Transactions on Machine Learning Research ([n. d.])

  5. [5]

    Xuemeng Cai, Jiakun Liu, Xiping Huang, Yijun Yu, Haitao Wu, Chunmiao Li, Bo Wang, Imam Nur Bani Yusuf, and Lingxiao Jiang. 2025. RustMap: Towards Project-Scale C-to-Rust Migration via Program Analysis and LLM. CoRR abs/2503.17741 (2025). arXiv:2503.17741 doi:10.48550/ARXIV.2503.17741

  6. [6]

    Jianlyu Chen, Shitao Xiao, Peitian Zhang, Kun Luo, Defu Lian, and Zheng Liu

  7. [7]

    M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation. In Findings of the Association for Computational Linguistics, ACL 2024, Bangkok, Thailand and virtual meeting, August 11-16, 2024, Lun-Wei Ku, Andre Martins, and Vivek Srikumar (Eds.). Association for Computational Linguistics, 2318...

  8. [8]

    John Criswell, Nicolas Geoffray, and Vikram S. Adve. 2009. Memory Safety for Low-Level Software/Hardware Interactions. In 18th USENIX Security Symposium, Montreal, Canada, August 10-14, 2009, Proceedings, Fabian Monrose (Ed.). USENIX Association, 83–100

  9. [9]

    DARPA. [n. d.]. TRACTOR: Translating All C to Rust. https://www.darpa.mil/ research/programs/translating-all-c-to-rust

  10. [10]

    Mehmet Emre, Peter Boyland, Aesha Parekh, Ryan Schroeder, Kyle Dewey, and Ben Hardekopf. 2023. Aliasing Limits on Translating C to Safe Rust. Proc. ACM Program. Lang. 7, OOPSLA1 (2023), 551–579. doi:10.1145/3586046

  11. [11]

    Mehmet Emre, Ryan Schroeder, Kyle Dewey, and Ben Hardekopf. 2021. Translating C to safer Rust. Proc. ACM Program. Lang. 5, OOPSLA (2021), 1–29. doi:10.1145/3485498

  12. [12]

    Jia Feng, Jiachen Liu, Cuiyun Gao, Chun Yong Chong, Chaozheng Wang, Shan Gao, and Xin Xia. 2024. ComplexCodeEval: A benchmark for evaluating large code models on more complex code. In Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. 1895–1906

  13. [13]

    Pietro Ferrara, Vincenzo Arceri, and Agostino Cortesi. 2024. Challenges of software verification: the past, the present, the future. International Journal on Software Tools for Technology Transfer 26, 4 (2024), 421–430

  14. [14]

    Aymeric Fromherz and Jonathan Protzenko. 2024. Compiling C to Safe Rust, Formalized. CoRR abs/2412.15042 (2024). arXiv:2412.15042 doi:10.48550/ARXIV.2412.15042

  15. [15]

    Yifei Gao, Chengpeng Wang, Pengxiang Huang, Xuwei Liu, Mingwei Zheng, and Xiangyu Zhang. 2025. PR2: Peephole Raw Pointer Rewriting with LLMs for Translating C to Safer Rust. CoRR abs/2505.04852 (2025). arXiv:2505.04852 doi:10.48550/ARXIV.2505.04852

  16. [16]

    Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Peiyi Wang, Qihao Zhu, Runxin Xu, Ruoyu Zhang, Shirong Ma, Xiao Bi, et al. 2025. DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning. Nature 645, 8081 (2025), 633–638

  17. [17]

    Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Yu Wu, YK Li, et al. 2024. DeepSeek-Coder: When the Large Language Model Meets Programming – The Rise of Code Intelligence. arXiv preprint arXiv:2401.14196 (2024)

  18. [18]

    Jaemin Hong and Sukyoung Ryu. 2023. Concrat: An Automatic C-to-Rust Lock API Translator for Concurrent Programs. In 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023, Melbourne, Australia, May 14-20, 2023. IEEE, 716–728. doi:10.1109/ICSE48619.2023.00069

  19. [19]

    Jaemin Hong and Sukyoung Ryu. 2024. To Tag, or Not to Tag: Translating C's Unions to Rust's Tagged Unions. In Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, ASE 2024, Sacramento, CA, USA, October 27 - November 1, 2024, Vladimir Filkov, Baishakhi Ray, and Minghui Zhou (Eds.). ACM, 40–52. doi:10.1145/3691620.3694985

  20. [20]

    Jaemin Hong and Sukyoung Ryu. 2025. Forcrat: Automatic I/O API Translation from C to Rust via Origin and Capability Analysis. arXiv:2506.01427 [cs.SE] https://arxiv.org/abs/2506.01427

  21. [21]

    Baizhou Huang, Shuai Lu, Xiaojun Wan, and Nan Duan. 2024. Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11-16, 2024, Lun-Wei Ku, Andre Martins, and Vivek Srikumar (E...

  22. [22]

    Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, et al. 2024. Qwen2. 5-coder technical report.arXiv preprint arXiv:2409.12186(2024)

  23. [23]

    Immunant, Inc. [n. d.]. C2Rust: Source-to-Source C-to-Rust Transpiler. https://c2rust.com

  24. [24]

    Mohammad Abdullah Matin Khan, M. Saiful Bari, Xuan Do Long, Weishi Wang, Md. Rizwan Parvez, and Shafiq Joty. 2024. XCodeEval: An Execution-based Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: L...

  25. [25]

    Anirudh Khatry, Robert Zhang, Jia Pan, Ziteng Wang, Qiaochu Chen, Greg Durrett, and Isil Dillig. 2025. CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation. CoRR abs/2504.15254 (2025). arXiv:2504.15254 doi:10.48550/ARXIV.2504.15254

  26. [26]

    Per Larsen. 2024. Migrating C to Rust for Memory Safety. IEEE Secur. Priv. 22, 4 (2024), 22–29. doi:10.1109/MSEC.2024.3385357

  27. [27]

    Kornel Lesiński. [n. d.]. Speed of Rust vs C. https://kornel.ski/rust-c-speed

  28. [28]

    Jia Li, Hao Zhu, Huanyu Liu, Xianjie Shi, He Zong, Yihong Dong, Kechi Zhang, Siyuan Jiang, Zhi Jin, and Ge Li. 2025. aiXcoder-7B-v2: Training LLMs to Fully Utilize the Long Context in Repository-level Code Completion. arXiv preprint arXiv:2503.15301 (2025)

  29. [29]

    Ruishi Li, Bo Wang, Tianyu Li, Prateek Saxena, and Ashish Kundu. 2025. Translating C To Rust: Lessons from a User Study. In 32nd Annual Network and Distributed System Security Symposium, NDSS 2025, San Diego, California, USA, February 24-28, 2025. The Internet Society

  31. [31]

    Shuqing Li, Qisheng Zheng, Cuiyun Gao, Jia Feng, and Michael R. Lyu. 2025. Extended Reality Cybersickness Assessment via User Review Analysis.Proc. ACM Softw. Eng.2, ISSTA, Article ISSTA058 (June 2025), 23 pages. doi:10.1145/3728933

  32. [32]

    Jiaheng Liu, Dawei Zhu, Zhiqi Bai, Yancheng He, Huanxuan Liao, Haoran Que, Zekun Wang, Chenchen Zhang, Ge Zhang, Jiebin Zhang, et al. 2025. A comprehensive survey on long context language modeling. arXiv preprint arXiv:2503.17407 (2025)

  33. [33]

    Feng Luo, Kexing Ji, Cuiyun Gao, Shuzheng Gao, Jia Feng, Kui Liu, Xin Xia, and Michael R Lyu. 2025. Integrating Rules and Semantics for LLM-Based C-to-Rust Translation.arXiv preprint arXiv:2508.06926(2025)

  34. [34]

    Microsoft Security Response Center. 2019. A Proactive Approach to More Secure Code. https://msrc.microsoft.com/blog/2019/07/a-proactive-approach-to-more- secure-code/

  35. [35]

    Vikram Nitin, Rahul Krishna, Luiz Lemos do Valle, and Baishakhi Ray. 2025. C2SaferRust: Transforming C Projects into Safer Rust with NeuroSymbolic Techniques. CoRR abs/2501.14257 (2025). arXiv:2501.14257 doi:10.48550/ARXIV.2501.14257

  36. [36]

    Rangeet Pan, Ali Reza Ibrahimzada, Rahul Krishna, Divya Sankar, Lambert Pouguem Wassi, Michele Merler, Boris Sobolev, Raju Pavuluri, Saurabh Sinha, and Reyhaneh Jabbarvand. 2024. Lost in translation: A study of bugs introduced by large language models while translating code. In Proceedings of the IEEE/ACM 46th International Conference on Software Enginee...

  37. [37]

    Shuo Ren, Daya Guo, Shuai Lu, Long Zhou, Shujie Liu, Duyu Tang, Neel Sundaresan, Ming Zhou, Ambrosio Blanco, and Shuai Ma. 2020. CodeBLEU: a method for automatic evaluation of code synthesis. arXiv preprint arXiv:2009.10297 (2020)

  38. [38]

    Manish Shetty, Naman Jain, Adwait Godbole, Sanjit A. Seshia, and Koushik Sen

  39. [39]

    Syzygy: Dual Code-Test C to (safe) Rust Translation using LLMs and Dynamic Analysis. CoRR abs/2412.14234 (2024). arXiv:2412.14234 doi:10.48550/ARXIV.2412.14234

  40. [41]

    HoHyun Sim, Hyeonjoong Cho, Yeonghyeon Go, Zhoulai Fu, Ali Shokri, and Binoy Ravindran. 2025. Large Language Model-Powered Agent for C to Rust Code Translation. CoRR abs/2505.15858 (2025). arXiv:2505.15858 doi:10.48550/ARXIV.2505.15858

  41. [42]

    Chaozheng Wang, Jia Feng, Shuzheng Gao, Cuiyun Gao, Zongjie Li, Ting Peng, Hailiang Huang, Yuetang Deng, and Michael Lyu. 2025. Beyond PEFT: Layer-Wise Optimization for More Effective and Efficient Large Code Model Tuning.Proc. ACM Softw. Eng.2, FSE, Article FSE071 (June 2025), 24 pages. doi:10.1145/3729341

  42. [43]

    Chaofan Wang, Guanjie Qiu, Xiaodong Gu, and Beijun Shen. 2025. APIRAT: Integrating Multi-source API Knowledge for Enhanced Code Translation with LLMs.arXiv preprint arXiv:2504.14852(2025)

  43. [44]

    Ruiqi Wang, Jiyu Guo, Cuiyun Gao, Guodong Fan, Chun Yong Chong, and Xin Xia. 2025. Can llms replace human evaluators? an empirical study of llm-as-a- judge in software engineering.Proceedings of the ACM on Software Engineering 2, ISSTA (2025), 1955–1977

  44. [45]

    Aidan Z. H. Yang, Yoshiki Takashima, Brandon Paulsen, Josiah Dodds, and Daniel Kroening. 2024. VERT: Verified Equivalent Rust Transpilation with Large Language Models as Few-Shot Learners. arXiv:2404.18852 [cs.PL] https://arxiv.org/abs/2404.18852

  45. [46]

    Shiyu Yang, Yusheng Guo, Akihiro Tabata, and Yoshiki Higo. 2025. Constructing a Dataset of Functionally Equivalent Python Methods Using Test Generation Techniques. IEICE Transactions on Information and Systems (2025)

  46. [47]

    Zhen Yang, Fang Liu, Zhongxing Yu, Jacky Wai Keung, Jia Li, Shuo Liu, Yifan Hong, Xiaoxue Ma, Zhi Jin, and Ge Li. 2024. Exploring and Unleashing the Power of Large Language Models in Automated Code Translation.Proc. ACM Softw. Eng.1, FSE (2024), 1585–1608. doi:10.1145/3660778

  47. [48]

    Shangbo Yun, Shuhuai Lin, Xiaodong Gu, and Beijun Shen. 2024. Project-specific code summarization with in-context learning.Journal of Systems and Software 216 (2024), 112149

  48. [49]

    Hanliang Zhang, Cristina David, Yijun Yu, and Meng Wang. 2023. Ownership Guided C to Rust Translation. In Computer Aided Verification - 35th International Conference, CAV 2023, Paris, France, July 17-22, 2023, Proceedings, Part III (Lecture Notes in Computer Science, Vol. 13966), Constantin Enea and Akash Lal (Eds.). Springer, 459–482. doi:10.1007/978-3-0...

  49. [50]

    Ziyin Zhang, Chaoyu Chen, Bingchang Liu, Cong Liao, Zi Gong, Hang Yu, Jianguo Li, and Rui Wang. 2024. Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code. Trans. Mach. Learn. Res. 2024 (2024). https://openreview.net/forum?id=hkNnGqZnpa

  50. [51]

    Tianyang Zhou, Haowen Lin, Somesh Jha, Mihai Christodorescu, Kirill Levchenko, and Varun Chandrasekaran. 2025. LLM-Driven Multi-step Translation from C to Rust using Static Analysis.CoRRabs/2503.12511 (2025). arXiv:2503.12511 doi:10.48550/ARXIV.2503.12511