arxiv: 2510.10956 · v3 · submitted 2025-10-13 · 💻 cs.SE · cs.AI

Project-Level C-to-Rust Translation via Pointer Knowledge Graphs

Zhiqiang Yuan, Wenjun Mao, Zhuo Chen, Xiyue Shang, Chong Wang, Yiling Lou, Xin Peng

Pith reviewed 2026-05-18 08:22 UTC · model grok-4.3

classification 💻 cs.SE cs.AI

keywords C-to-Rust translation · pointer knowledge graph · large language models · memory safety · project-level translation · unsafe code reduction · pointer semantics

The pith

A pointer knowledge graph gives LLMs the global view needed to translate entire C projects into safe Rust.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a Pointer Knowledge Graph that records how pointers flow through a full C project and lifts struct interactions to clearer abstractions. It also records Rust-specific details such as ownership, mutability, nullability, and lifetime constraints. When this graph is supplied to large language models, they can translate the project as a whole instead of function by function, avoiding the pointer errors that arise from missing context. A sympathetic reader would care because the result is Rust code that stays functionally correct while eliminating nearly all unsafe blocks.

Core claim

The authors claim that a Pointer Knowledge Graph, formed by adding pointer usage flows and Rust-oriented annotations to standard dependency graphs, supplies LLMs with enough global semantics to produce project-level C-to-Rust translations that are both functionally correct and almost free of unsafe constructs.

What carries the argument

The Pointer Knowledge Graph, which augments code dependency graphs with points-to flows, lifted struct interactions, and annotations for ownership, mutability, nullability, and lifetime.
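To make the Rust-oriented annotations more concrete, the sketch below models one annotation record per pointer and derives a candidate Rust type from it. It is illustrative only: the names `Ownership`, `PointerAnnotation`, and `suggest_rust_type` are assumptions, and the mapping follows common Rust idioms rather than any rule set stated in the paper. Lifetimes are left out to keep the example short.

```rust
/// Hypothetical annotation record for one C pointer (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ownership {
    Owning,
    Borrowed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PointerAnnotation {
    pub ownership: Ownership,
    pub mutable: bool,
    pub nullable: bool,
}

/// Suggest a Rust type for a pointer to `pointee` using common idioms:
/// owning -> Box<T>, borrowed -> &T or &mut T, nullable -> Option<...>.
pub fn suggest_rust_type(pointee: &str, ann: PointerAnnotation) -> String {
    let base = match (ann.ownership, ann.mutable) {
        (Ownership::Owning, _) => format!("Box<{pointee}>"),
        (Ownership::Borrowed, true) => format!("&mut {pointee}"),
        (Ownership::Borrowed, false) => format!("&{pointee}"),
    };
    if ann.nullable {
        format!("Option<{base}>")
    } else {
        base
    }
}
```

Under these assumptions, an owning, nullable pointer to `Node` maps to `Option<Box<Node>>`, while a read-only, non-null borrow of `quadtree_t` maps to `&quadtree_t`.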

If this is right

Project-level translations maintain dependencies across the entire codebase rather than isolating functions.
Generated Rust requires far fewer unsafe blocks than rule-based translators or standard LLM pipelines.
Functional correctness on test suites exceeds results from fuzzing-enhanced LLM baselines.
Pointer-related errors are reduced because the model sees usage patterns and lifetime constraints at once.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same graph structure could support translation between other pairs of languages that differ in memory safety rules.
Incremental updates to the graph might allow ongoing translation as a C project evolves over time.
Similar semantic graphs could address other cross-language issues such as concurrency or error handling.

Load-bearing premise

The pointer knowledge graph can be constructed accurately from the C source and supplies LLMs with enough global pointer information to generate functionally correct Rust without further manual fixes.

What would settle it

Running the method on a collection of C projects and finding that the output Rust still contains high rates of unsafe code or fails functional tests at rates comparable to earlier approaches would show the graph does not deliver the claimed benefits.

Figures

Figures reproduced from arXiv: 2510.10956 by Chong Wang, Wenjun Mao, Xin Peng, Xiyue Shang, Yiling Lou, Zhiqiang Yuan, Zhuo Chen.

**Figure 1.** Motivating Example. Adjacent paper text: “… PTRMAPPER translation performance. II. RELATED WORK. Code translation has been extensively studied in existing literature [28], [29], [30], [31], [32], [33], [34]. Given the substantial differences among programming languages, code translation techniques are typically tailored to specific language pairs (e.g., C-to-Rust, Java-to-Python, Python-to-C). Therefore, in this section, we focus o…”

**Figure 2.** Workflow of PTRMAPPER. Adjacent paper text: “… the structure pointed to by tree (i.e., quadtree_t). After being passed to insert_, tree only reads its key_free member, while root accesses all its members (e.g., point and key). By leveraging this global usage information, LLMs extract key_free from quadtree_t as a separate function parameter, avoiding borrow conflicts and improving both the correctness and safety of the Rust transl…”
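The quadtree refactoring described with Figure 2 can be sketched in Rust as follows. This is a simplified illustration, not the paper's generated code: the struct layouts, the `freed` counter, and the helper `drop_key` are assumptions made so the example compiles on its own. The point is that `insert_` receives the node and the `key_free` callback as separate parameters, so it never needs a borrow of the whole tree alongside a mutable borrow of its root.

```rust
// Simplified stand-ins for the C structs; the field layout is an assumption.
pub struct Node {
    pub key: Option<String>,
    pub freed: usize, // counts how many old keys were released (illustrative)
}

pub struct Quadtree {
    pub root: Node,
    pub key_free: fn(String),
}

/// Hypothetical callback that releases a key.
pub fn drop_key(key: String) {
    drop(key);
}

/// Takes the node mutably and the callback by value. A signature such as
/// `insert_(tree: &Quadtree, root: &mut Node, ...)` would require a shared
/// borrow of the tree while its `root` field is mutably borrowed.
pub fn insert_(root: &mut Node, key_free: fn(String), new_key: String) {
    if let Some(old) = root.key.replace(new_key) {
        key_free(old);
        root.freed += 1;
    }
}

pub fn insert(tree: &mut Quadtree, new_key: String) {
    // `fn` pointers are `Copy`, so reading `tree.key_free` ends that borrow
    // before `tree.root` is borrowed mutably.
    let key_free = tree.key_free;
    insert_(&mut tree.root, key_free, new_key);
}
```

Calling `insert` twice on the same tree replaces the stored key and invokes the callback once for the key that was displaced.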

**Figure 3.** Definition and Usage Context of string_table_t [59] Struct. Adjacent paper text: “… Downward analysis tracks the parameter’s usage within the current function and in any downstream functions when it is passed as an argument. Upward analysis considers all direct call sites, capturing how the parameter is used after being passed as an actual argument to the current function. • For function return pointers, we combine internal analys…”
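The downward analysis mentioned in the text around Figure 3 can be approximated by a reachability check over per-function usage facts. The sketch below is a minimal model under stated assumptions: `ParamUse` and `needs_mut` are hypothetical names, each function is reduced to "does it write through the pointer" and "which callees receive the pointer", and upward analysis over call sites is not modeled.

```rust
use std::collections::{HashMap, HashSet};

/// Per-function facts about one pointer parameter (hypothetical model).
pub struct ParamUse {
    /// The function writes through the pointer.
    pub writes: bool,
    /// Callees that receive the pointer as an argument.
    pub passes_to: Vec<&'static str>,
}

/// Downward analysis: the pointer needs mutable access in `func` if `func`,
/// or any function reachable through `passes_to`, writes through it.
pub fn needs_mut(func: &str, facts: &HashMap<&'static str, ParamUse>) -> bool {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut stack: Vec<&str> = vec![func];
    while let Some(f) = stack.pop() {
        if !seen.insert(f) {
            continue; // already visited; also guards against recursion
        }
        if let Some(usage) = facts.get(f) {
            if usage.writes {
                return true;
            }
            for &callee in &usage.passes_to {
                stack.push(callee);
            }
        }
    }
    false
}
```

A result of `true` would suggest `&mut T` for the parameter and `false` would suggest `&T`, although a real analysis would also have to account for aliasing and escapes.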

**Figure 4.** Code Translation Prompt. Adjacent paper text: “… insert_, the extracted semantic knowledge is expressed as triples such as <tree, pointsTo, quadtree_t> and <tree, isA, Borrowed>. Furthermore, if two pointer parameters within a function exhibit a derivesFrom relationship and different mutability, PTRMAPPER explicitly provides refactoring guidance of LLMs with the following format. Param 1 derivesFrom Param 2, where Param 1 requir…”
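The triple notation quoted with Figure 4 lends itself to a small data model. The following sketch stores triples as plain strings and flags `derivesFrom` pairs whose mutability differs, which is the condition the text says triggers refactoring guidance. The relation name `mutability`, its values, and the example edge `<root, derivesFrom, tree>` are assumptions made for illustration; only `pointsTo`, `isA`, and `derivesFrom` appear in the source.

```rust
/// A knowledge-graph triple such as <tree, pointsTo, quadtree_t>.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Triple {
    pub subject: String,
    pub relation: String,
    pub object: String,
}

pub fn triple(s: &str, r: &str, o: &str) -> Triple {
    Triple { subject: s.to_string(), relation: r.to_string(), object: o.to_string() }
}

/// Look up the (hypothetical) `mutability` annotation of a pointer.
fn mutability_of<'a>(kg: &'a [Triple], name: &str) -> Option<&'a str> {
    kg.iter()
        .find(|t| t.subject == name && t.relation == "mutability")
        .map(|t| t.object.as_str())
}

/// Return (derived, source) pairs linked by `derivesFrom` whose mutability
/// annotations are both present and different.
pub fn mutability_conflicts(kg: &[Triple]) -> Vec<(String, String)> {
    kg.iter()
        .filter(|t| t.relation == "derivesFrom")
        .filter(|t| {
            let derived = mutability_of(kg, &t.subject);
            let source = mutability_of(kg, &t.object);
            derived.is_some() && source.is_some() && derived != source
        })
        .map(|t| (t.subject.clone(), t.object.clone()))
        .collect()
}
```

Each returned pair could then be rendered into a prompt line; the exact wording of that guidance is truncated in the source, so it is not reproduced here.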

**Figure 5.** Translation Order Based on Dependency Graph. Numbers indicate the translation order.

**Figure 6.** Error Correction Prompt. Adjacent paper text: “… annotations (e.g., <root, isA, Owning>) from the Rust copy of pointer KG. These elements are incorporated into a correction prompt for the LLM. The LLM resolves the type mismatch by replacing tree.root with tree.root.as_deref(), which preserves the ownership and borrowing semantics of the original program design. In contrast, relying solely on LLMs for repair based on error descript…”
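The `as_deref()` repair mentioned with Figure 6 relies on a standard-library method on `Option`. The sketch below shows the pattern in isolation; the struct definitions and the function names `node_key` and `root_key` are assumptions, with `root` modeled as an owning, nullable pointer in line with the <root, isA, Owning> annotation.

```rust
pub struct Node {
    pub key: i32,
}

pub struct Quadtree {
    /// Owning and nullable, hence `Option<Box<Node>>`.
    pub root: Option<Box<Node>>,
}

/// Reads a node without taking ownership; accepts an optional borrow.
pub fn node_key(node: Option<&Node>) -> Option<i32> {
    node.map(|n| n.key)
}

pub fn root_key(tree: &Quadtree) -> Option<i32> {
    // `tree.root` has type `Option<Box<Node>>`, so passing it directly is a
    // type mismatch. `as_deref` converts `&Option<Box<Node>>` into
    // `Option<&Node>` and leaves ownership with the tree.
    node_key(tree.root.as_deref())
}
```

Because ownership stays with `tree`, `root_key` can be called repeatedly on the same value.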

**Figure 7.** Error Correction Example. Adjacent paper text: “… tests for studied projects with the help from automatic generation tools (e.g., fuzzing and LLMs). To control the manual effort, we focus on C projects with fewer than 4,000 lines of code. As shown in Table II, our constructed tests achieve high coverage with 81.4% - 97.7% line coverage and 92.9% - 100.0% function coverage, ensuring a comprehensive functional equivalence evaluati…”

Original abstract

Translating C code into safe Rust is an effective way to ensure memory safety. Compared to rule-based approaches, which often produce largely unsafe Rust code, LLM-based methods generate more idiomatic and safer Rust by leveraging extensive training on human-written code. Despite their promise, existing LLM-based approaches still struggle with project-level C-to-Rust translation. They typically partition a C project into smaller units (e.g., functions) based on call graphs and translate them in a bottom-up manner to resolve dependencies. However, this unit-by-unit paradigm often fails to handle pointers due to the lack of a global view of their usage. To address this limitation, we propose a novel C-to-Rust Pointer Knowledge Graph (KG) that augments code dependency graphs with two types of pointer semantics: (i) pointer usage information, which captures global behaviors such as points-to flows and lifts low-level struct interactions to higher-level abstractions; and (ii) Rust-oriented annotations, which encode ownership, mutability, nullability, and lifetime. Building on this KG, we further propose PtrTrans, a project-level C-to-Rust translation approach. In PtrTrans, the KG provides LLMs with comprehensive global pointer semantics, guiding them to generate safe and idiomatic Rust code. Experimental results show that PtrTrans reduces unsafe usages in translated Rust by 99.9% compared to both rule-based and conventional LLM-based methods, while achieving 29.3% higher functional correctness than fuzzing-enhanced LLM approaches.
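The bottom-up, call-graph-driven ordering that the abstract attributes to existing approaches can be illustrated with a post-order traversal, in which every callee is emitted before its callers. This is a generic sketch, not the scheduling used by any particular tool; the function name `bottom_up_order` and the example call graph are assumptions, and recursion cycles are simply cut at the first revisit.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Post-order DFS over a call graph: each callee appears before its callers.
/// Cycles (recursion) are cut at the first revisit, which is a simplification.
pub fn bottom_up_order(calls: &BTreeMap<&str, Vec<&str>>) -> Vec<String> {
    fn visit(
        f: &str,
        calls: &BTreeMap<&str, Vec<&str>>,
        seen: &mut BTreeSet<String>,
        out: &mut Vec<String>,
    ) {
        if !seen.insert(f.to_string()) {
            return;
        }
        if let Some(callees) = calls.get(f) {
            for &c in callees {
                visit(c, calls, seen, out);
            }
        }
        out.push(f.to_string());
    }

    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    for &f in calls.keys() {
        visit(f, calls, &mut seen, &mut out);
    }
    out
}
```

An ordering like this resolves declaration dependencies, but it says nothing about how a pointer is used by functions translated later, which is the gap the abstract says the Pointer KG is meant to fill.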

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The Pointer KG is a sensible new structure for global pointer context in C-to-Rust translation, but the abstract's big performance claims rest on unspecified construction details and thin evaluation evidence.

The paper's main contribution is a Pointer Knowledge Graph that adds pointer usage flows, lifted struct relations, and Rust annotations for ownership and lifetimes to ordinary dependency graphs. This gives the LLM a project-wide view instead of translating function by function from the bottom up. That framing directly targets a known weakness in current LLM-based C-to-Rust work, and the idea itself is cleanly described in the abstract.

Referee Report

3 major / 2 minor

Summary. The manuscript presents PtrTrans, a project-level C-to-Rust translation approach that constructs a Pointer Knowledge Graph (KG) to capture global pointer semantics such as points-to flows, struct abstractions, ownership, mutability, nullability, and lifetimes. This KG is used to guide LLMs in generating safe and functionally correct Rust code from C projects. The authors claim that PtrTrans achieves a 99.9% reduction in unsafe Rust usages compared to rule-based and standard LLM methods, and a 29.3% improvement in functional correctness over fuzzing-enhanced LLM approaches.

Significance. If the experimental claims hold and the KG construction proves robust, this work could significantly advance automated translation of legacy C code to memory-safe Rust by providing LLMs with structured global pointer information at the project level. This addresses a key limitation in existing unit-by-unit translation methods. The approach combines static analysis with LLM prompting in a novel way that may reduce manual fixes needed for safety and correctness.

major comments (3)

[Abstract] Abstract: The abstract reports 99.9% reduction in unsafe usages and 29.3% higher functional correctness, but provides no details on the datasets, projects used, baseline implementations, or how statistical significance was determined. This information is essential to evaluate the central experimental claims.
[Section 3] Section 3 (Pointer Knowledge Graph construction): The method for building the KG via static analysis to extract points-to flows, ownership, and lifetimes is not specified in detail, including handling of undecidable C constructs such as void*, unions, and function pointers. Since the central claim depends on the KG supplying accurate global semantics to the LLM, this omission is load-bearing.
[Section 5] Section 5 (Experimental results): The evaluation lacks description of how functional correctness was measured (e.g., test suites or fuzzing protocols) and how unsafe code usages were quantified across projects, making it impossible to verify the reported gains or rule out benchmark-specific artifacts.

minor comments (2)

[Introduction] The paper would benefit from an explicit definition of 'unsafe usages' and 'functional correctness' early in the text to avoid ambiguity in later sections.
[Figure 2] Figure captions for the KG diagrams could include more detail on the node and edge types to improve readability for readers unfamiliar with the specific abstractions.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We respond to each major comment below and indicate the revisions we will make to enhance clarity and completeness.

Referee: [Abstract] Abstract: The abstract reports 99.9% reduction in unsafe usages and 29.3% higher functional correctness, but provides no details on the datasets, projects used, baseline implementations, or how statistical significance was determined. This information is essential to evaluate the central experimental claims.

Authors: We agree that the abstract is concise by design and therefore omits granular experimental details. The datasets, specific C projects, baseline implementations, and evaluation methodology (including how statistical significance was assessed) are fully described in Section 5. To improve reader accessibility, we will revise the abstract to include a brief reference to the evaluation benchmarks and projects.
Revision: yes

Referee: [Section 3] Section 3 (Pointer Knowledge Graph construction): The method for building the KG via static analysis to extract points-to flows, ownership, and lifetimes is not specified in detail, including handling of undecidable C constructs such as void*, unions, and function pointers. Since the central claim depends on the KG supplying accurate global semantics to the LLM, this omission is load-bearing.

Authors: Section 3 outlines the static analysis pipeline used to construct the Pointer Knowledge Graph, including extraction of points-to flows and Rust-oriented annotations. We acknowledge that additional detail on the treatment of undecidable constructs would strengthen the presentation. We will expand Section 3 with explicit discussion of our handling of void*, unions, and function pointers, including the conservative approximations employed where precise analysis is undecidable.
Revision: yes

Referee: [Section 5] Section 5 (Experimental results): The evaluation lacks description of how functional correctness was measured (e.g., test suites or fuzzing protocols) and how unsafe code usages were quantified across projects, making it impossible to verify the reported gains or rule out benchmark-specific artifacts.

Authors: Section 5 describes the functional correctness evaluation, which relies on project-provided test suites supplemented by fuzzing protocols, and quantifies unsafe usages via counts of unsafe blocks and raw pointer operations in the generated Rust code. We will revise Section 5 to make these measurement procedures more explicit, including additional examples of the test suites and quantification process, to facilitate independent verification.
Revision: yes
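As a rough illustration of what counting unsafe usages might involve, the sketch below tallies `unsafe` keywords and raw-pointer type markers in Rust source text. It is deliberately crude and is not the paper's measurement procedure: the function names are assumptions, comments and string literals are not skipped, and a faithful count would need a real Rust parser.

```rust
/// Count whole-word occurrences of `unsafe` in Rust source text.
/// Crude by design: comments and string literals are not excluded.
pub fn count_unsafe_keywords(src: &str) -> usize {
    src.split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|tok| *tok == "unsafe")
        .count()
}

/// Count raw-pointer type markers (`*const ` and `*mut `), equally crude.
pub fn count_raw_pointer_types(src: &str) -> usize {
    src.matches("*const ").count() + src.matches("*mut ").count()
}
```

Identifiers that merely contain the word, such as `unsafe_name`, are not counted because underscores are treated as part of a token.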

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper proposes constructing a Pointer Knowledge Graph to augment dependency graphs with global pointer semantics (points-to flows, ownership, mutability, nullability, lifetimes) and then uses this KG to prompt LLMs for project-level C-to-Rust translation. No equations, fitted parameters, self-citations, or uniqueness theorems are described that would reduce any claimed result or prediction back to the inputs by construction. The experimental claims of 99.9% unsafe reduction and 29.3% correctness gain are presented as evaluation outcomes on benchmarks rather than tautological redefinitions of the method itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim depends on the LLM being able to interpret and act on the supplied global pointer semantics; the knowledge graph itself is an invented construct whose accuracy is not independently verified in the abstract.

axioms (1)

Domain assumption: Large language models can reliably translate C to safe Rust when supplied with explicit global pointer usage and Rust-oriented annotations.
The method assumes LLMs will correctly apply the graph information to avoid unsafe patterns while preserving functionality.

invented entities (1)

Pointer Knowledge Graph (no independent evidence)
purpose: Augment code dependency graphs with points-to flows, struct abstractions, and Rust annotations for ownership, mutability, nullability, and lifetime.
This graph is introduced to solve the lack of global pointer view in bottom-up translation.

pith-pipeline@v0.9.0 · 5812 in / 1344 out tokens · 43575 ms · 2026-05-18T08:22:53.598681+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

Relation between the paper passage and the cited Recognition theorem.

“we propose a novel C-to-Rust Pointer Knowledge Graph (KG) that augments code dependency graphs with two types of pointer semantics: (i) pointer usage information, which captures global behaviors such as points-to flows...”
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear

Relation between the paper passage and the cited Recognition theorem.

Rust-oriented annotations, which encode ownership, mutability, nullability, and lifetime

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ReCodeAgent: A Multi-Agent Workflow for Language-agnostic Translation and Validation of Large-scale Repositories
cs.SE 2026-04 unverdicted novelty 7.0

ReCodeAgent uses a multi-agent system to translate and validate large code repositories across multiple programming languages, achieving 60.8% higher test pass rates than prior neuro-symbolic and agentic methods on 11...
CodePivot: Bootstrapping Multilingual Transpilation in LLMs via Reinforcement Learning without Parallel Corpora
cs.SE 2026-04 unverdicted novelty 6.0

CodePivot uses Python as a pivot language plus an Aggressive-Partial-Functional RL reward to train a 7B model that outperforms much larger LLMs on multilingual code transpilation without parallel corpora.

Reference graph

Works this paper leans on

80 extracted references · 80 canonical work pages · cited by 2 Pith papers · 7 internal anchors

[1] V. Yodaiken, “How ISO C became unusable for operating systems development,” CoRR, vol. abs/2201.07845, 2022. [Online]. Available: https://arxiv.org/abs/2201.07845
[2] R. Patel and A. Rajawat, “A survey of embedded software profiling methodologies,” CoRR, vol. abs/1312.2949, 2013. [Online]. Available: http://arxiv.org/abs/1312.2949
[3] J. Fan, Y. Li, S. Wang, and T. N. Nguyen, “A C/C++ code vulnerability dataset with code changes and CVE summaries,” in Proceedings of the 17th International Conference on Mining Software Repositories, 2020, pp. 508–512.
[4] Z. Hanley, “Rust won’t save us: An analysis of 2023’s known exploited vulnerabilities,” 2023.
[5]–[6] S. Esmaeilsabzali, A. Khalatyan, Z. Mo, S. Venkatanarayanan, and S. Xu, “AEGIS: towards formalized and practical memory-safe execution of C programs via MSWASM,” CoRR, vol. abs/2503.03698. [Online]. Available: https://doi.org/10.48550/arXiv.2503.03698
[7] X. Zheng, Z. Wan, Y. Zhang, R. Chang, and D. Lo, “A closer look at the security risks in the Rust ecosystem,” ACM Trans. Softw. Eng. Methodol., vol. 33, no. 2, pp. 34:1–34:30, 2024. [Online]. Available: https://doi.org/10.1145/3624738
[8] H. Xu, Z. Chen, M. Sun, Y. Zhou, and M. R. Lyu, “Memory-safety challenge considered solved? An in-depth study with all Rust CVEs,” ACM Trans. Softw. Eng. Methodol., vol. 31, no. 1, pp. 3:1–3:25, 2022. [Online]. Available: https://doi.org/10.1145/3466642
[9] Y. Zhang, A. Kundu, G. Portokalidis, and J. Xu, “On the dual nature of necessity in use of Rust unsafe code,” in Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2023, San Francisco, CA, USA, December 3-9, 2023, S. Chandra, K. Blincoe, and P. Tonella, Eds. ACM, 20... doi:10.1145/3611643.3613878
[10] N. D. Matsakis and F. S. Klock, “The Rust language,” in Proceedings of the 2014 ACM SIGAda Annual Conference on High Integrity Language Technology, 2014, pp. 103–104.
[11] Rust for Linux. [Online]. Available: https://github.com/Rust-for-Linux/linux
[12] c2rust, 2025. [Online]. Available: https://github.com/immunant/c2rust
[13] H. Zhang, C. David, Y. Yu, and M. Wang, “Ownership guided C to Rust translation,” in Computer Aided Verification - 35th International Conference, CAV 2023, Paris, France, July 17-22, 2023, Proceedings, Part III, ser. Lecture Notes in Computer Science, vol. 13966. Springer, 2023, pp. 459–482. [Online]. Available: https://doi.org/10.1007/978-3-031-37709-9_22
[14] citrus, 2025. [Online]. Available: https://gitlab.com/citrus-rs/citrus#citrus-convert-c-to-rust
[15] M. Emre, R. Schroeder, K. Dewey, and B. Hardekopf, “Translating C to safer Rust,” Proc. ACM Program. Lang., vol. 5, no. OOPSLA, pp. 1–29, 2021. [Online]. Available: https://doi.org/10.1145/3485498
[16] M. Emre, P. Boyland, A. Parekh, R. Schroeder, K. Dewey, and B. Hardekopf, “Aliasing limits on translating C to safe Rust,” Proc. ACM Program. Lang., vol. 7, no. OOPSLA1, pp. 551–579, 2023. [Online]. Available: https://doi.org/10.1145/3586046
[17] X. Yin, Z. Huang, S. Kan, and G. Shen, “Safemd: Ownership-based safe memory deallocation for C programs,” 2024.
[18] Y. Gao, C. Wang, P. Huang, X. Liu, M. Zheng, and X. Zhang, “PR2: peephole raw pointer rewriting with LLMs for translating C to safer Rust,” CoRR, vol. abs/2505.04852, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2505.04852
[19] V. Nitin, R. Krishna, L. L. do Valle, and B. Ray, “C2saferrust: Transforming C projects into safer Rust with neurosymbolic techniques,” CoRR, vol. abs/2501.14257, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2501.14257
[20] J. Hong and S. Ryu, “Don’t write, but return: Replacing output parameters with algebraic data types in C-to-Rust translation,” Proc. ACM Program. Lang., vol. 8, no. PLDI, pp. 716–740, 2024. [Online]. Available: https://doi.org/10.1145/3656406
[21] H. F. Eniser, H. Zhang, C. David, M. Wang, M. Christakis, B. Paulsen, J. Dodds, and D. Kroening, “Towards translating real-world code with LLMs: A study of translating to Rust,” CoRR, vol. abs/2405.11514, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2405.11514
[22] A. Z. H. Yang, Y. Takashima, B. Paulsen, J. Dodds, and D. Kroening, “VERT: verified equivalent Rust transpilation with few-shot learning,” CoRR, vol. abs/2404.18852, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2404.18852
[23] M. Shetty, N. Jain, A. Godbole, S. A. Seshia, and K. Sen, “Syzygy: Dual code-test C to (safe) Rust translation using LLMs and dynamic analysis,” CoRR, vol. abs/2412.14234, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2412.14234
[24] X. Cai, J. Liu, X. Huang, Y. Yu, H. Wu, C. Li, B. Wang, I. N. B. Yusuf, and L. Jiang, “Rustmap: Towards project-scale C-to-Rust migration via program analysis and LLM,” CoRR, vol. abs/2503.17741, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2503.17741
[25] M. Shiraishi and T. Shinagawa, “Context-aware code segmentation for C-to-Rust translation using large language models,” CoRR, vol. abs/2409.10506, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2409.10506
[26] J. Hong and S. Ryu, “Type-migrating C-to-Rust translation using a large language model,” Empir. Softw. Eng., vol. 30, no. 1, p. 3, 2025. [Online]. Available: https://doi.org/10.1007/s10664-024-10573-2
[27] Quadtree. [Online]. Available: https://github.com/thejefflarson/quadtree
[28] OpenAI. [Online]. Available: https://openai.com/
[29] R. Pan, A. R. Ibrahimzada, R. Krishna, D. Sankar, L. P. Wassi, M. Merler, B. Sobolev, R. Pavuluri, S. Sinha, and R. Jabbarvand, “Lost in translation: A study of bugs introduced by large language models while translating code,” in Proceedings of the 46th IEEE/ACM International Conference on Software Engineering, ICSE 2024, Lisbon, Portugal, April 14-20, 202... doi:10.1145/3597503.3639226
[30] Z. Yang, F. Liu, Z. Yu, J. W. Keung, J. Li, S. Liu, Y. Hong, X. Ma, Z. Jin, and G. Li, “Exploring and unleashing the power of large language models in automated code translation,” Proc. ACM Softw. Eng., vol. 1, no. FSE, pp. 1585–1608, 2024. [Online]. Available: https://doi.org/10.1145/3660778
[31] A. R. Ibrahimzada, K. Ke, M. Pawagi, M. S. Abid, R. Pan, S. Sinha, and R. Jabbarvand, “Alphatrans: A neuro-symbolic compositional approach for repository-level code translation and validation,” Proc. ACM Softw. Eng., vol. 2, no. FSE, pp. 2454–2476, 2025. [Online]. Available: https://doi.org/10.1145/3729379
[32] Z. Guan, X. Yin, Z. Peng, and C. Ni, “Repotransagent: Multi-agent LLM framework for repository-aware code translation,” 2025. [Online]. Available: https://arxiv.org/abs/2508.17720
[33] H. Zhang, C. David, M. Wang, B. Paulsen, and D. Kroening, “Scalable, validated code translation of entire projects using large language models,” CoRR, vol. abs/2412.08035, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2412.08035
[34] B. Rozière, M. Lachaux, L. Chanussot, and G. Lample, “Unsupervised translation of programming languages,” in Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, 2020.
[35] M. Bhattarai, J. E. Santos, S. Jones, A. Biswas, B. Alexandrov, and D. O’Malley, “Enhancing code translation in language models with few-shot learning via retrieval-augmented generation,” in 2024 IEEE High Performance Extreme Computing Conference (HPEC), 2024, pp. 1–8.
[36] A. Fromherz and J. Protzenko, “Compiling C to safe Rust, formalized,” CoRR, vol. abs/2412.15042, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2412.15042
[37] P. Larsen, “Migrating C to Rust for memory safety,” IEEE Secur. Priv., vol. 22, no. 4, pp. 22–29, 2024. [Online]. Available: https://doi.org/10.1109/MSEC.2024.3385357
[38] Corrode, 2017. [Online]. Available: https://github.com/jameysharp/corrode
[39] M. Ling, Y. Yu, H. Wu, Y. Wang, J. R. Cordy, and A. E. Hassan, “In Rust we trust - A transpiler from unsafe C to safer Rust,” in 44th IEEE/ACM International Conference on Software Engineering: Companion Proceedings, ICSE Companion 2022, Pittsburgh, PA, USA, May 22-24, 2022. ACM/IEEE, 2022, pp. 354–355. [Online]. Available: https://doi.org/10.1145/351045... doi:10.1145/3510454.3528640
[40] J. Hong and S. Ryu, “To tag, or not to tag: Translating C’s unions to Rust’s tagged unions,” in Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, ASE 2024, Sacramento, CA, USA, October 27 - November 1, 2024, V. Filkov, B. Ray, and M. Zhou, Eds. ACM, 2024, pp. 40–52. [Online]. Available: https://doi.org/10.1145/36... doi:10.1145/3691620.3694985
[41]

Forcrat: Automatic I/O API translation from C to rust via origin and capability analysis,

——, “Forcrat: Automatic I/O API translation from C to rust via origin and capability analysis,”CoRR, vol. abs/2506.01427, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2506.01427

work page doi:10.48550/arxiv.2506.01427 2025
[42]

Crust-bench: A comprehensive benchmark for c-to-safe-rust transpilation,

A. Khatry, R. Zhang, J. Pan, Z. Wang, Q. Chen, G. Durrett, and I. Dillig, “Crust-bench: A comprehensive benchmark for c-to-safe-rust transpilation,”CoRR, vol. abs/2504.15254, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2504.15254

work page doi:10.48550/arxiv.2504.15254 2025
[43] R. Li, B. Wang, T. Li, P. Saxena, and A. Kundu, “Translating C to rust: Lessons from a user study,” in 32nd Annual Network and Distributed System Security Symposium, NDSS 2025, San Diego, California, USA, February 24-28, 2025. The Internet Society, 2025. [Online]. Available: https://www.ndss-symposium.org/ndss-paper/translating-c-to-rust-lessons-from-a-user-study/
[45] H. Zhou, Y. Luo, M. Zhang, and D. Xu, “C2rusttv: An llm-based framework for c to rust translation and validation,” in 2025 IEEE 49th Annual Computers, Software, and Applications Conference (COMPSAC), 2025, pp. 1254–1259.
[46] T. Zhou, H. Lin, S. Jha, M. Christodorescu, K. Levchenko, and V. Chandrasekaran, “Llm-driven multi-step translation from C to rust using static analysis,” CoRR, vol. abs/2503.12511, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2503.12511
[47] H. Sim, H. Cho, Y. Go, Z. Fu, A. Shokri, and B. Ravindran, “Large language model-powered agent for C to rust code translation,” CoRR, vol. abs/2505.15858, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2505.15858
[48] M. Shiraishi and T. Shinagawa, “Toward llm-based large-scale c-to-rust code translation.”
[49] Z. Yuan, J. Liu, Q. Zi, M. Liu, X. Peng, and Y. Lou, “Evaluating instruction-tuned large language models on code comprehension and generation,” CoRR, vol. abs/2308.01240, 2023. [Online]. Available: https://doi.org/10.48550/arXiv.2308.01240
[50] Y. Dong, X. Jiang, Z. Jin, and G. Li, “Self-collaboration code generation via chatgpt,” ACM Trans. Softw. Eng. Methodol., vol. 33, no. 7, pp. 189:1–189:38, 2024. [Online]. Available: https://doi.org/10.1145/3672459
[51] Z. Yuan, M. Liu, S. Ding, K. Wang, Y. Chen, X. Peng, and Y. Lou, “Evaluating and improving chatgpt for unit test generation,” Proc. ACM Softw. Eng., vol. 1, no. FSE, pp. 1703–1726, 2024. [Online]. Available: https://doi.org/10.1145/3660783
[52] Z. Fan, X. Gao, M. Mirchev, A. Roychoudhury, and S. H. Tan, “Automated repair of programs from large language models,” in 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023, Melbourne, Australia, May 14-20, 2023. IEEE, 2023, pp. 1469–1481. [Online]. Available: https://doi.org/10.1109/ICSE48619.2023.00128
[53] C. S. Xia, Y. Wei, and L. Zhang, “Automated program repair in the era of large pre-trained language models,” in 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023, Melbourne, Australia, May 14-20, 2023. IEEE, 2023, pp. 1482–1494. [Online]. Available: https://doi.org/10.1109/ICSE48619.2023.00129
[54] I. Bouzenia, P. T. Devanbu, and M. Pradel, “Repairagent: An autonomous, llm-based agent for program repair,” in 47th IEEE/ACM International Conference on Software Engineering, ICSE 2025, Ottawa, ON, Canada, April 26 - May 6, 2025. IEEE, 2025, pp. 2188–2200. [Online]. Available: https://doi.org/10.1109/ICSE55347.2025.00157
[55] Z. Yuan, W. Chen, H. Wang, K. Yu, X. Peng, and Y. Lou, “TRANSAGENT: an llm-based multi-agent system for code translation,” CoRR, vol. abs/2409.19894, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2409.19894
[56] G. Ou, M. Liu, Y. Chen, X. Du, S. Wang, Z. Zhang, X. Peng, and Z. Zheng, “Enhancing llm-based code translation in repository context via triple knowledge-augmented,” CoRR, vol. abs/2503.18305, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2503.18305
[57] G. Ou, M. Liu, Y. Chen, X. Peng, and Z. Zheng, “Repository-level code translation benchmark targeting rust,” CoRR, vol. abs/2411.13990, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2411.13990
[59] M. Farrukh, S. Shah, B. Coskun, and M. Polychronakis, “Safetrans: Llm-assisted transpilation from C to rust,” CoRR, vol. abs/2505.10708, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2505.10708
[61] V. Nitin and B. Ray, “Spectra: Enhancing the code translation ability of language models by generating multi-modal specifications,” CoRR, vol. abs/2405.18574, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2405.18574
[62] J. H. Saltzer and M. D. Schroeder, “The protection of information in computer systems,” Proc. IEEE, vol. 63, no. 9, pp. 1278–1308, 1975. [Online]. Available: https://doi.org/10.1109/PROC.1975.9939
[63] libtree. [Online]. Available: https://github.com/haampie/libtree/blob/master/libtree.c#L191
[64] R. E. Tarjan, “Depth-first search and linear graph algorithms,” SIAM J. Comput., vol. 1, no. 2, pp. 146–160, 1972. [Online]. Available: https://doi.org/10.1137/0201010
[65] quadtree point free. [Online]. Available: https://github.com/thejefflarson/quadtree/blob/master/src/point.c#L14
[66] bzip2. [Online]. Available: https://github.com/commontk/bzip2/blob/master
[67] libcsv. [Online]. Available: https://github.com/rgamble/libcsv/tree/master
[68] robotfindskitten. [Online]. Available: https://github.com/robotfindskitten/robotfindskitten
[69] Clippy. [Online]. Available: https://github.com/rust-lang/rust-clippy
[70] cargo geiger. [Online]. Available: https://docs.rs/crate/cargo-geiger/latest
[71] Doxygen. [Online]. Available: https://doxygen.nl/
[72] SVF. [Online]. Available: https://github.com/SVF-tools/SVF
[73] S. Almanasra and K. Suwais, “Analysis of chatgpt-generated codes across multiple programming languages,” IEEE Access, vol. 13, pp. 23 580–23 596, 2025. [Online]. Available: https://doi.org/10.1109/ACCESS.2025.3538050
[74] R. Liu, A. Frade, A. Vaidya, M. Labonne, M. Kaiser, B. Chakrabarti, J. Budd, and S. J. Moran, “On iterative evaluation and enhancement of code quality using gpt-4o,” CoRR, vol. abs/2502.07399, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2502.07399
[75] A. Mashaal, O. Helmy, O. Ashor, A. T. Mahmoud, and Walaa Medhat, “Evaluation of gpt 4o for mobile applications code conversion,” in 2024 12th International Japan-Africa Conference on Electronics, Communications, and Computations (JAC-ECC), 2024, pp. 219–224.
[76] Y. Wang, Y. Wang, S. Wang, D. Guo, J. Chen, J. C. Grundy, X. Liu, Y. Ma, M. Mao, H. Zhang, and Z. Zheng, “Repotransbench: A real-world benchmark for repository-level code translation,” CoRR, vol. abs/2412.17744, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2412.17744
[77] Claude. [Online]. Available: https://www.anthropic.com/index/introducing-claude
[78] Gemini. [Online]. Available: https://blog.google/technology/ai/google-gemini-ai/
[79] W. Yan, Y. Tian, Y. Li, Q. Chen, and W. Wang, “Codetransocean: A comprehensive multilingual benchmark for code translation,” in Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023, H. Bouamor, J. Pino, and K. Bali, Eds. Association for Computational Linguistics, 2023, pp. 5067–5089. [Online]. Available: ...
[80] S. Ren, D. Guo, S. Lu, L. Zhou, S. Liu, D. Tang, N. Sundaresan, M. Zhou, A. Blanco, and S. Ma, “Codebleu: a method for automatic evaluation of code synthesis,” CoRR, vol. abs/2009.10297, 2020. [Online]. Available: https://arxiv.org/abs/2009.10297