arxiv: 2604.12048 · v1 · submitted 2026-04-13 · 💻 cs.SE

ORBIT: Guided Agentic Orchestration for Autonomous C-to-Rust Transpilation

Muhammad Farrukh, Baris Coskun, Tapti Palit, Michalis Polychronakis

Pith reviewed 2026-05-10 15:14 UTC · model grok-4.3

classification 💻 cs.SE

keywords: agentic orchestration · C-to-Rust transpilation · dynamic context collection · dependency-guided translation · iterative verification · autonomous code migration · project-level translation · memory safety

The pith

Guided agentic orchestration with dynamic context collection enables autonomous project-level C-to-Rust translation achieving full compilation success.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper seeks to establish that an autonomous agentic framework using dynamic context collection, dependency-guided orchestration, and iterative verification can reliably translate large C codebases to Rust. A sympathetic reader would care because prior methods using static context or off-the-shelf agents often fail on complex cross-module dependencies and produce incomplete or hallucinated translations. If the claim holds, it opens a path to automated, large-scale migration of legacy C systems to Rust, improving memory safety across software infrastructure. The paper supports this with an evaluation on 24 CRUST-Bench programs, most of them over 1,000 lines of code, where the system succeeds without manual intervention in most cases.

Core claim

The paper claims that, by constructing a dependency-aware translation graph and coordinating specialized agents to collect context dynamically, generate Rust interfaces, map functions, and verify iteratively, the framework achieves 100 percent compilation success and 91.7 percent test success on 24 programs, 91.7 percent of which exceed 1,000 lines of code, while reducing unsafe Rust code blocks to nearly zero.
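As a quick sanity check on these percentages, the sketch below asks which whole-number counts out of 24 round to the reported figures. It assumes each percentage is a fraction of the 24 programs rounded to one decimal place; the helper name `matching_counts` is illustrative and not from the paper.

```python
# Sanity check: which whole-number count out of 24 rounds to a reported percentage?
TOTAL_PROGRAMS = 24  # stated in the claim


def matching_counts(total: int, reported_pct: float) -> list[int]:
    """Return every k in 0..total whose percentage rounds to reported_pct (one decimal)."""
    return [k for k in range(total + 1) if round(100 * k / total, 1) == reported_pct]


print(matching_counts(TOTAL_PROGRAMS, 91.7))   # [22]
print(matching_counts(TOTAL_PROGRAMS, 100.0))  # [24]
```

Under that assumption, 91.7 percent corresponds to 22 of 24 programs, which is consistent with the rebuttal's mention of two programs that fell short of full test success.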

What carries the argument

The dependency-aware translation graph that guides the orchestration of multiple specialized agents for dynamic context curation, interface generation, function mapping, and iterative verification.

If this is right

Translations of programs with complex dependencies can proceed autonomously without static context breakdowns.
The resulting Rust code contains nearly zero unsafe blocks.
The approach performs equally in settings with expert-provided interfaces and with automatically generated interfaces.
High success rates hold for programs exceeding one thousand lines of code.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The orchestration method could extend to automated translations between other pairs of programming languages.
Widespread adoption might speed up replacement of legacy unsafe code with memory-safe alternatives in large systems.
Additional specialization of agents could further reduce errors in dependency navigation for even larger codebases.

Load-bearing premise

Dynamic context collection paired with dependency-guided orchestration and iterative verification will handle intricate cross-module dependencies to yield complete and correct translations without requiring manual fixes or allowing undetected errors.

What would settle it

Applying the framework to a program with highly interconnected modules and checking whether it produces fully compilable and test-passing Rust translations without manual corrections or hidden issues would confirm or refute the central claim.

Figures

Figures reproduced from arXiv: 2604.12048 by Baris Coskun, Michalis Polychronakis, Muhammad Farrukh, Tapti Palit.

**Figure 1.** Overview of ORBIT. Excerpt from the accompanying text: "… functions within the same source or header file (module); and (b) an Inter-Module Graph, which tracks high-level dependencies between Module Groups, defined as logical groupings of related .c and .h files. To ensure that foundational entities are processed before those that depend on them (a requirement for both correct translation order and successful compilation) ORBIT applies Kahn…"
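The excerpt breaks off at "Kahn…". Assuming it refers to Kahn's topological sorting algorithm, the ordering step can be sketched as follows. This is a generic illustration, not ORBIT's implementation; the function name `kahn_order` and the module-group names are hypothetical.

```python
from collections import deque


def kahn_order(deps: dict[str, set[str]]) -> list[str]:
    """Order nodes so each appears after everything it depends on.

    deps maps a node to the set of nodes it depends on.
    Raises ValueError if the graph contains a cycle.
    """
    # Collect every node, including ones that only appear as dependencies.
    nodes = set(deps) | {d for ds in deps.values() for d in ds}
    remaining = {n: len(deps.get(n, ())) for n in nodes}  # unmet dependency counts
    dependents: dict[str, list[str]] = {n: [] for n in nodes}
    for node, ds in deps.items():
        for d in ds:
            dependents[d].append(node)

    # Sorted seeding and iteration keep the output deterministic.
    ready = deque(sorted(n for n, count in remaining.items() if count == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in sorted(dependents[node]):
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(nodes):
        raise ValueError("dependency cycle detected")
    return order


# Hypothetical module groups, not taken from the paper.
example = {"cli": {"parser", "util"}, "parser": {"util"}, "util": set()}
print(kahn_order(example))  # ['util', 'parser', 'cli']
```

Foundational units such as `util` come out first, so anything translated later can already refer to their Rust counterparts. A cycle raises an error here; how ORBIT treats cyclic dependencies is not stated in the excerpt.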

**Figure 2.** Two-tier function mapping strategy in ORBIT. Excerpt from the accompanying text: "… counterparts, returning a structured JSON response identifying the matched rust_module and rust_function. In cases where no Rust equivalent exists, such as C deallocator functions that have no counterpart in Rust due to its built-in ownership and deallocation model, the Mapping Agent assigns null to the fields. Since LLMs are inherently prone to hallucination, e…"
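A minimal sketch of how a consumer might read one such mapping response. The field names `rust_module` and `rust_function` and the null convention come from the excerpt; the function `parse_mapping`, the `known` interface table, and the check against it are assumptions added for illustration, since the excerpt is cut off before it says how hallucinated mappings are handled.

```python
import json
from typing import Optional


def parse_mapping(raw: str, known: dict[str, set[str]]) -> Optional[tuple[str, str]]:
    """Parse one mapping response.

    Returns (rust_module, rust_function), or None when both fields are null
    (no Rust counterpart). Raises ValueError for malformed or unknown targets.
    `known` maps Rust module names to the function names they define.
    """
    data = json.loads(raw)  # invalid JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("mapping response must be a JSON object")
    module, function = data.get("rust_module"), data.get("rust_function")
    if module is None and function is None:
        return None  # e.g. a C deallocator with no Rust equivalent
    if not isinstance(module, str) or not isinstance(function, str):
        raise ValueError("fields must both be strings or both be null")
    # Illustrative guard (an assumption): reject names absent from the interface.
    if function not in known.get(module, set()):
        raise ValueError(f"unknown target {module}::{function}")
    return module, function


# Hypothetical interface table and responses.
known = {"list": {"push", "pop"}}
print(parse_mapping('{"rust_module": "list", "rust_function": "push"}', known))
print(parse_mapping('{"rust_module": null, "rust_function": null}', known))
```

The first call prints the matched pair and the second prints `None`, which mirrors the two cases the excerpt describes.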

Original abstract

Large-scale migration of legacy C code to Rust offers a promising path toward improving memory safety, but LLM-based C-to-Rust translation remains challenging due to limited context windows and hallucinations. Prior approaches are evaluated primarily on small programs or datasets skewed toward small codebases, providing limited insight into scalability on real-world systems. They also rely on static context construction, which breaks down in the presence of complex cross-module dependencies and often requires manual intervention. Recent coding agents offer a promising alternative through dynamic codebase navigation and context curation. When used out of the box, however, they frequently produce incomplete translations that appear superficially correct. We present ORBIT, an autonomous agentic framework for project-level C-to-Rust translation that combines dynamic context collection with dependency-guided orchestration and iterative verification. ORBIT constructs a dependency-aware translation graph, generates Rust interfaces, maps C functions to Rust counterparts, and coordinates multiple specialized agents. We evaluate ORBIT on 24 programs from CRUST-Bench, with 91.7% of the programs exceeding 1,000 lines of code. ORBIT achieves 100% compilation success and 91.7% test success in both expert-interface and automatically generated-interface settings, substantially outperforming C2Rust and CRUST-Bench, while reducing unsafe Rust code blocks to nearly zero. We further evaluate ORBIT on challenging cases from the DARPA TRACTOR benchmark, where it achieves competitive performance relative to participating systems.
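The abstract does not spell out how iterative verification works. As a generic illustration of the idea, the sketch below retries a translation while feeding verifier feedback into the next attempt. The callables `translate` and `verify` are hypothetical stand-ins (for an LLM agent and for something like compiling and running tests); nothing here reflects ORBIT's actual agents, prompts, or stopping rules.

```python
from typing import Callable, Optional


def translate_with_verification(
    c_source: str,
    translate: Callable[[str, Optional[str]], str],
    verify: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> tuple[Optional[str], int]:
    """Retry translation, passing verifier feedback into the next attempt.

    translate(c_source, feedback) returns candidate Rust code.
    verify(rust_code) returns None on success or an error message.
    Returns (accepted_code_or_None, rounds_used).
    """
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        candidate = translate(c_source, feedback)
        feedback = verify(candidate)
        if feedback is None:
            return candidate, round_no
    return None, max_rounds


# Toy stand-ins: the first attempt fails verification, the second is accepted.
def toy_translate(src: str, feedback: Optional[str]) -> str:
    if feedback:
        return "fn add(a: i32, b: i32) -> i32 { a + b }"
    return "fn add(a, b) { a + b }"


def toy_verify(code: str) -> Optional[str]:
    return None if "i32" in code else "error: missing type annotations"


result = translate_with_verification(
    "int add(int a, int b) { return a + b; }", toy_translate, toy_verify
)
print(result)
```

Bounding the loop with `max_rounds` is a design choice of this sketch: it makes a persistent failure visible as `None` rather than letting the loop run indefinitely.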

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ORBIT shows solid gains on mid-sized C-to-Rust projects via agent orchestration and a dependency graph, but the evaluation leaves the reliability of dynamic context collection under-examined.

ORBIT's main takeaway is that a multi-agent setup guided by a dependency-aware translation graph can push C-to-Rust translation to programs over 1,000 lines, with 100% compilation success and 91.7% test success on the CRUST-Bench set, plus competitive results on DARPA TRACTOR cases. It also cuts unsafe Rust blocks to near zero in both expert-interface and auto-generated-interface modes, beating C2Rust and prior CRUST-Bench baselines. That combination of dynamic context collection, iterative verification, and dual interface handling is the concrete advance over static-context methods that often need manual fixes for cross-module links.

The paper does a good job framing the scalability problem and reporting quantitative wins on established benchmarks rather than just small examples. The dual-mode evaluation is a useful addition that shows the approach can work with or without human-provided interfaces.

The soft spot is the missing detail on whether the dependency graph and dynamic collection actually catch everything in complex cases. The abstract gives no coverage metrics, per-program failure breakdowns, or analysis of how often an include, type, or call site gets missed, which is exactly where prior static tools break. Without that, it's possible the high success rates still hide incomplete translations that happen to pass the available tests. The baseline comparison also reports no statistical significance and does not detail how the baseline runs were matched.

This work is for researchers focused on LLM agents for code migration and software security. Anyone tracking practical memory-safety tools will get value from the benchmark numbers and the orchestration design. It deserves a serious referee because the results are grounded in real benchmarks and the approach is a clear, testable step beyond the cited priors, even if the methods will need tighter validation for reproducibility.

Referee Report

2 major / 1 minor

Summary. The paper introduces ORBIT, a guided agentic orchestration framework for autonomous project-level C-to-Rust transpilation. It addresses challenges of limited context windows and hallucinations in LLM-based translation by combining dynamic context collection, a dependency-aware translation graph, specialized agents for interface generation and function mapping, and iterative verification. The evaluation on 24 programs from CRUST-Bench, where 91.7% exceed 1,000 lines of code, reports 100% compilation success and 91.7% test success in both expert-interface and automatically generated-interface settings, outperforming C2Rust and CRUST-Bench while reducing unsafe Rust code blocks to nearly zero. Competitive performance is also shown on challenging cases from the DARPA TRACTOR benchmark.

Significance. Should the quantitative results be confirmed with additional validation, this work would offer a meaningful contribution to automated software migration tools, particularly for improving memory safety in legacy C codebases. The agentic approach with dependency-guided orchestration appears to overcome limitations of prior static and out-of-the-box agent methods on larger, multi-module programs. The dual evaluation settings (expert and auto interfaces) and focus on minimizing unsafe code are positive aspects. The evaluation on predominantly large programs provides more realistic insight than prior work limited to small examples.

major comments (2)

[Evaluation] Evaluation section: The central claims of 100% compilation success and 91.7% test success on the 24 CRUST-Bench programs lack details on test coverage, failure modes, statistical significance, or the exact comparison methodology with C2Rust and CRUST-Bench baselines. This information is necessary to substantiate the outperformance claims.
[Evaluation] Evaluation section: No quantitative metrics (e.g., dependency coverage rates or per-program analysis of cross-module elements captured) are reported for the dynamic context collection and dependency-aware translation graph. This mechanism is load-bearing for the success claims, as incomplete context on complex dependencies could produce code that compiles and passes available tests without being fully correct or complete.
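One way the dependency coverage rate requested above could be defined, purely as an illustration: the fraction of ground-truth cross-module references that also appear in the context the agents collected. The function `dependency_coverage`, both input sets, and every file and symbol name below are hypothetical; the paper reports no such metric.

```python
Reference = tuple[str, str]  # (source file, referenced symbol)


def dependency_coverage(
    required: set[Reference], collected: set[Reference]
) -> tuple[float, list[Reference]]:
    """Return (coverage in [0, 1], sorted list of missed references)."""
    if not required:
        return 1.0, []  # nothing to cover
    missed = sorted(required - collected)
    return (len(required) - len(missed)) / len(required), missed


# Hypothetical ground truth versus what the agents gathered.
required = {
    ("parser.c", "util_hash"),
    ("parser.c", "Token"),
    ("cli.c", "parse_args"),
    ("cli.c", "util_log"),
}
collected = {("parser.c", "util_hash"), ("parser.c", "Token"), ("cli.c", "parse_args")}

coverage, missed = dependency_coverage(required, collected)
print(coverage, missed)  # 0.75 [('cli.c', 'util_log')]
```

Reporting the missed references alongside the rate would show which includes, types, or call sites slipped through, not just how many.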

minor comments (1)

[Abstract] The abstract states that unsafe Rust code blocks are reduced to 'nearly zero' without providing exact counts, percentages, or comparison baselines; adding this precision would strengthen the presentation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and positive assessment of ORBIT's contributions to agentic C-to-Rust transpilation. We address each major comment below and will revise the manuscript accordingly to provide the requested details and strengthen the evaluation.

Referee: Evaluation section: The central claims of 100% compilation success and 91.7% test success on the 24 CRUST-Bench programs lack details on test coverage, failure modes, statistical significance, or the exact comparison methodology with C2Rust and CRUST-Bench baselines. This information is necessary to substantiate the outperformance claims.

Authors: We agree that additional details are needed to fully substantiate the claims. In the revised manuscript, we will expand the Evaluation section to include: (1) available test coverage information from the CRUST-Bench suites; (2) analysis of failure modes for the two programs that did not achieve 100% test success (while noting 100% compilation success across all cases); (3) explicit statement that statistical significance testing is not applicable for this fixed benchmark of 24 programs, with per-program results provided instead; and (4) a precise description of the baseline comparison methodology, including how C2Rust and CRUST-Bench were executed and evaluated on the same programs and test suites. These changes will be made in the next version.

revision: yes

Referee: Evaluation section: No quantitative metrics (e.g., dependency coverage rates or per-program analysis of cross-module elements captured) are reported for the dynamic context collection and dependency-aware translation graph. This mechanism is load-bearing for the success claims, as incomplete context on complex dependencies could produce code that compiles and passes available tests without being fully correct or complete.

Authors: We acknowledge that the manuscript does not currently report quantitative metrics for the dynamic context collection and dependency-aware translation graph, which is a valid point given the centrality of this mechanism. While the high success rates on predominantly large, multi-module programs provide indirect evidence of effective dependency handling, we agree that explicit metrics would better address concerns about potential incompleteness. In the revised version, we will add quantitative metrics including dependency coverage rates and a per-program breakdown of cross-module elements captured by the translation graph.

revision: yes

Circularity Check

0 steps flagged

No circularity: evaluation uses independent external benchmarks

The paper describes an agentic C-to-Rust translation system and reports empirical results on the external CRUST-Bench (24 programs) and DARPA TRACTOR benchmarks using standard, independently defined metrics of compilation success and test-pass rate. No equations, fitted parameters, or self-referential predictions appear in the provided text; success is not defined in terms of the method's own outputs or prior self-citations. The central claims therefore rest on external test suites rather than any reduction to the framework's internal definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the unstated assumption that current LLMs, when orchestrated with dynamic navigation and verification loops, can overcome context and hallucination limits on real-scale codebases; no explicit free parameters or invented entities are named in the abstract.

axioms (1)

Domain assumption: LLM agents equipped with dynamic context collection and dependency-guided orchestration can produce complete and correct project-level translations without manual fixes.
This premise underpins the entire agentic design and the claim of 100% compilation success on large programs.
