arxiv: 2604.11027 · v2 · submitted 2026-04-13 · 💻 cs.PL

Recognition: unknown

Parameterized Algorithms and Complexity for Function Merging with Branch Reordering

Amir K. Goharshady , Kerim Kochekov , Tian Shu , Ahmed Khaled Zaher

Authors on Pith no claims yet

Pith reviewed 2026-05-10 16:10 UTC · model grok-4.3

classification 💻 cs.PL

keywords function mergingbranch reorderingparameterized complexityfixed-parameter tractabilityNP-hardnesscompiler optimizationcontrol-flow graphssequence alignment

0 comments

The pith

Function merging with branch reordering is fixed-parameter tractable when parameterized by branching factor and nesting depth.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies the problem of merging two functions by reordering branches inside each to produce linear sequences of instructions that align as well as possible. It establishes that the problem is NP-hard in general but admits an algorithm running in time 2 to the O of bd times n squared, where b and d are the branching factor and nesting depth. A second algorithm runs in time 2 to the O of b times d2 times n to the seventh, showing tractability even when one function has unbounded nesting depth. These results identify the parameters that control the complexity and draw a boundary via a hardness proof for constant values of d1, b2, and d2.

Core claim

The input consists of two functions F1 and F2, each with size ni, branching factor bi, and nesting depth di. The task is to reorder branches within each function so that the resulting linearizations maximize sequence alignment score. The problem is fixed-parameter tractable in the four parameters b1, d1, b2, d2 via a 2^{O(bd)} n^2 algorithm, remains FPT when one nesting depth is unbounded via a 2^{O(b d2)} n^7 algorithm, and is NP-hard even when d1, b2, d2 are fixed constants.

What carries the argument

The pair of parameters consisting of branching factor b (maximum out-degree in the control-flow graph) and nesting depth d (maximum nesting level of branches), which together bound the number of distinct linearizations of each function.

If this is right

When functions have small branching and nesting, compilers can compute optimal merges in time practical for code-size reduction.
Tractability holds even if one input function has deep nesting, provided the other has bounded branching.
The hardness result shows that constant nesting depth and branching in one function plus constant branching in the other is already intractable.
The approach extends sequence-alignment methods by explicitly accounting for control-flow reordering.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same parameterization could be applied to merging more than two functions or to other code-reordering tasks such as instruction scheduling.
Practical implementations could combine the FPT algorithms with heuristics when bd grows moderately large.
The results suggest studying whether additional natural parameters such as the number of live variables further reduce the exponent.

Load-bearing premise

Reordering the branches inside each function produces a semantically equivalent program whose observable behavior is unchanged.

What would settle it

An explicit pair of small functions with bounded bd whose optimal merge cannot be found by any algorithm running in time 2^{o(bd)} poly(n), or a concrete example where branch reordering changes observable program output.

Figures

Figures reproduced from arXiv: 2604.11027 by Ahmed Khaled Zaher, Amir K. Goharshady, Kerim Kochekov, Tian Shu.

**Figure 1.** Figure 1: Two functions F1 (left) and F2 (right) to be merged. The shared code is highlighted in green, whereas code only in F1 or F2 is highlighted in red and blue, respectively. 1 def F★(x , b ) : 2 x += 1 3 if b == 1: 4 y = x * 75 5 z = -2 * x 6 if b == 1: 7 print ( y ) 8 else : 9 print ( x + z ) 10 if b == 2 and x == 1: 11 print ( z ) 12 print ( x * z ) 13 elif x == 0: 14 print ( x ) 15 for i in range (3) : 16 p… view at source ↗

**Figure 2.** Figure 2: Merged function F★ (left) and the updated bodies of F1 and F2 after the merge (right). this reason, all related works [22, 23, 27, 28] take a heuristic approach to pair selection by first identifying a seemingly promising pair of functions, and then delegating the extraction of as much profit from that pair to the merging routine. In this work, we only focus on the merging task, and whenever we speak of fu… view at source ↗

**Figure 3.** Figure 3: CFG of F1 from [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: Merged function F ′ ★ obtained via the SA-based approach on F ′ 1 from [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

**Figure 5.** Figure 5: Illustration of the inductive construction of branching graphs. Cases (1), (2), and (3) are represented [PITH_FULL_IMAGE:figures/full_fig_p012_5.png] view at source ↗

**Figure 6.** Figure 6: The branching graph corresponding to the program [PITH_FULL_IMAGE:figures/full_fig_p012_6.png] view at source ↗

**Figure 7.** Figure 7: An illustration of the shape of visited nodes [PITH_FULL_IMAGE:figures/full_fig_p016_7.png] view at source ↗

read the original abstract

Binary size reduction is an increasingly important optimization objective for compilers. One emerging technique is function merging, where multiple similar functions are merged into one, thereby eliminating redundancy. The SOTA approach to perform the merging is based on sequence alignment, where functions are viewed as linear sequences of instructions that are then matched in a way maximizing their alignment. In this paper, we consider a significantly generalized formulation of the problem by allowing reordering of branches within each function, subsequently allowing for more flexible matching and better merging. We show that this makes the problem NP-hard, and thus we study it through the lens of parameterized algorithms and complexity, where we identify certain parameters of the input that govern its complexity. We look at two natural parameters: the branching factor and nesting depth of input functions. Concretely, our input consists of two functions $F_1, F_2,$ where each $F_i$ has size $n_i,$ branching factor $b_i,$ and nesting depth $d_i.$ Our task is to reorder the branches of $F_1$ and $F_2$ in a way that yields linearizations achieving the maximum sequence alignment. Let $n=\max(n_1, n_2),$ and define $b, d$ similarly. Our results are as follows: - A simple algorithm running in time $2^{O(bd)} n^2,$ establishing that the problem is fixed-parameter tractable (FPT) with respect to all four parameters $b_1,d_1, b_2, d_2.$ - An algorithm running in time $2^{O(bd_2)} n^7,$ showing that even when one of the functions has an unbounded nesting depth, the problem remains in FPT. - A hardness result showing that the problem is NP-hard even when constrained to constant $d_1, b_2, d_2.$ To the best of our knowledge, this is the first systematic study of function merging with branch reordering from an algorithmic or complexity-theoretic perspective.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper generalizes function merging to allow branch reordering inside functions and gives explicit FPT algorithms plus a hardness result for parameters based on branching factor and nesting depth.

read the letter

The key takeaway is that this work extends the usual sequence-alignment approach to function merging by permitting reordering of branches within each function before trying to align and merge them. That change makes the problem NP-hard, so the authors turn to parameterized complexity with branching factor b and nesting depth d as the natural parameters that control the search space over possible reorderings. They give a straightforward 2^{O(bd)} n^2 algorithm that is FPT in all four parameters b1, d1, b2, d2, plus a second algorithm that tolerates unbounded depth on one side at 2^{O(b d2)} n^7 time. They also prove NP-hardness even when d1, b2, and d2 are fixed constants. These results are new relative to the cited prior merging literature. The paper states the running times and the reduction clearly, and the parameter choices line up with what a dynamic program over reorderings would need. No circular definitions or fitted quantities appear. The main soft spot is that any real compiler use still requires checking that the chosen reordering preserves observable behavior; the paper treats this as outside its algorithmic scope, which is reasonable but leaves the practical bridge for follow-up work. The n^7 term is high but acceptable in a theory paper. This is aimed at researchers who work on compiler optimizations through the lens of parameterized algorithms. Someone focused purely on practical binary-size wins might find it too abstract, while a reader interested in FPT techniques for alignment or tree problems could pick up usable DP ideas. I would send it for peer review. The claims are focused and the analysis looks solid enough to benefit from referee input.

Referee Report

0 major / 3 minor

Summary. The manuscript generalizes the function merging problem to allow reordering of branches within each of two input functions F1 and F2 (with sizes n1, n2, branching factors b1, b2, and nesting depths d1, d2) in order to maximize the score of a subsequent sequence alignment. It presents an FPT algorithm running in time 2^{O(bd)} n^2 (where b and d are the maxima of the respective parameters), a second algorithm running in time 2^{O(b d2)} n^7 that remains FPT even when d1 is unbounded, and an NP-hardness result that holds even when d1, b2, and d2 are fixed constants.

Significance. If the stated running times and hardness reduction are correct, the work supplies the first parameterized-complexity treatment of branch-reordering function merging. The parameters b_i and d_i are natural, small in practice, and directly bound the state space of any dynamic program over reorderings; the explicit FPT bounds and the separation between the fully bounded case and the one-sided unbounded-depth case therefore give concrete guidance on when the optimization is tractable. The hardness result usefully delineates the boundary of tractability.

minor comments (3)

The abstract states that n = max(n1, n2) and b, d are defined similarly, but the manuscript should explicitly confirm whether the O(bd) term uses the maxima or the sums of the individual parameters, as this affects the precise FPT claim.
The NP-hardness statement leaves b1 unbounded; the manuscript should add a sentence clarifying that this is the intended source of hardness and that the reduction does not rely on large b1.
Notation for the two functions and their linearizations after reordering should be introduced once in a dedicated preliminary section and used consistently thereafter.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive evaluation of our manuscript and the recommendation for minor revision. The report provides a clear summary of our contributions but does not list any specific major comments requiring response. We are pleased that the parameterized complexity results, including the FPT algorithms and the hardness result, are viewed as providing useful guidance on the tractability of function merging with branch reordering.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper's core results consist of explicit algorithmic constructions (a DP-based FPT algorithm running in 2^{O(bd)} n^2 time and a variant in 2^{O(b d2)} n^7) together with a standard NP-hardness reduction for the remaining parameter. These are derived directly from the definitions of branching factor b_i and nesting depth d_i as natural bounds on the state space of reorderings and alignments; no quantities are defined in terms of themselves, no fitted parameters are relabeled as predictions, and no load-bearing claims rest on self-citations or imported uniqueness theorems. The semantic assumption that branch reordering preserves observable behavior is stated as part of the problem formulation rather than derived from the algorithms themselves. The derivation chain is therefore self-contained and non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper introduces no free parameters, invented entities, or non-standard axioms; it relies on the standard definitions of FPT, NP-hardness, and the natural parameters b and d extracted from the control-flow structure of the input functions.

axioms (1)

standard math Standard definitions and results from parameterized complexity theory (FPT membership via explicit algorithms, NP-hardness via reduction).
The claims rest on the established framework of parameterized complexity rather than any novel axioms.

pith-pipeline@v0.9.0 · 5695 in / 1348 out tokens · 60934 ms · 2026-05-10T16:10:06.947213+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

29 extracted references

[1]

Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. 2015. Tight Hardness Results for LCS and Other Sequence Similarity Measures. InFOCS. IEEE Computer Society, 59–78

2015
[2]

Frances E. Allen. 1970. Control flow analysis. InSymposium on Compiler Optimization. ACM, 1–19

1970
[3]

Karl Bringmann and Marvin Künnemann. 2015. Quadratic Conditional Lower Bounds for String Problems and Dynamic Time Warping. InFOCS. IEEE Computer Society, 79–97

2015
[4]

Milind Chabbi, Jin Lin, and Raj Barik. 2021. An Experience with Code-Size Optimization for Production iOS Mobile Applications. InCGO. IEEE, 363–377

2021
[5]

Cormen, Charles E

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009.Introduction to Algorithms, 3rd Edition. MIT Press

2009
[6]

Bruno Coutinho, Diogo Sampaio, Fernando Magno Quintão Pereira, and Wagner Meira Jr. 2011. Divergence Analysis and Optimizations. InPACT. IEEE Computer Society, 320–329

2011
[7]

Fomin, Lukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh

Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. 2015.Parameterized Algorithms. Springer

2015
[8]

Downey and Michael R

Rodney G. Downey and Michael R. Fellows. 1999.Parameterized Complexity. Springer

1999
[9]

Enming Fan, Xiaofeng Guan, Fan Hu, Heng Shi, Hao Zhou, and Jianguo Yao. 2025. Postiz: Extending Post-increment Addressing for Loop Optimization and Code Size Reduction. InCGO. ACM, 600–613

2025
[10]

M. R. Garey and David S. Johnson. 1979.Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman

1979
[11]

Tianao Ge, Zewei Mo, Kan Wu, Xianwei Zhang, and Yutong Lu. 2022. RollBin: reducing code-size via loop rerolling at binary level. InLCTES. ACM, 99–110

2022
[12]

2025.Android Developers - Reduce your app size

Google. 2025.Android Developers - Reduce your app size. Technical Report. https://developer.android.com/topic/ performance/reduce-apk-size

2025
[13]

Ellis Hoag, Kyungwoo Lee, Julián Mestre, Sergey Pupyrev, and Yongkang Zhu. 2024. Reordering Functions in Mobiles Apps for Reduced Size and Faster Start-Up.ACM Trans. Embed. Comput. Syst.23, 4 (2024), 54:1–54:54

2024
[14]

2004.An introduction to bioinformatics algorithms

Neil C Jones and Pavel A Pevzner. 2004.An introduction to bioinformatics algorithms. MIT press

2004
[15]

Kyungwoo Lee, Ellis Hoag, and Nikolai Tillmann. 2022. Efficient profile-guided size optimization for native mobile applications. InCC. ACM, 243–253

2022
[16]

Kyungwoo Lee, Manman Ren, and Ellis Hoag. 2024. Optimistic and Scalable Global Function Merging. InLCTES. ACM, 46–57

2024
[17]

Kyungwoo Lee, Manman Ren, and Shane Nay. 2022. Scalable size inliner for mobile applications (WIP). InLCTES. ACM, 116–120

2022
[18]

Gai Liu, Umar Farooq, Chengyan Zhao, Xia Liu, and Nian Sun. 2023. Linker Code Size Optimization for Native Mobile Applications. InCC. ACM, 168–179

2023
[19]

Reps, Stefan Schwoon, Somesh Jha, and David Melski

Thomas W. Reps, Stefan Schwoon, Somesh Jha, and David Melski. 2005. Weighted pushdown systems and their application to interprocedural dataflow analysis.Sci. Comput. Program.58, 1-2 (2005), 206–263

2005
[20]

Rodrigo C. O. Rocha, Pavlos Petoumenos, Björn Franke, Pramod Bhatotia, and Michael F. P. O’Boyle. 2022. Loop Rolling for Code Size Reduction. InCGO. IEEE, 217–229

2022
[21]

Rodrigo C. O. Rocha, Pavlos Petoumenos, Zheng Wang, Murray Cole, Kim M. Hazelwood, and Hugh Leather. 2021. HyFM: function merging for free. InLCTES. ACM, 110–121

2021
[22]

Rodrigo C. O. Rocha, Pavlos Petoumenos, Zheng Wang, Murray Cole, and Hugh Leather. 2019. Function Merging by Sequence Alignment. InCGO. IEEE, 149–163

2019
[23]

Rodrigo C. O. Rocha, Pavlos Petoumenos, Zheng Wang, Murray Cole, and Hugh Leather. 2020. Effective function merging in the SSA form. InPLDI. ACM, 854–868

2020
[24]

Rodrigo C. O. Rocha, Charitha Saumya, Kirshanthan Sundararajah, Pavlos Petoumenos, Milind Kulkarni, and Michael F. P. O’Boyle. 2023. HyBF: A Hybrid Branch Fusion Strategy for Code Size Reduction. InCC. ACM, 156–167

2023
[25]

Yuta Saito, Kazunori Sakamoto, Hironori Washizaki, and Yoshiaki Fukazawa. 2025. Multiple Function Merging for Code Size Reduction.ACM Trans. Archit. Code Optim.22, 1 (2025), 7:1–7:26

2025
[26]

Anuj Sehgal, Vladislav Perelman, Siarhei Kuryla, and Jürgen Schönwälder. 2012. Management of resource constrained devices in the internet of things.IEEE Commun. Mag.50, 12 (2012), 144–149

2012
[27]

Sean Stirling, Rodrigo C. O. Rocha, Kim M. Hazelwood, Hugh Leather, Michael F. P. O’Boyle, and Pavlos Petoumenos
[28]

F3M: Fast Focused Function Merging. InCGO. IEEE, 242–253. Proc. ACM Program. Lang., Vol. 10, No. PLDI, Article 204. Publication date: June 2026. 204:24 Goharshady et al

2026
[29]

Tobias J. K. Edler von Koch, Björn Franke, Pranav Bhandarkar, and Anshuman Dasgupta. 2014. Exploiting function similarity for code size reduction. InLCTES. ACM, 85–94. A Appendix A.1 Proof of Lemma 3.2 Lemma 3.2.Given two programs 𝑃1, 𝑃2 ∈Prog(Σ) and a scoring function 𝛿, Algorithm 2 returns FMBR(𝑃 1, 𝑃2, 𝛿). Proof. We first give a quick refresher on the ...

2014