Recognition: unknown
SoCal: A Language for Memory-Layout Factorization of Recursive Datatypes
Pith reviewed 2026-05-09 14:13 UTC · model grok-4.3
The pith
A new language and compiler automatically convert recursive datatypes into multi-buffer layouts that separate fields for better memory performance in tree code.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce factored, multi-buffer layouts that store different ADT fields in separate buffers, enabling SoA-like layouts for serialized recursive data structures. We formalize this approach in SoCal, a language for generating factored ADT representations, and implement it in a compiler called Colobus. Colobus automatically transforms functional programs to operate over a serialized, factored layout of recursive ADTs. Our evaluation shows a 1.46x geometric mean speedup on a suite of tree-processing benchmarks.
What carries the argument
Factored multi-buffer layouts, in which each ADT field is stored in its own contiguous buffer with linking information to preserve recursive structure across buffers.
If this is right
- Existing functional programs on recursive datatypes can run faster after automatic conversion without source changes.
- Serialized trees gain the locality and parallel-access benefits previously available only to regular arrays.
- Applications such as AST manipulation, DOM processing, and k-d tree workloads become candidates for the same gains.
- The approach extends the classic AoS-to-SoA transformation to irregular, pointer-based data.
Where Pith is reading between the lines
- The technique could be combined with existing vectorization passes to exploit the independent buffers for SIMD execution.
- Memory-footprint trade-offs from extra buffers and linking data might limit applicability in memory-constrained settings.
- Similar factorization might be applied to other irregular structures if a serialization step is first defined.
Load-bearing premise
The automatic program transformation preserves the original semantics exactly for every possible input.
What would settle it
A test case in which the transformed program produces different results from the original program on the same inputs, or a controlled benchmark run that isolates the layout change and shows no performance difference.
Figures
read the original abstract
Array-of-structures (AoS) to structure-of-arrays (SoA) is a classic compiler transformation that improves memory locality and enables data-parallel execution. Existing AoS-to-SoA transformations primarily target regular, array-based programs in imperative languages like C and C++. In contrast, many applications manipulate tree-shaped data structures, for example, ASTs in compilers, DOM trees in browsers, and k-d trees in scientific workloads. Prior work improves the performance of functional programs operating on such data by serializing algebraic datatypes (ADTs) into contiguous memory buffers. However, these representations interleave fields within a single buffer, similar to AoS layouts. We introduce factored, multi-buffer layouts that store different ADT fields in separate buffers, enabling SoA-like layouts for serialized recursive data structures. We formalize this approach in SoCal, a language for generating factored ADT representations, and implement it in a compiler called Colobus. Colobus automatically transforms functional programs to operate over a serialized, factored layout of recursive ADTs. Our evaluation shows a 1.46x geometric mean speedup on a suite of tree-processing benchmarks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces SoCal, a language for memory-layout factorization of recursive algebraic datatypes (ADTs) that enables factored multi-buffer representations analogous to structure-of-arrays (SoA) layouts. It describes the Colobus compiler, which performs an automatic semantics-preserving program transformation to operate over these serialized factored layouts, and evaluates the approach on tree-processing benchmarks, reporting a 1.46x geometric mean speedup.
Significance. If the central claims hold, this work meaningfully extends classic AoS-to-SoA transformations from imperative array-based programs to recursive ADTs common in functional languages. The formalization of factored layouts in SoCal and the automatic rewrite in Colobus provide a principled foundation that could improve locality and enable data-parallelism for applications such as AST processing and scientific tree structures. The evaluation, if substantiated, demonstrates practical performance gains.
major comments (1)
- [Evaluation] Evaluation section: The reported 1.46x geometric mean speedup is central to the performance claim, yet the manuscript supplies no information on benchmark selection criteria, measurement methodology (including hardware, number of runs, and timing), error bars, or comparisons against strong baselines such as existing ADT serialization techniques or hand-tuned layouts. This information is required to attribute observed speedups specifically to the factored multi-buffer layout rather than incidental effects.
minor comments (2)
- [§3] §3 (SoCal Language): The presentation of layout rules would benefit from a concrete running example showing how a small recursive ADT (e.g., a binary tree) is mapped to multiple buffers, to clarify the factorization mechanics for readers.
- [Related Work] Related Work: The discussion of prior ADT serialization work could be expanded with explicit comparisons to recent efforts on memory layouts in functional compilers, to better position the novelty of the multi-buffer approach.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the potential of SoCal and Colobus to extend AoS-to-SoA ideas to recursive ADTs. We agree that the evaluation requires more methodological detail to support the reported speedups and will revise the manuscript accordingly.
read point-by-point responses
-
Referee: [Evaluation] Evaluation section: The reported 1.46x geometric mean speedup is central to the performance claim, yet the manuscript supplies no information on benchmark selection criteria, measurement methodology (including hardware, number of runs, and timing), error bars, or comparisons against strong baselines such as existing ADT serialization techniques or hand-tuned layouts. This information is required to attribute observed speedups specifically to the factored multi-buffer layout rather than incidental effects.
Authors: We acknowledge that the current manuscript omits these experimental details. In the revised version we will expand the Evaluation section with: (1) explicit criteria used to select the tree-processing benchmarks, (2) hardware platform specifications, (3) number of runs, timing methodology, and statistical reporting (including error bars), and (4) direct comparisons against standard ADT serialization baselines and, where feasible, hand-tuned layouts. These additions will clarify that the observed gains are attributable to the factored multi-buffer representation. revision: yes
Circularity Check
No significant circularity
full rationale
The paper introduces SoCal as a language for defining factored multi-buffer layouts for recursive ADTs and Colobus as the implementing compiler that performs automatic program transformations. The central results are the language definition itself (which supplies the layout rules by construction) and measured wall-clock speedups on tree-processing benchmarks. No equations, fitted parameters, or first-principles derivations are present that could reduce to their own inputs. The evaluation relies on external benchmarks rather than any self-referential prediction or self-citation chain. The approach is self-contained: the formalization defines the layouts, the compiler applies the corresponding rewrites, and performance is reported empirically. This matches the common case of a systems paper whose claims rest on implementation and measurement rather than any load-bearing circular step.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Functional programs on recursive ADTs can be automatically transformed to equivalent programs on serialized factored layouts without changing observable behavior.
invented entities (2)
-
SoCal language
no independent evidence
-
Colobus compiler
no independent evidence
Reference graph
Works this paper leans on
-
[1]
ISBN 978-0412048418. Trishul M. Chilimbi, Bob Davidson, and James R. Larus. Cache-conscious structure definition. InProceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation, PLDI ’99, pages 13–24, New York, NY, USA, 1999. ACM. ISBN 1-58113-094-5. doi: http://doi.acm.org/10.1145/301618.301635. URL http://doi.acm.org/...
-
[2]
related passes. We use an adapted ADT for the Octree and 2 passes over the octree (palet- teEntriesQuantizedandquantizationErrorProxy) that estimates the number of palettes entries after quantization and the second estimates the quantization error after quantization of the image. Results are reported in Table 27. A.11 PiecewiseFunctions ThePiecewiseFuncti...
-
[3]
If(l r ↦→𝜏) ∈Σ, then there existi 𝑠,i 𝑒 such that (lr ↦→ ⟨r,i 𝑠 ⟩) ∈Mand𝜏;⟨r,i 𝑠 ⟩;S⊢ ew ⟨r,i 𝑒 ⟩. SoCal: A Language for Memory-Layout Factorization of Recursive Datatypes 41 [T -V ar] Γ(𝑥)=𝜏@𝑙𝑜𝑐Σ(𝑙𝑜𝑐)=𝜏 Γ;Σ;C;A;N⊢A;N;𝑥:𝜏@𝑙𝑜𝑐 [T -Concrete-Loc] Σ(𝑙𝑜𝑐)=𝜏 Γ;Σ;C;A;N⊢A;N;𝑐𝑙𝑜𝑐:𝜏@𝑙𝑜𝑐 [T -LetLoc-ProjTag] 𝑙𝑑 𝑟𝑑 , 𝑙𝑜𝑐1 ≠𝑙𝑜𝑐 2 A(r 𝑑 )=∅𝑙𝑜𝑐 1 ∈N𝑙𝑜𝑐 1, 𝑙𝑑 𝑟𝑑 ∉N ′′ Γ;Σ...
-
[4]
If (𝑙𝑜𝑐 1 ↦→𝜏) ∈ Σand 𝑙𝑜𝑐 1 =𝑙 𝑑 𝑟𝑑 − − − − − − − −⇀(K,j, 𝑙𝑜𝑐 j), then there exist corresponding concrete components⟨𝑟 𝑑,i 𝑠 ⟩𝑙𝑑 − − − − − − − − − −⇀(K,j, 𝑐𝑙𝑜𝑐 𝑠j )and⟨𝑟 𝑑,i 𝑒 ⟩𝑙𝑑 − − − − − − − − − −⇀(K,j, 𝑐𝑙𝑜𝑐 𝑒j )such that (𝑙𝑜𝑐 1 ↦→ ⟨𝑟 𝑑,i 𝑠 ⟩𝑙𝑑 − − − − − − − − − −⇀(K,j, 𝑐𝑙𝑜𝑐 𝑠j )) ∈M and 𝜏;⟨𝑟 𝑑,i 𝑠 ⟩𝑙𝑑 − − − − − − − − − −⇀(K,j, 𝑐𝑙𝑜𝑐 𝑠j );S⊢ ewrec ⟨𝑟𝑑,i...
-
[5]
42 Singhal et al
The constraint environment is realized by the runtime configuration: C⊢ 𝑤 𝑓𝑐 𝑓 𝑐 M;S. 42 Singhal et al. [T -Int] Γ;Σ;C;A;N⊢A;N;𝑛:𝜏@𝑙𝑜𝑐 [T -App] | − −⇀𝑙𝑜𝑐 ′ |=| − − −⇀𝑙𝑜𝑐 ′′′ |𝑛=| −⇀v|=| −⇀𝑥|𝑙𝑜𝑐 𝑜𝑢𝑡 ∈N Γ;Σ;C;A;N⊢A;N; −⇀v𝑖 : − − − − −⇀𝜏𝑖 @𝑙𝑜𝑐𝑖 A(𝑟𝑒𝑔)=𝑙𝑜𝑐 𝑜𝑢𝑡 ∀𝑖 .∃ 𝑗 .( − − −⇀𝑙𝑜𝑐 ′′′ 𝑖 = − − −⇀𝑙𝑜𝑐 ′′ 𝑗 ∧ − −⇀𝑙𝑜𝑐 𝑖 = − −⇀𝑙𝑜𝑐 ′ 𝑗 ) ∃𝑘 .(𝑙𝑜𝑐 ′′′ 𝑜𝑢𝑡 = − − −⇀𝑙𝑜𝑐...
-
[6]
The allocation and nursery environments are realized by the runtime configuration: A;N⊢ 𝑤 𝑓𝑐𝑎 M;S
-
[7]
Typed roots and nursery locations are disjoint: 𝑑𝑜𝑚(Σ) ∩N=∅. SoCal: A Language for Memory-Layout Factorization of Recursive Datatypes 43 [D-Let-Expr] S;M;e 1 ↩→S ′;M ′;e ′ 1 e1 ≠v S;M;let𝑥: ˆ𝜏=e 1 ine 2 ↩→S ′;M ′;let𝑥: ˆ𝜏=e ′ 1 ine 2 [D-Let-V al] S;M;let𝑥: ˆ𝜏=v 1 ine 2 ↩→S;M;e 2 [v1/𝑥] [D-App] S;M;𝑓[ −⇀𝑙𝑜𝑐] −⇀v↩→S;M;e[ −⇀v/ −⇀𝑥] [ −⇀𝑙𝑜𝑐/ − −⇀𝑙𝑜𝑐 ′ ] where...
-
[8]
If𝜏=IntandS(r) (i 𝑠 )=n, then Int;⟨r,i 𝑠 ⟩;S⊢ ew ⟨r,i 𝑠 +1⟩
-
[9]
|K ′ −⇀𝜏 ′ |
IfS(r) (i 𝑠 )=K ′ and data𝜏= − − − − −⇀ K1 −⇀𝜏 1 |. . .|K ′ −⇀𝜏 ′ |. . .| − − − − − −⇀ K𝑚 −⇀𝜏 𝑚, then the tag at⟨r,i 𝑠 ⟩is a valid start for a value of type𝜏
-
[10]
For constructor fields −⇀𝜏 ′ 1,
Let𝑤 0 =i 𝑠 +1. For constructor fields −⇀𝜏 ′ 1, . . . , −⇀𝜏 ′ 𝑛, require −⇀𝜏 ′ 1;⟨r, 𝑤 0⟩;S⊢ ew ⟨r, 𝑤 1⟩ and − −⇀𝜏 ′ 𝑗+1 ;⟨r, 𝑤 𝑗 ⟩;S⊢ ew ⟨r, 𝑤 𝑗+1 ⟩𝑗∈ {1, . . . , 𝑛−1}
-
[11]
44 Singhal et al
The overall end witness is the end of the final field, or the cell one past the tag if the constructor has no fields. 44 Singhal et al. Recursive witness relation for factored roots.The recursive witness relation captures the additional invariants needed for factored roots in factored layouts
-
[12]
The data-constructor component of the factored root is present at the tag buffer location: S(𝑟 𝑑 ) (i𝑠 )=K ′ for some constructorK ′ of type𝜏
-
[13]
Scalar fields and datatypes laid out in an flattened form use the contiguous witness relation, while nested factored fields may in turn require the recursive witness relation
For each non-self-recursive field entry (K ′, j, 𝑐𝑙𝑜𝑐𝑠j ) in the component structure, the pro- jected concrete field component is well typed for the corresponding field type. Scalar fields and datatypes laid out in an flattened form use the contiguous witness relation, while nested factored fields may in turn require the recursive witness relation
-
[14]
Self-recursive fields omitted from the component structure are recovered by the recursive structure of the layout. In the preorder fragment used in the main paper, these descendants are reached by following the tag component and the sequence of factored (after·) con- straints generated by T-DataConstructor-FullyFactored. For instance, the first self recur...
-
[15]
In particular, the tag component yields the next constructor position in its buffer, and the field components provide the corresponding end witnesses for the non-tag buffers
The resulting end witness is the componentwise end of the entire rooted value. In particular, the tag component yields the next constructor position in its buffer, and the field components provide the corresponding end witnesses for the non-tag buffers. D.3 Well-formedness of constructor progress constraints Judgement form. C⊢ 𝑤 𝑓𝑐 𝑓 𝑐 M;S This judgement ...
-
[16]
If(𝑙𝑜𝑐↦→ (startr)) ∈C, thenM(𝑙𝑜𝑐)=ZeroIdx(𝑙𝑜𝑐)
-
[17]
If(𝑙𝑜𝑐↦→𝑙𝑜𝑐 ′ +1) ∈CandM(𝑙𝑜𝑐 ′)=⟨r,i⟩, thenM(𝑙𝑜𝑐)=⟨r,i+1⟩
-
[18]
If(𝑙𝑜𝑐↦→ (after𝜏@𝑙𝑜𝑐 ′)) ∈CandM(𝑙𝑜𝑐 ′)=𝑐𝑙𝑜𝑐 ′, then there exists𝑐𝑙𝑜𝑐 ′′ such that 𝜏;𝑐𝑙𝑜𝑐 ′;S⊢ ew 𝑐𝑙𝑜𝑐 ′′ andM(𝑙𝑜𝑐)=𝑐𝑙𝑜𝑐 ′′
-
[19]
If (𝑙𝑑 𝑟𝑑 ↦→ (projTagLoc𝑙𝑜𝑐 1)) ∈ CandM (𝑙𝑜𝑐 1) is a factored concrete root, then its tag component is the concrete location bound to𝑙 𝑑 𝑟𝑑
-
[20]
If (𝑙𝑜𝑐 𝑖 ↦→ (projFieldLoc( K, i)𝑙𝑜𝑐 1)) ∈ CandM (𝑙𝑜𝑐 1) is a factored concrete root whose component structure contains(K,i, 𝑐𝑙𝑜𝑐 𝑖 ), thenM(𝑙𝑜𝑐 𝑖 )=𝑐𝑙𝑜𝑐 𝑖
-
[21]
D.4 Well-formedness of allocation Judgement form
If (𝑙𝑜𝑐↦→ (introLocVec𝑙 𝑑 𝑟𝑑 𝑓 𝑙𝑑𝑠)) ∈ C, thenM (𝑙𝑜𝑐) is the factored concrete root obtained by combining the concrete tag location for 𝑙𝑑 𝑟𝑑 with the concrete field locations denoted by the entries of𝑓 𝑙𝑑𝑠. D.4 Well-formedness of allocation Judgement form. A;N⊢ 𝑤 𝑓𝑐𝑎 M;S This judgement ensures that allocation remains linear and that no store cell is overwritten
-
[22]
SoCal: A Language for Memory-Layout Factorization of Recursive Datatypes 45
If 𝑙𝑜𝑐∈ N, then 𝑙𝑜𝑐 is mapped byM, but the concrete store cell(s) denoted by that location have not yet been written inS. SoCal: A Language for Memory-Layout Factorization of Recursive Datatypes 45
-
[23]
For a factored location with multiple buffers, the concrete loc will have the highest index across all buffers
If (r ↦→𝑙𝑜𝑐) ∈ Aand 𝑙𝑜𝑐∈ N, then the concrete location denoted by 𝑙𝑜𝑐 lies strictly past the highest written cell in regionr. For a factored location with multiple buffers, the concrete loc will have the highest index across all buffers
-
[24]
If (r ↦→𝑙𝑜𝑐) ∈ Aand 𝑙𝑜𝑐∉ N, then the end witness of the value rooted at 𝑙𝑜𝑐 lies at or past the current allocation frontier for regionr
-
[25]
If(r↦→ ∅) ∈A, then regionrhas no written cells inS
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.