pith. sign in

arxiv: 2605.17884 · v1 · pith:TTV3RHUYnew · submitted 2026-05-18 · 💻 cs.PL · cs.MS· math.OC

Optimizing Optimizations, Declaratively: Optimizing the Higher-Order Functions in Mathematical Optimization with egglog

Pith reviewed 2026-05-20 00:56 UTC · model grok-4.3

classification 💻 cs.PL cs.MSmath.OC
keywords egglogequality saturationhigher-order functionsmathematical optimizationLaTeX outputconstraint detectiondatalog rulesHenkin constants
0
0 comments X

The pith

Egglog enables ensugaring of higher-order terms for natural LaTeX output and declarative multi-step constraint detection in mathematical optimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies egglog to a lambda-calculus based mathematical modeling system to handle higher-order functions. It reconstructs natural comprehension syntax from desugared stream operations such as map, flat_map, and filter by using equality saturation with a custom cost model that minimizes temporary variable rebindings. It also encodes constraint detection logic as datalog-style rules that operate over multiple steps, using Henkin-like constants to parametrize constraints and propagate side conditions on subterms and indices. The same ensugaring step reduces large domain-set conditions that previously caused long runtimes or nontermination. A sympathetic reader would care because these techniques automate output formatting and speed up analysis without manual code or external orchestration.

Core claim

Egglog supports two industrial applications in JijModeling 2: first, reconstructing comprehension syntax from higher-order terms via equality saturation to produce natural LaTeX while minimizing rebindings; second, expressing multi-step constraint detection declaratively with datalog rules and Henkin-like constants that propagate side conditions, turning problematic large domain-set cases from minutes or nontermination into a few seconds.

What carries the argument

Equality saturation with a custom cost model for ensugaring higher-order lambda terms, combined with datalog-style rules and Henkin-like constants for declarative constraint detection and side-condition propagation.

If this is right

  • Natural LaTeX output can be generated automatically from desugared stream operations without losing the internal representation benefits.
  • Multi-step constraint detection logic can be written directly as datalog rules without Rust orchestration code.
  • Large domain-set conditions that previously caused slow or nonterminating runs can be reduced to seconds via the ensugaring procedure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The declarative rule style might simplify maintenance when adding new constraint types to the modeling system.
  • Similar ensugaring could be tested on other output targets such as code generation or visualization formats beyond LaTeX.

Load-bearing premise

The custom cost model and side-condition propagation rules produce terminating, efficient rewrites on real models without external orchestration or nontermination.

What would settle it

A model with higher-order functions and large domain sets on which the ensugaring produces unnatural LaTeX, the constraint detection takes minutes instead of seconds, or the rewrites fail to terminate would show the approach does not deliver practical performance.

Figures

Figures reproduced from arXiv: 2605.17884 by Hiromi Ishii.

Figure 1
Figure 1. Figure 1: shows a quadratic formulation of the Travelling Salesman Problem (TSP) as a typical example. The standard workflow is to formulate the mathematical expressions in a modeller and pass them to a solver. JijModeling is such a modeller, provided as a Python EDSL and implemented in Rust. It stores mathematical expressions as abstract syntax trees (ASTs) and provides features based on symbolic analysis of those … view at source ↗
Figure 2
Figure 2. Figure 2: Syntax of JijModeling 2 (excerpt) 1 @jm.Problem.define("TSP, Decorated", sense=jm.ProblemSense.MINIMIZE) 2 def tsp(problem: jm.DecoratedProblem): 3 C = problem.CategoryLabel(description="Labels of Cities") 4 N = C.count() 5 x = problem.BinaryVar(dict_keys=(N, C), 6 description="$x_{t,i} = 1$ if City $i$ is visited at time $t$") 7 d = problem.Float( 8 dict_keys=(C, C), description="distance between cities")… view at source ↗
Figure 3
Figure 3. Figure 3: Comprehension Syntax in JijModeling 2 jm.sum( N.filter(lambda i: i % 2 == 0) .flat_map(lambda i: M.map(lambda j: (i, j))) .map(lambda i, j: x[i] * t[i, j])) The problem is that, If such a desugared expression is emitted directly as LATEX, it is not so readable: ∑︁ map 𝜆(𝑖, 𝑗). 𝑥𝑖𝑡𝑖,𝑗, flat_map 𝜆𝑖. map (𝜆𝑗. (𝑖, 𝑗), 𝑀) , filter (𝜆𝑖. 𝑖 mod 2 = 0, 𝑁)   (★) For users, the conventional mathematical notation i… view at source ↗
Figure 4
Figure 4. Figure 4: Ensugaring rules for comprehension syntax. Here, [PITH_FULL_IMAGE:figures/full_fig_p004_4.png] view at source ↗
read the original abstract

We present two applications of egglog to mathematical optimization in JijModeling 2, a mathematical modeller whose internal representation is based on simply typed $\lambda$-calculus. First, we use egglog to improve $\LaTeX$ output for mathematical models expressed with higher-order functions. Python comprehensions are desugared into stream operations such as $\textsf{map}$, $\textsf{flat_map}$, and $\textsf{filter}$; emitting these terms directly produces unnatural mathematical notation. We reconstruct comprehension syntax by \emph{ensugaring} higher-order terms and use equality saturation with a custom cost model to minimize temporary variable rebindings. Second, we use egglog as a declarative engine for \emph{constraint detection}, extending the previous egg-based approach presented at EGRAPHS '25. Egglog's datalog-style rules let us express multi-step detection logic directly, without external Rust orchestration code. We encode parametrized constraints using \emph{Henkin-like constants} and propagate side conditions on subterms and indices through egglog facts. Finally, we show that the same ensugaring procedure also reduces large domain-set conditions before saturation, turning a problematic detection case from minutes or nontermination into a few seconds. Through these topics, we want to provide an example of an industrial application of egglog, demonstrate the trick to propagate the constraints using the ideas from mathematical logic, and show the importance of optimizing \emph{premises} of egglog rules to get practical performance in egglog programs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents two applications of egglog to mathematical optimization in JijModeling 2, whose internal representation uses simply typed λ-calculus. It first applies equality saturation with a custom cost model to ensugar higher-order stream operations (map, flat_map, filter) back into Python-comprehension syntax for more natural LaTeX output, minimizing temporary-variable rebindings. Second, it uses egglog's datalog-style rules together with Henkin-like constants to express multi-step constraint detection declaratively, propagating side conditions on subterms and indices without external Rust orchestration. The manuscript further claims that the same ensugaring step reduces large domain-set conditions, converting a problematic detection case from minutes or nontermination into seconds. The stated goals are to illustrate an industrial use of egglog, to demonstrate constraint propagation via ideas from mathematical logic, and to show the performance impact of optimizing rule premises.

Significance. If the central claims hold, the work is significant for demonstrating a practical, declarative deployment of egglog in an industrial mathematical-modeling tool. It integrates equality saturation, datalog rules, and Henkin-style constants to handle higher-order syntax reconstruction and constraint detection without imperative orchestration code. The approach supplies a concrete example of premise optimization for termination and speed, which could inform future e-graph applications in optimization and formal-methods tooling. The integration of logical constructs for side-condition propagation is a clear strength.

major comments (2)
  1. [§4] §4 (Ensugaring via equality saturation): the custom cost model that is said to minimize temporary-variable rebindings is described only at a high level; no explicit definition, pseudocode, or weighting scheme is supplied. Because the reconstruction of comprehension syntax rests on this model selecting the desired LaTeX output, the absence of the concrete cost function prevents verification that the saturation step actually produces the claimed natural notation.
  2. [§5] §5 (Constraint detection and performance): the claim that ensugaring turns a problematic large-domain detection case from minutes/nontermination into a few seconds is supported only by the reported case; no benchmark suite, termination argument, or measurements on additional industrial models are provided. This property is load-bearing for the asserted practical advantage of the declarative approach.
minor comments (2)
  1. [Abstract] Abstract: the summary of results would be strengthened by a single quantitative sentence (e.g., “runtime reduced from >300 s to 2.3 s on model X”).
  2. [§5.2] Notation: the precise encoding of Henkin-like constants and the side-condition facts should be shown with a small example derivation to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the potential significance of applying egglog to an industrial mathematical modeling tool. We address each major comment below. Where the comments identify gaps in detail or evidence, we agree to revise the manuscript accordingly.

read point-by-point responses
  1. Referee: [§4] §4 (Ensugaring via equality saturation): the custom cost model that is said to minimize temporary-variable rebindings is described only at a high level; no explicit definition, pseudocode, or weighting scheme is supplied. Because the reconstruction of comprehension syntax rests on this model selecting the desired LaTeX output, the absence of the concrete cost function prevents verification that the saturation step actually produces the claimed natural notation.

    Authors: We agree that the custom cost model requires a more precise description. In the revised manuscript we will supply the explicit cost function definition, including the weighting scheme that penalizes rebinding of temporary variables. This addition will allow readers to verify how equality saturation selects the desired comprehension syntax for LaTeX output. revision: yes

  2. Referee: [§5] §5 (Constraint detection and performance): the claim that ensugaring turns a problematic large-domain detection case from minutes/nontermination into a few seconds is supported only by the reported case; no benchmark suite, termination argument, or measurements on additional industrial models are provided. This property is load-bearing for the asserted practical advantage of the declarative approach.

    Authors: We acknowledge that the performance claim rests on a single representative industrial case. In revision we will add a brief termination argument based on the finite size of the e-graph after ensugaring and will report timing results from at least one additional model. A comprehensive public benchmark suite is not feasible because many models are proprietary; we will therefore clarify the scope of the claimed improvement rather than overstate generality. revision: partial

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

full rationale

The paper describes concrete applications of the external egglog engine to ensugaring higher-order terms via equality saturation with a custom cost model and to multi-step constraint detection via datalog rules plus Henkin-like constants. These techniques are applied to JijModeling's lambda-calculus representation and are supported by reported performance improvements on specific cases, without any claimed result being defined in terms of itself or any prediction reducing to a fitted input by construction. The reference to prior EGRAPHS '25 work is an ordinary extension citation rather than a load-bearing self-referential justification, and the central claims remain independent of any closed logical loop within the paper itself.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach assumes standard properties of simply-typed lambda calculus and egglog's equality saturation semantics. No explicit free parameters are introduced; the custom cost model is the main engineering choice. Henkin-like constants are borrowed from logic rather than newly invented.

axioms (2)
  • domain assumption Equality saturation with a custom cost model will find minimal-rebinding rewrites that reconstruct comprehension syntax.
    Invoked when describing the ensugaring procedure for LaTeX output.
  • domain assumption Datalog-style rules plus Henkin-like constants can encode and propagate parametrized constraints and side conditions without external orchestration.
    Central to the constraint detection application.

pith-pipeline@v0.9.0 · 5810 in / 1417 out tokens · 47222 ms · 2026-05-20T00:56:07.369759+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages

  1. [1]

    Deep Space Network Scheduling , url =

  2. [2]

    数理最適化モデル記述ツール JijModeling のご紹介 , volume =

    石井 大海 and 清水 太朗 and 寺村 俊紀 , booktitle =. 数理最適化モデル記述ツール JijModeling のご紹介 , volume =. 2025 , bdsk-url-1 =. doi:10.11517/pjsai.JSAI2025.0_4P1OS17a05 , note =

  3. [3]

    数理最適化の実用化を支援するOMMXのデータ形式標準化の取り組み , volume =

    寺村 俊紀 and 清水 太郎 and 石井 大海 , booktitle =. 数理最適化の実用化を支援するOMMXのデータ形式標準化の取り組み , volume =. 2025 , bdsk-url-1 =. doi:10.11517/pjsai.JSAI2025.0_4P1OS17a04 , note =

  4. [4]

    Trees that Grow , url =

    Shayan Najd and Simon Peyton Jones , date-added =. Trees that Grow , url =. JUCS - Journal of Universal Computer Science , number =. 2017 , bdsk-url-1 =. doi:10.3217/jucs-023-01-0042 , eprint =

  5. [5]

    Slotted E-Graphs: First-Class Support for (Bound) Variables in E-Graphs , url =

    Schneider, Rudi and Rossel, Marcus and Shaikhha, Amir and Goens, Andr\'. Slotted E-Graphs: First-Class Support for (Bound) Variables in E-Graphs , url =. Proc. ACM Program. Lang. , keywords =. 2025 , bdsk-url-1 =. doi:10.1145/3729326 , issue_date =

  6. [6]

    and Leshchinskiy, Roman and Peyton Jones, Simon and Lippmeier, Ben , booktitle =

    Keller, Gabriele and Chakravarty, Manuel M.T. and Leshchinskiy, Roman and Peyton Jones, Simon and Lippmeier, Ben , booktitle =. Regular, shape-polymorphic, parallel arrays in Haskell , url =. 2010 , bdsk-url-1 =. doi:10.1145/1863543.1863582 , isbn =

  7. [7]

    Pure Type Systems Formalized , year =

    McKinna, James and Pollack, Robert , booktitle =. Pure Type Systems Formalized , year =

  8. [8]

    Plant Location with Non-Linear Costs -- Modeling Examples --

    T. Plant Location with Non-Linear Costs -- Modeling Examples --. 2019 , bdsk-url-1 =

  9. [9]

    PEP 484 -- Type Hints , url =

    Guido van Rossum and Jukka Lehtosalo and \`. PEP 484 -- Type Hints , url =. 2014 , bdsk-url-1 =

  10. [10]

    Optimizing Optimizations: Case Study on Detecting Specific Types of Mathematical Optimization Constraints with

    Hiromi Ishii and Taro Shimizu and Toshiki Teramura , date-added =. Optimizing Optimizations: Case Study on Detecting Specific Types of Mathematical Optimization Constraints with. 2025 , bdsk-url-1 =

  11. [11]

    and Kawaguci, Ming and Jhala, Ranjit , booktitle =

    Rondon, Patrick M. and Kawaguci, Ming and Jhala, Ranjit , booktitle =. Liquid types , url =. 2008 , bdsk-url-1 =. doi:10.1145/1375581.1375602 , isbn =

  12. [12]

    Rondon, Ming Kawaguci, and Ranjit Jhala

    Rondon, Patrick M. and Kawaguci, Ming and Jhala, Ranjit , date-added =. Liquid types , url =. SIGPLAN Not. , keywords =. 2008 , bdsk-url-1 =. doi:10.1145/1379022.1375602 , issn =

  13. [13]

    Xie, Ningning and Oliveira, Bruno C. d. S. , booktitle =. Let Arguments Go First , year =

  14. [14]

    2025 , bdsk-url-1 =

    Jij-Inc/sos1-detection-benchmarks at 4e9fe48da5795694aa0010f429ea8ec944860e9b , url =. 2025 , bdsk-url-1 =

  15. [15]

    Lucas, Frontiers in Physics2 (2014), 10.3389/fphy.2014.00005, arXiv:1302.5843

    Lucas, Andrew , date-added =. 2014 , bdsk-url-1 =. doi:10.3389/fphy.2014.00005 , issn =

  16. [16]

    and Elliott, C

    Pfenning, F. and Elliott, C. , booktitle =. Higher-order abstract syntax , url =. 1988 , bdsk-url-1 =. doi:10.1145/53990.54010 , isbn =

  17. [17]

    SIGPLAN Not

    Pfenning, F. and Elliott, C. , date-added =. Higher-order abstract syntax , url =. SIGPLAN Not. , month = jun, number =. 1988 , bdsk-url-1 =. doi:10.1145/960116.54010 , issn =

  18. [18]

    Designing metamaterials with quantum a nnealing and factorization machines

    Kitai, Koki and Guo, Jiang and Ju, Shenghong and Tanaka, Shu and Tsuda, Koji and Shiomi, Junichiro and Tamura, Ryo , date-added =. Designing metamaterials with quantum annealing and factorization machines , url =. Phys. Rev. Res. , month =. 2020 , bdsk-url-1 =. doi:10.1103/PhysRevResearch.2.013319 , issue =

  19. [19]

    Column (and Row) Generation Algorithms , url =

    Laurence Wolsey , booktitle =. Column (and Row) Generation Algorithms , url =. 2020 , bdsk-url-1 =. doi:https://doi.org/10.1002/9781119606475.ch11 , eprint =

  20. [20]

    Bidirectional Typing , url =

    Dunfield, Jana and Krishnaswami, Neel , date-added =. Bidirectional Typing , url =. ACM Comput. Surv. , keywords =. 2021 , bdsk-url-1 =. doi:10.1145/3450952 , issn =

  21. [21]

    2018 , bdsk-url-1 =

    Baptista, Ricardo and Poloczek, Matthias , booktitle =. 2018 , bdsk-url-1 =

  22. [22]

    A Tutorial Implementation of a Dependently Typed Lambda Calculus , volume =

    L\". A Tutorial Implementation of a Dependently Typed Lambda Calculus , volume =. Fundam. Inf. , month = apr, number =

  23. [23]

    2025 , bdsk-url-1 =

    Traveling Salesperson Problem with QUBO , url =. 2025 , bdsk-url-1 =

  24. [24]

    COINOR Computational Infrastructure for Operations Research , title =

    Santos, Haroldo G and Toffolo, T , date-added =. COINOR Computational Infrastructure for Operations Research , title =

  25. [25]

    The Locally Nameless Representation , url =

    Chargu. The Locally Nameless Representation , url =. Journal of Automated Reasoning , number =. 2012 , bdsk-url-1 =. doi:10.1007/s10817-011-9225-2 , id =

  26. [26]

    Better Together: Unifying Datalog and Equality Saturation , url =

    Zhang, Yihong and Wang, Yisu Remy and Flatt, Oliver and Cao, David and Zucker, Philip and Rosenthal, Eli and Tatlock, Zachary and Willsey, Max , date-added =. Better Together: Unifying Datalog and Equality Saturation , url =. Proc. ACM Program. Lang. , keywords =. 2023 , bdsk-url-1 =. doi:10.1145/3591239 , issue_date =

  27. [27]

    2024 , bdsk-url-1 =

    egg\_recursive, an S-expression-free alternative interface to egg , url =. 2024 , bdsk-url-1 =

  28. [28]

    2026 , bdsk-url-1 =

    What is OMMX? -- OMMX , url =. 2026 , bdsk-url-1 =

  29. [29]

    egg: Fast and Extensible Equality Saturation , url =

    Willsey, Max and Nandi, Chandrakana and Wang, Yisu Remy and Flatt, Oliver and Tatlock, Zachary and Panchekha, Pavel , date-added =. egg: Fast and Extensible Equality Saturation , url =. Proc. ACM Program. Lang. , keywords =. 2021 , bdsk-url-1 =. doi:10.1145/3434304 , issue_date =

  30. [30]

    2026 , bdsk-url-1 =

    jijmodeling ·. 2026 , bdsk-url-1 =

  31. [31]

    2026 , bdsk-url-1 =

    What is. 2026 , bdsk-url-1 =

  32. [32]

    2025 , bdsk-url-1 =

    ommx-pyscipopt-adapter ·. 2025 , bdsk-url-1 =

  33. [33]

    Mathematical Software

    Stephen Maher and Matthias Miltenberger and Jo. Mathematical Software. 2016 , bdsk-url-1 =. doi:10.1007/978-3-319-42432-3_37 , pages =

  34. [34]

    pytest-benchmark 5.1.0 documentation , url =

    Ionel Cristian M. pytest-benchmark 5.1.0 documentation , url =. 2024 , bdsk-url-1 =