
arxiv: 2606.19751 · v1 · pith:3V2GNUHZnew · submitted 2026-06-18 · 💻 cs.DB · math.OC

DeQL: A Decision Query Language for Prescriptive Analytics over Relational Data

Pith reviewed 2026-06-26 15:37 UTC · model grok-4.3

classification 💻 cs.DB math.OC
keywords decision query language · prescriptive analytics · SQL extension · optimization over relational data · CREATE CANDIDATES · DECIDE construct · relational queries

The pith

DeQL extends SQL with CREATE CANDIDATES and DECIDE to express and solve optimization problems over relational data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DeQL, an extension to SQL for writing decision queries that select the best actions from options stored in a database, subject to constraints. It aims to let users state what to optimize without specifying how, much as SQL separates a query from its execution plan. If the design can be realized, prescriptive analytics for tasks such as allocation and scheduling could be embedded directly in database queries. The design keeps the problem structure visible to the engine and requires every query to consume and produce relations. This matters because it could make optimization accessible within standard data workflows, without separate tools.

Core claim

DeQL adds CREATE CANDIDATES to define the space of options from relational sources and DECIDE to declare decision variables, constraints, and an objective. The language follows SQL principles by having the user declare the optimization problem while the engine chooses the solver, with every query consuming and producing relations and the problem structure remaining visible to the engine. The specification covers syntax, grammar, execution model, and examples across subset selection, allocation, assignment, scheduling, and multi-level aggregation, plus extensions for uncertainty and bounded solving.
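
To make the division of labor concrete, here is a minimal Python sketch of a subset-selection decision. It is not DeQL syntax, which this page does not reproduce. The functions `create_candidates` and `decide`, the `projects` rows, and the budget are all hypothetical stand-ins, and exhaustive search is used only because it is easy to verify on four rows.

```python
from itertools import combinations

# Hypothetical base relation: one row per option (names and numbers are invented).
projects = [
    {"id": "p1", "cost": 4, "value": 10},
    {"id": "p2", "cost": 3, "value": 7},
    {"id": "p3", "cost": 2, "value": 4},
    {"id": "p4", "cost": 5, "value": 9},
]


def create_candidates(rows, predicate):
    """Stand-in for CREATE CANDIDATES: derive the option space from a relation."""
    return [row for row in rows if predicate(row)]


def decide(candidates, budget):
    """Stand-in for DECIDE: one 0/1 variable per candidate, a budget
    constraint, and a 'maximize total value' objective.

    The caller only declares the problem; the search strategy below
    (exhaustive enumeration) is an internal choice of this toy engine.
    """
    best_value, best_subset = -1, ()
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            cost = sum(r["cost"] for r in subset)
            value = sum(r["value"] for r in subset)
            if cost <= budget and value > best_value:
                best_value, best_subset = value, subset
    chosen = {r["id"] for r in best_subset}
    # The result is again a relation: the input rows plus a decision column.
    return [dict(r, selected=(r["id"] in chosen)) for r in candidates]


candidates = create_candidates(projects, lambda r: r["cost"] <= 5)
decisions = decide(candidates, budget=7)
```

Because `decisions` has the same row shape as its input plus one column, it could in principle be filtered or joined like any other relation, which is the property the core claim depends on.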

What carries the argument

The CREATE CANDIDATES and DECIDE constructs, which define the option space from relational sources and declare the optimization objective over decision variables while preserving relational input-output semantics.

Load-bearing premise

The engine can choose how to solve the optimization problem declared in the query while every query consumes and produces relations and the problem structure remains visible to the engine.

What would settle it

An implementation attempt where no solver selection is possible without the user specifying solver details or where the output fails to be a relation.
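
The two failure conditions above can be phrased as checks against a toy engine. The sketch below is an assumption-laden illustration, not the paper's execution model: `Problem`, `choose_solver`, `run`, and `is_relation` are hypothetical names, and the `exact_limit` threshold is arbitrary. It shows an engine picking a solver from the declared problem alone and returning a result that passes a simple "is this a relation" test.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Problem:
    rows: tuple      # candidate relation: a tuple of dict rows
    weight: str      # column bounded by the capacity constraint (assumed > 0)
    gain: str        # column summed by the objective (maximized)
    capacity: float


def solve_exact(p):
    """Exhaustive search; returns the indices of the chosen rows."""
    best, best_gain = frozenset(), 0
    for k in range(1, len(p.rows) + 1):
        for combo in combinations(range(len(p.rows)), k):
            w = sum(p.rows[i][p.weight] for i in combo)
            g = sum(p.rows[i][p.gain] for i in combo)
            if w <= p.capacity and g > best_gain:
                best, best_gain = frozenset(combo), g
    return best


def solve_greedy(p):
    """Heuristic: take rows in descending gain/weight order while they fit."""
    order = sorted(
        range(len(p.rows)),
        key=lambda i: p.rows[i][p.gain] / p.rows[i][p.weight],
        reverse=True,
    )
    picked, used = set(), 0
    for i in order:
        w = p.rows[i][p.weight]
        if used + w <= p.capacity:
            picked.add(i)
            used += w
    return frozenset(picked)


def choose_solver(p, exact_limit=12):
    """The engine, not the user, picks the solver from the visible structure."""
    return solve_exact if len(p.rows) <= exact_limit else solve_greedy


def run(p):
    picked = choose_solver(p)(p)
    return [dict(row, selected=(i in picked)) for i, row in enumerate(p.rows)]


def is_relation(rows):
    """Every row is a dict and all rows share one set of column names."""
    return all(isinstance(r, dict) for r in rows) and len({frozenset(r) for r in rows}) <= 1


small = Problem(tuple({"id": i, "w": i + 1, "g": 2 * i + 1} for i in range(4)), "w", "g", 5)
large = Problem(tuple({"id": i, "w": 1 + i % 3, "g": 1 + i % 5} for i in range(20)), "w", "g", 10)
```

Neither `small` nor `large` names a solver. If a problem class existed for which `choose_solver` could not be written without extra user input, or for which `is_relation(run(p))` failed, that would be the kind of counterexample this section describes.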

original abstract

DeQL (Decision Query Language) extends SQL to express decision queries: given options drawn from relational data, constraints from policy, and a measurable objective, a DeQL query computes the best course of action. Two constructs carry the extension: CREATE CANDIDATES, which defines the space of options from relational sources, and DECIDE, which declares decision variables, named constraints, and an objective over them. The design follows SQL's principles: the user states what to optimize while the engine chooses how to solve it, every query consumes and produces relations, and the structure of a problem stays visible to the engine. This document specifies the language (its design principles, syntax, formal grammar, and execution model) with examples spanning subset selection, allocation, assignment, scheduling, and decisions at multiple levels of aggregation, and extensions for optimization under uncertainty, inline model scoring, and time- and quality-bounded solving. It is the first version of the specification; the language is under active development, and this version fixes the core constructs on which later revisions will build.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes DeQL, an extension of SQL for prescriptive analytics over relational data. It introduces CREATE CANDIDATES to define option spaces from relational sources and DECIDE to declare decision variables, named constraints, and objectives. The specification covers design principles (user declares optimization, engine selects solver, queries are relational, problem structure visible), syntax, formal grammar, execution model, examples across subset selection, allocation, assignment, scheduling, and multi-level aggregation, plus extensions for uncertainty, inline scoring, and bounded solving. It is presented as the first version of the language specification.

Significance. If the design principles can be realized with a correct implementation, DeQL could enable declarative optimization queries within database systems, reducing the gap between data management and decision-making tools. The relational input/output model and visibility of problem structure are potentially valuable for integration with existing query engines, but the current lack of formal semantics, implementation, or validation means the significance cannot yet be assessed beyond the specification itself.

major comments (2)
  1. [Abstract and Execution Model] The central claim that 'every query consumes and produces relations' and that 'the structure of a problem stays visible to the engine' is load-bearing for the SQL-like design principles, yet the manuscript provides only an informal execution model without formal semantics (e.g., no denotational or operational semantics defining how DECIDE results are mapped to output relations while preserving problem structure). This absence prevents verification of the claims.
  2. [Syntax, Grammar, and Examples sections] The specification asserts usability and correctness of the design for decision problems (e.g., scheduling and multi-level aggregation examples) but contains no proofs, type system, or safety properties for the grammar, nor any implementation or empirical validation. This directly affects the soundness of the language proposal as a whole.
minor comments (1)
  1. [Examples] The examples could include more precise syntax diagrams or pseudocode to illustrate how CREATE CANDIDATES and DECIDE interact with standard SQL clauses.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive review and for recognizing the potential value of DeQL if its design principles are realized. This manuscript presents the first version of a language specification; we address the major comments below by clarifying scope and committing to targeted revisions where they strengthen the specification without altering its nature as a design document.

point-by-point responses
  1. Referee: [Abstract and Execution Model] The central claim that 'every query consumes and produces relations' and that 'the structure of a problem stays visible to the engine' is load-bearing for the SQL-like design principles, yet the manuscript provides only an informal execution model without formal semantics (e.g., no denotational or operational semantics defining how DECIDE results are mapped to output relations while preserving problem structure). This absence prevents verification of the claims.

    Authors: We agree that the execution model is presented at an informal level and lacks a full formal semantics. The manuscript describes an operational model in which CREATE CANDIDATES produces a relation of options, DECIDE consumes that relation plus declared constraints/objectives, and the result is emitted as a relation of decisions; problem structure is preserved by exposing variables and constraints as first-class relational artifacts. To address the concern, the revised manuscript will expand the Execution Model section with a more precise operational semantics, including explicit rules for relation consumption/production and structure visibility. A complete denotational semantics is beyond the scope of this initial specification but can be noted as future work. revision: partial

  2. Referee: [Syntax, Grammar, and Examples sections] The specification asserts usability and correctness of the design for decision problems (e.g., scheduling and multi-level aggregation examples) but contains no proofs, type system, or safety properties for the grammar, nor any implementation or empirical validation. This directly affects the soundness of the language proposal as a whole.

    Authors: The paper supplies a formal grammar but does not include a type system, safety proofs, implementation, or empirical validation; these omissions are consistent with its role as an initial language specification rather than a verification or systems paper. The examples illustrate expressiveness across problem classes but do not constitute formal validation. In revision we will add a basic static type system for the grammar and informal discussion of safety properties derivable from it. Implementation and empirical evaluation remain outside the current scope and are planned for subsequent papers once the language stabilizes. revision: partial
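
The operational model described in the first response (a relation of options in, a relation of decisions out, with constraints exposed as relational artifacts) can be sketched in Python for a small assignment problem. This is one possible reading under stated assumptions, not the authors' semantics: `decide_assignment`, the `pairs` rows, and the constraint names are hypothetical, and the search assumes equal numbers of workers and tasks with a cost for every pair.

```python
from itertools import permutations

# Hypothetical candidate relation: one row per (worker, task) pair.
pairs = [
    {"worker": "ann", "task": "t1", "cost": 4},
    {"worker": "ann", "task": "t2", "cost": 2},
    {"worker": "bob", "task": "t1", "cost": 3},
    {"worker": "bob", "task": "t2", "cost": 5},
]


def decide_assignment(rows):
    """Minimize total cost with each worker and each task used exactly once.

    Returns two relations: the decisions (input rows plus an 'assigned'
    column) and the named constraints with their evaluated left-hand sides.
    """
    workers = sorted({r["worker"] for r in rows})
    tasks = sorted({r["task"] for r in rows})
    cost = {(r["worker"], r["task"]): r["cost"] for r in rows}

    # Exhaustive search over one-to-one assignments (fine for tiny inputs).
    best = min(
        permutations(tasks),
        key=lambda perm: sum(cost[w, t] for w, t in zip(workers, perm)),
    )
    chosen = set(zip(workers, best))

    decisions = [
        dict(r, assigned=int((r["worker"], r["task"]) in chosen)) for r in rows
    ]

    # Constraints as rows: a name, the evaluated left-hand side, and a status.
    constraints = []
    for column, keys, label in (
        ("worker", workers, "one_task_per_worker"),
        ("task", tasks, "one_worker_per_task"),
    ):
        for key in keys:
            lhs = sum(d["assigned"] for d in decisions if d[column] == key)
            constraints.append(
                {"name": f"{label}[{key}]", "lhs": lhs, "op": "=", "rhs": 1,
                 "satisfied": lhs == 1}
            )
    return decisions, constraints


decisions, constraints = decide_assignment(pairs)
```

Returning `constraints` as rows is one way to read "first-class relational artifacts": a later query could select unsatisfied or named constraints with ordinary relational operators. Whether DeQL exposes them this way is not stated on this page.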

standing simulated objections not resolved
  • Provision of a full implementation or empirical validation, as the manuscript is explicitly a language specification document without an accompanying system.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The document is a language specification defining syntax, grammar, and execution model for DeQL via CREATE CANDIDATES and DECIDE constructs. It contains no derivations, equations, predictions, fitted parameters, or first-principles results that could reduce to their inputs by construction. All claims are definitional statements about the language design, with no self-citation load-bearing steps, uniqueness theorems, or ansatzes invoked. The central premise (user declares optimization, engine selects solver, relational inputs/outputs) is a design principle stated directly, not derived from prior results within the paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The design rests on the assumption that an underlying solver can handle the declared problems while preserving relational semantics; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption The engine chooses how to solve the optimization while every query consumes and produces relations and the problem structure stays visible.
    Stated as a core design principle in the abstract.

pith-pipeline@v0.9.1-grok · 5726 in / 1153 out tokens · 23908 ms · 2026-06-26T15:37:20.494204+00:00 · methodology

