Recognition: unknown
SEMA-SQL: Beyond Traditional Relational Querying with Large Language Models
Pith reviewed 2026-05-08 05:07 UTC · model grok-4.3
The pith
SEMA-SQL generates efficient hybrid queries that combine relational operations with LLM semantic functions to answer natural language questions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SEMA-SQL automates the generation of Hybrid Relational Algebra queries from natural language, optimizes them through cost-based transformations and UDF rewriting, and executes them efficiently with batching that reduces LLM invocations by an average of 93 percent in semantic joins, enabling reliable answers to questions that require both structured relational operations and semantic reasoning.
What carries the argument
Hybrid Relational Algebra (HRA), a declarative extension of relational algebra that incorporates LLM user-defined functions specified in natural language for semantic operations such as joins, mappings, and aggregations.
If this is right
- Hybrid queries can perform semantic joins and mappings across entities with inconsistent names or requiring external knowledge extraction.
- Intelligent batching reduces the number of LLM invocations by an average of 93 percent during execution of semantic joins.
- Natural language questions can be answered without users manually constructing pipelines of semantic operators.
- Query capabilities expand beyond text-to-SQL systems by directly embedding LLM semantic reasoning into the algebra.
Where Pith is reading between the lines
- The approach could support more natural database interfaces for users who lack SQL expertise but need semantic analysis.
- Batching strategies developed here might generalize to reduce costs in other LLM-augmented data processing pipelines.
- Integration with fine-tuned models specialized for database semantics could further lower error rates in production settings.
- Similar hybrid algebra designs could be explored for non-relational systems such as graph or document databases.
Load-bearing premise
Large language models can reliably execute semantic user-defined functions from natural language specifications via in-context learning without introducing error rates or hallucinations that invalidate the hybrid results.
What would settle it
Execution of benchmark queries where LLM semantic operations produce incorrect outputs that cause the overall hybrid results to deviate from ground truth beyond acceptable thresholds.
Figures
read the original abstract
Relational databases excel at structured data analysis, but real-world queries increasingly require capabilities beyond standard SQL, such as semantically matching entities across inconsistent names, extracting information not explicitly stored in schemas, and analyzing unstructured text. While text-to-SQL systems enable natural language querying, they remain limited to relational operations and cannot leverage the semantic reasoning capabilities of modern large language models (LLMs). Conversely, recent semantic operator systems extend relational algebra with LLM-powered operations (e.g., semantic joins, mappings, aggregations), but require users to manually construct complex query pipelines. To address this gap, we present SEMA-SQL, a system that automatically answers natural language questions by generating efficient queries that combine relational operations with LLM semantic reasoning. We formalize Hybrid Relational Algebra (HRA), a declarative abstraction unifying traditional relational operators with LLM user-defined functions (UDFs). The system automates three critical aspects: (1) query generation via in-context learning that produces HRA queries with precise natural language specifications for LLM UDFs, (2) query optimization via cost-based transformations and UDF rewriting, and (3) efficient execution algorithms that reduce LLM invocations by an average of 93% in semantic joins through intelligent batching. Extensive experiments with known benchmarks, and extensions thereof, demonstrate the significant query capability improvements possible with our design.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents SEMA-SQL, a system that answers natural language questions over relational databases by automatically generating queries in a Hybrid Relational Algebra (HRA). HRA extends traditional relational algebra with LLM-powered user-defined functions (UDFs) for semantic operations such as joins, mappings, and aggregations on unstructured or inconsistently named data. The system automates three aspects: (1) query generation via in-context learning to produce HRA expressions with natural-language specifications for the LLM UDFs, (2) cost-based query optimization and UDF rewriting, and (3) efficient execution algorithms that batch LLM calls to achieve an average 93% reduction in invocations for semantic joins. Experiments on standard benchmarks and extensions thereof are claimed to show substantial gains in query capability over both pure SQL and text-to-SQL baselines.
Significance. If the experimental claims hold under rigorous validation, the work would provide a useful declarative bridge between structured relational processing and LLM semantic reasoning, reducing the need for manual construction of hybrid pipelines. The reported 93% reduction via batching represents a concrete engineering contribution to execution efficiency. The formalization of HRA offers a clean abstraction that could serve as a foundation for future hybrid query languages. Credit is due for the focus on automation of generation, optimization, and execution rather than leaving these to the user.
major comments (2)
- [Experimental Evaluation] Experimental Evaluation section: the central claim of 'significant query capability improvements' and the 93% average reduction in LLM invocations rests on benchmark results, yet the section provides insufficient detail on error rates, hallucination frequency, or accuracy of the LLM UDF outputs against ground truth. Without these metrics and controls (e.g., comparison of hybrid results to manually verified answers), it is impossible to determine whether the reported capability gains are offset by unacceptable semantic errors.
- [Query Optimization] Query Optimization and Execution sections: the cost model used for deciding when to apply UDF rewriting and batching is not specified in sufficient detail. In particular, it is unclear how the optimizer estimates the latency, monetary cost, or failure probability of LLM UDF calls relative to relational operators; this estimation is load-bearing for the claim that the generated plans are both correct and efficient.
minor comments (2)
- [Related Work] The related-work discussion of prior semantic-operator systems could be expanded with explicit side-by-side comparison of query expressiveness and automation level.
- [Hybrid Relational Algebra] Notation for HRA operators is introduced without a compact summary table; adding one would improve readability when the algebra is referenced in later sections.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the paper accordingly to strengthen the presentation of our experimental results and the details of our cost model.
read point-by-point responses
-
Referee: [Experimental Evaluation] Experimental Evaluation section: the central claim of 'significant query capability improvements' and the 93% average reduction in LLM invocations rests on benchmark results, yet the section provides insufficient detail on error rates, hallucination frequency, or accuracy of the LLM UDF outputs against ground truth. Without these metrics and controls (e.g., comparison of hybrid results to manually verified answers), it is impossible to determine whether the reported capability gains are offset by unacceptable semantic errors.
Authors: We agree that additional quantitative metrics on accuracy, error rates, and hallucination frequency are needed to fully substantiate the capability claims. In the revised manuscript, we will expand the Experimental Evaluation section to include accuracy rates of LLM UDF outputs measured against ground truth, observed hallucination frequencies across benchmarks, and results from manual verification of sampled hybrid query results. These additions will allow readers to evaluate whether semantic errors offset the reported gains. revision: yes
-
Referee: [Query Optimization] Query Optimization and Execution sections: the cost model used for deciding when to apply UDF rewriting and batching is not specified in sufficient detail. In particular, it is unclear how the optimizer estimates the latency, monetary cost, or failure probability of LLM UDF calls relative to relational operators; this estimation is load-bearing for the claim that the generated plans are both correct and efficient.
Authors: We acknowledge that the cost model description requires more detail. We will revise the Query Optimization section to fully specify the cost model, including explicit formulas and methods for estimating latency, monetary costs, and failure probabilities of LLM UDF calls relative to traditional relational operators. We will also describe how these estimates are calibrated and used in plan selection. revision: yes
Circularity Check
No significant circularity detected
full rationale
The paper describes a system architecture (SEMA-SQL) that formalizes Hybrid Relational Algebra (HRA) and automates query generation, optimization, and execution with LLM UDFs. It reports empirical improvements on benchmarks without any mathematical derivation chain, fitted parameters presented as predictions, self-definitional constructs, or load-bearing self-citations. The abstract and high-level claims rest on a proposed design and experimental results that are independent of the inputs by construction. No equations or reductions to prior fitted quantities appear.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Hybrid Relational Algebra extends standard relational algebra by allowing LLM-powered user-defined functions whose semantics are specified in natural language.
invented entities (1)
-
Hybrid Relational Algebra (HRA)
no independent evidence
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.