Recognition: unknown
Multiset semantics in SPARQL, Relational Algebra and Datalog
Pith reviewed 2026-05-09 19:04 UTC · model grok-4.3
The pith
SPARQL patterns using AND, UNION, FILTER, EXCEPT and SELECT are expressively equivalent to multiset Datalog and relational algebra.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We prove that these three formalisms are expressively equivalent under multiset semantics. The proof proceeds by exhibiting translations between SPARQL patterns, multiset Datalog programs and multiset relational algebra expressions that preserve the multiplicity of every result tuple for the listed operators.
What carries the argument
The explicit translations that map SPARQL patterns to multiset Datalog rules and to multiset relational algebra expressions while preserving tuple multiplicities.
If this is right
- Any SPARQL query in the fragment can be rewritten into an equivalent multiset relational algebra expression.
- Optimizations and equivalences known for multiset relational algebra apply unchanged to the corresponding SPARQL patterns.
- Non-recursive Datalog with safe negation can serve as an intermediate representation for evaluating these SPARQL queries.
- The algebraic structure of multiset SPARQL is fully characterized by the operators of the aligned Datalog and relational algebra.
Where Pith is reading between the lines
- The equivalence supplies a route for implementing SPARQL engines on top of existing Datalog or relational algebra engines that already handle multisets.
- Similar translations might be attempted for additional SPARQL operators such as OPTIONAL or property paths if suitable multiset extensions are defined.
- The result clarifies why certain duplicate-handling behaviors in SPARQL match the bag semantics already present in SQL.
Load-bearing premise
The claimed equivalence is limited to the listed SPARQL operators, non-recursive Datalog with safe negation, and the particular multiset operations defined for relational algebra in the paper.
What would settle it
A SPARQL pattern using only AND, UNION, FILTER, EXCEPT and SELECT whose result multiset cannot be produced by any non-recursive safe-negation Datalog program or by any expression in the paper's multiset relational algebra.
Figures
read the original abstract
The paper analyzes and characterizes the algebraic and logical structure of the multiset semantics for SPARQL patterns involving AND, UNION, FILTER, EXCEPT, and SELECT. To do this, we align SPARQL with two well-established query languages: Datalog and Relational Algebra. Specifically, we study (i) a version of non-recursive Datalog with safe negation extended to support multisets, and (ii) a multiset relational algebra comprising projection, selection, natural join, arithmetic union, and except. We prove that these three formalisms are expressively equivalent under multiset semantics.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper analyzes multiset semantics for a SPARQL fragment consisting of AND, UNION, FILTER, EXCEPT, and SELECT patterns. It introduces a non-recursive Datalog variant with safe negation extended to bags and a multiset relational algebra with projection, selection, natural join, arithmetic union, and except. The core result is a proof of expressive equivalence among these three formalisms under multiset semantics, via translations that preserve bag cardinalities.
Significance. If the translations hold, the result is significant for database theory and semantic web query processing. It supplies a precise correspondence that lets optimization and equivalence techniques transfer between SPARQL, Datalog, and relational algebra while respecting duplicate semantics. The explicit scoping to non-recursive safe-negation Datalog and the listed operators makes the claim falsifiable and practically relevant for implementation and verification tasks.
minor comments (2)
- The paper would benefit from an explicit table or diagram summarizing the operator correspondences and translation rules across the three formalisms.
- Notation for multiset cardinality and bag union could be introduced once in a preliminary section and then used consistently to reduce reader effort when following the inductive proofs.
Simulated Author's Rebuttal
We thank the referee for the thorough summary and positive recommendation to accept the paper. We are pleased that the core contribution—the expressive equivalence of the SPARQL fragment, multiset Datalog with safe negation, and the specified multiset relational algebra under bag semantics—is recognized as significant for database theory and semantic web query processing.
Circularity Check
No significant circularity detected in the equivalence proof
full rationale
The paper's central claim is a direct proof of expressive equivalence among three independently defined formalisms (SPARQL fragment with AND/UNION/FILTER/EXCEPT/SELECT under multiset semantics, non-recursive safe-negation multiset Datalog, and the specified multiset relational algebra operators). The abstract and scope explicitly condition the result on these precise definitions and non-recursion, with no internal reduction of any prediction or uniqueness claim to a self-citation, fitted parameter, or definitional tautology. The derivation proceeds via standard inductive translations that preserve bag cardinalities, which is self-contained against external benchmarks and does not rely on load-bearing self-citations or ansatzes smuggled from prior work. This is the expected outcome for a well-scoped equivalence proof.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Standard multiset semantics for relational operations (projection, selection, join, union, except)
- domain assumption Safe negation and non-recursion in Datalog
Reference graph
Works this paper leans on
-
[1]
1995.Foundations of Databases
Serge Abiteboul, Richard Hull, and Victor Vianu. 1995.Foundations of Databases. Addison-Wesley
1995
-
[2]
Afrati, Matthew Damigos, and Manolis Gergatsoulis
Foto N. Afrati, Matthew Damigos, and Manolis Gergatsoulis. 2010. Query Containment Under Bag and Bag-set Semantics.Inform. Process. Lett.110, 10 (2010), 360–369. https://doi.org/10.1016/j.ipl.2010.02.017
-
[3]
Joseph Albert. 1991. Algebraic Properties of Bag Data Types. InProc. of the Int. Conference on Very Large Data Bases (VLDB). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 211–219. https://dl.acm.org/doi/10.5555/645917. 672310
-
[4]
Renzo Angles, Georg Gottlob, Aleksandar Pavlović, Reinhard Pichler, and Emanuel Sallinger. 2023. SparqLog: A System for Efficient Evaluation of SPARQL 1.1 Queries via Datalog.Proc. VLDB Endow.16, 13 (2023), 4240–4253. https://doi.org/10.14778/3625054.3625061
-
[5]
Renzo Angles and Claudio Gutiérrez. 2008. The Expressive Power of SPARQL. InProc. of the International Semantic Web Conference (ISWC) (LNCS, Vol. 5318). Springer, 114–129. https://doi.org/10.1007/978-3-540-88564-1_8
-
[6]
Renzo Angles and Claudio Gutiérrez. 2016. The Multiset Semantics of SPARQL Patterns. In15th International Semantic Web Conference (ISWC) (LNCS, Vol. 9981). Springer, 20–36. https://doi.org/10.1007/978-3-319-46523-4_2
-
[7]
Renzo Angles and Claudio Gutierrez. 2016. Negation in SPARQL. InAlberto Mendelzon Int. Workshop on Foundations of Data Management (AMW). https://ceur-ws.org/Vol-1644/paper11.pdf
2016
-
[8]
Luigi Bellomarini, Emanuel Sallinger, and Georg Gottlob. 2018. The Vadalog system: datalog-based reasoning for knowledge graphs.Proc. VLDB Endow.11, 9 (2018), 975–987. https://doi.org/10.14778/3213880.3213888
-
[9]
Leopoldo Bertossi, Georg Gottlob, and Reinhard Pichler. 2019. Datalog: Bag Semantics via Set Semantics. InInternational Conference on Database Theory (ICDT), Vol. 127. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 16:1–16:19. https://doi.org/10.4230/LIPIcs.ICDT.2019.16
-
[10]
Val Breazu-Tannen and Ramesh Subrahmanyam. 1991. Logical and computational aspects of programming with sets/bags/lists. InAutomata, Languages and Programming. Springer Berlin Heidelberg, Berlin, Heidelberg, 60–75
1991
-
[11]
Artem Chebotko, Shiyong Lu, and Farshad Fotouhi. 2009. Semantics preserving SPARQL-to-SQL translation.Data & Knowledge Engineering68, 10 (2009), 973–1000. https://doi.org/10.1016/j.datak.2009.04.001
-
[12]
Sara Cohen. 2009. Equivalence of Queries That Are Sensitive to Multiplicities.The VLDB Journal18, 3 (June 2009), 765–785. https://doi.org/10.1007/s00778-008-0122-1
-
[13]
Latha S. Colby and Leonid Libkin. 1997. Tractable iteration mechanisms for bag languages. InInternational Conferencia on Database Theory (ICDT) (LNCS, Vol. 1186). Springer Berlin Heidelberg, Berlin, Heidelberg, 461–475. https://doi.org/ 10.1007/3-540-62222-5_64 38 Angles et al
-
[14]
Marco Console, Paolo Guagliardo, and Leonid Libkin. 2022. Fragments of bag relational algebra: Expressiveness and certain answers.Information Systems(2022). https://doi.org/10.1016/j.is.2020.101604
-
[15]
2005.A relational algebra for SPARQL
Richard Cyganiak. 2005.A relational algebra for SPARQL. Technical Report HPL-2005-170. HP Labs
2005
-
[16]
C. J. Date. 2006.Date on Database: Writings 2000-2006. APress
2006
-
[17]
Umeshwar Dayal, Nathan Goodman, and Randy H. Katz. 1982. An Extended Relational Algebra with Control over Duplicate Elimination. InProc. of the Symposium on Principles of Database Systems (PODS). ACM, 117–123. https://doi.org/10.1145/588111.588132
-
[18]
Floris Geerts, Thomas Unger, Grigoris Karvounarakis, Irini Fundulaki, and Vassilis Christophides. 2016. Algebraic Structures for Capturing the Provenance of SPARQL Queries.J. ACM63, 1 (2016), 63 pages. https://doi.org/10.1145/ 2810037
2016
-
[19]
Todd J. Green. 2009. Bag Semantics. InEncyclopedia of Database Systems. 201–206
2009
-
[20]
Stéphane Grumbach, Leonid Libkin, Tova Milo, and Limsoon Wong. 1996. Query languages for bags: expressive power and complexity.SIGACT News27, 2 (1996), 30–44. https://doi.org/10.1145/235767.235770
-
[21]
Stéphane Grumbach and Tova Milo. 1996. Towards Tractable Algebras for Bags.J. Comput. System Sci.52, 3 (1996), 570–588. https://doi.org/10.1006/jcss.1996.0042
-
[22]
Paolo Guagliardo and Leonid Libkin. 2017. A Formal Semantics of SQL Queries, Its Validation, and Applications.Proc. VLDB Endow.11, 1 (2017), 27–39. https://doi.org/10.14778/3151113.3151116
-
[23]
Steve Harris and Andy Seaborne. 2013. SPARQL 1.1 Query Language - W3C Recommendation. http://www.w3.org/TR/2013/REC-sparql11-query-20130321/
2013
-
[24]
12 Jean-Baptiste Courtois and Sylvain Schmitz
A. Hernich and P. G. Kolaitis. 2017. Foundations of information integration under bag semantics. In32nd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS). 1–12. https://doi.org/10.1109/LICS.2017.8005104
-
[25]
2020.The Problem of Incomplete Data in SPARQL
Daniel Hernández. 2020.The Problem of Incomplete Data in SPARQL. Ph.D. dissertation. Universidad de Chile - Faculty of Physical and Mathematical Sciences, Santiago, Chile. https://repositorio.uchile.cl/handle/2250/178033
2020
-
[26]
Aidan Hogan, Marcelo Arenas, Alejandro Mallea, and Axel Polleres. 2014. Everything You Always Wanted to Know About Blank Nodes.Journal of Web Semantics27, 1 (2014). https://doi.org/10.1016/j.websem.2014.06.004
-
[27]
Kostylev, and Bernardo Cuenca Grau
Mark Kaminski, Egor V. Kostylev, and Bernardo Cuenca Grau. 2016. Semantics and Expressive Power of Subqueries and Aggregates in SPARQL 1.1. InProc. of the International Conference on World Wide Web. 227–238
2016
-
[28]
Kostylev, and Bernardo Cuenca Grau
Mark Kaminski, Egor V. Kostylev, and Bernardo Cuenca Grau. 2016. Semantics and Expressive Power of Subqueries and Aggregates in SPARQL 1.1.. InProceedings of the Int. Conference on World Wide Web (WWW). ACM, 227–238. https://doi.org/10.1145/2872427.2883022
-
[29]
Aviel Klausner and Nathan Goodman. 1985. Multirelations - Semantics and languages. InProc. of Int. Conference on Very Large Data Bases (VLDB). VLDB Endowment, 251–258. https://dl.acm.org/doi/10.5555/1286760.1286783
-
[30]
Roman Kontchakov and Egor V. Kostylev. 2016. On Expressibility of Non-Monotone Operators in SPARQL. InInt. Conference on the Principles of Knowledge Representation and Reasoning. AAAI Press, 369–378. https://dl.acm.org/doi/ 10.5555/3032027.3032071
-
[31]
G. Lamperti, M. Melchiori, and M. Zanella. 2001. On Multisets in Database Systems. InProceedings of the Workshop on Multiset Processing. 147–216. https://dl.acm.org/doi/10.5555/647269.721839
-
[32]
Leonid Libkin and Limsoon Wong. 1994. Some Properties of Query Languages for Bags. InProc. of the Int. Workshop on Database Programming Languages (DBPL) - Object Models and Languages. 97–114. https://doi.org/10.1007/978-1- 4471-3564-7_7
-
[33]
Leonid Libkin and Limsoon Wong. 1997. Query languages for bags and aggregate functions.J. Comput. System Sci.55, 2 (1997), 241–272. https://doi.org/10.1006/jcss.1997.1523
-
[34]
J. W. Lloyd. 1998.Programming with multisets. Technical Report. University of Bristol
1998
-
[35]
Melton and A
J. Melton and A. R. Simon. 2002.SQL:1999. Understanding Relational Language Components. Morgan Kaufmann Publ
2002
-
[36]
I. S. Mumick, S. J. Finkelstein, Hamid Pirahesh, and Raghu Ramakrishnan. 1990. Magic is Relevant.SIGMOD Rec.19, 2 (1990), 247–258. https://doi.org/10.1145/93605.98734
-
[37]
Inderpal Singh Mumick, Hamid Pirahesh, and Raghu Ramakrishnan. 1990. The Magic of Duplicates and Aggregates. InProc. of the International Conference on Very Large Data Bases. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 264–277. https://dl.acm.org/doi/10.5555/94362.94426
-
[38]
2006.Semantics of SPARQL
Jorge Pérez, Marcelo Arenas, and Claudio Gutierrez. 2006.Semantics of SPARQL. Technical Report TR/DCC-2006-17. Department of Computer Science, University of Chile
2006
-
[39]
A. Polleres. 2007. From SPARQL to Rules (and back). InProceedings of the 16th Int. World Wide Web Conference (WWW). ACM, 787–796. https://doi.org/10.1145/1242572.1242679
-
[40]
Axel Polleres and Johannes Peter Wallner. 2013. On the relation between SPARQL1.1 and Answer Set Programming. Journal of Applied Non-Classical Logics23, 1-2 (2013), 159–212. https://doi.org/10.1080/11663081.2013.798992
-
[41]
Eric Prud’hommeaux and Andy Seaborne. 2008. SPARQL Query Language for RDF. W3C Recommendation. http://www.w3.org/TR/2008/REC-115-sparql-query-20080115/. The multiset semantics of SPARQL patterns 39
2008
-
[42]
Wilmer Ricciotti and James Cheney. 2019. Mixing Set and Bag Semantics. InProc. 17th ACM SIGPLAN International Symposium on Database Programming Languages (DBPL)(Phoenix, AZ, USA). ACM, New York, NY, USA, 70–73. https://doi.org/10.1145/3315507.3330202
-
[43]
Simon Schenk. 2007. A SPARQL Semantics Based on Datalog. InAnnual German Conference on Advances in Artificial Intelligence, Vol. 4667. 160–174. https://doi.org/10.1007/978-3-540-74565-5_14
-
[44]
Michael Schmidt, Michael Meier, and Georg Lausen. 2010. Foundations of SPARQL query optimization. InProc. of the Int. Conference on Database Theory. ACM, 4–33. https://doi.org/10.1145/1804669.1804675
-
[45]
Xiaowang Zhang and Jan Van den Bussche. 2014. On the primitivity of operators in SPARQL.Inf. Process. Lett.114, 9 (2014), 480–485. https://doi.org/10.1016/j.ipl.2014.03.014 A Variable renaming in SPARQL This appendix section defines function subs(·,·), which renames SPARQL variables. This function is used to simulate the MRA operator renaming 𝜌𝐴/𝐵 (see Ta...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.