SQL Nulls and Two-Valued Logic

4Citations
Citations of this article
4Readers
Mendeley users who have this article in their library.
Get full text

Abstract

The design of SQL is based on a three-valued logic (3VL), rather than the familiar two-valued Boolean logic (2VL). In addition totrue andfalse, 3VL addsunknown to handle nulls. Viewed as indispensable for SQL expressiveness, it is often criticized for unintuitive behavior of queries and for being a source of programmer mistakes. We show that, contrary to the widely held view, SQL could have been designed based on 2VL, without any loss of expressiveness. Similarly to SQL's WHERE clause, which only keeps true tuples, we conflate false and unknown for conditions involving nulls to obtain an equally expressive 2VL-based version of SQL. This applies to the core of the 1999 SQL Standard. Queries written under the 2VL semantics can be efficiently translated into the 3VL SQL and thus executed on any existing RDBMS. We show that 2VL enables additional optimizations. To gauge its applicability, we establish criteria under which 2VL and 3VL semantics coincide, and analyze common benchmarks such as TPC-H and TPC-DS to show that most of their queries are such. For queries that behave differently under 2VL and 3VL, we undertake a user study to show a consistent preference for the 2VL semantics.

Cite

CITATION STYLE

APA

Libkin, L., & Peterfreund, L. (2023). SQL Nulls and Two-Valued Logic. In Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (pp. 11–20). Association for Computing Machinery. https://doi.org/10.1145/3584372.3588661

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free