Provenance for SQL Based on Abstract Interpretation: Value-less, but Worthwhile

Tobias Müller, Torsten Grust

Proceedings of the 41st Int’l Conference on Very Large Data Bases (VLDB 2015), Kohala Coast, Hawaii, USA, August 2015.


We demonstrate the derivation of fine-grained where- and why-provenance for a rich dialect of SQL that includes recursion, (correlated) subqueries, windows, grouping/aggregation, and the RDBMS’s library of built-in functions. The approach relies on ideas that originate in the programming language community—program slicing and abstract interpretation, in particular. A two-stage process first records a query’s control flow decisions and locations of data access before it derives provenance without consultation of the actual data values (rendering the method largely “value-less”). We will bring an interactive demonstrator that uses this provenance information to make input/output dependencies in real-world SQL queries tangible.
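The two-stage idea can be pictured with a toy Python sketch. It is not the authors' implementation, which targets full SQL; it only mimics the split for a single filter query, `SELECT name FROM t WHERE qty > threshold`. All names here (`Cell`, `stage1_record`, `stage2_derive`, the sample table) are hypothetical. The sketch assumes that where-provenance marks the input cells copied into an output cell and that why-provenance marks the input cells inspected to decide that the output row exists.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """Location of one input cell: row index and column name (hypothetical)."""
    row: int
    col: str


# Hypothetical input table for: SELECT name FROM t WHERE qty > threshold
table = [
    {"name": "a", "qty": 5},
    {"name": "b", "qty": 1},
    {"name": "c", "qty": 7},
]


def stage1_record(rows, threshold):
    """Stage 1: run the query on real values, but log only the control flow
    decision per row and the locations the predicate read."""
    log = []
    for i, r in enumerate(rows):
        taken = r["qty"] > threshold  # the only place values are consulted
        log.append({"row": i, "taken": taken, "pred_reads": [Cell(i, "qty")]})
    return log


def stage2_derive(log):
    """Stage 2: replay the log without touching any data values and emit
    where-/why-provenance for each output row."""
    provenance = []
    for entry in log:
        if entry["taken"]:
            provenance.append({
                "where": {Cell(entry["row"], "name")},  # cell copied to output
                "why": set(entry["pred_reads"]),        # cells that decided it
            })
    return provenance


provenance = stage2_derive(stage1_record(table, threshold=2))
for out_row, p in enumerate(provenance):
    print(out_row, sorted(p["where"]), sorted(p["why"]))
```

Note that `stage2_derive` never receives the table: it sees only logged decisions and locations, which is the sense in which the second stage of this sketch is value-less.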