How, Where, and Why Data Provenance Improves Query Debugging -- A Visual Demonstration of Fine-Grained Provenance Analysis for SQL

Tobias Müller, Pascal Engel

Proceedings of the 38th IEEE Int'l Conference on Data Engineering (ICDE 2022), Kuala Lumpur, Malaysia, May 2022. (to be published)

Data provenance is meta-information about the origin and processing history of data. We demonstrate the provenance analysis of SQL queries and use it for query debugging. How-provenance determines which query expressions have been relevant for evaluating selected pieces of output data. Likewise, Where- and Why-provenance determine relevant pieces of input data. The combined provenance notions can be explored visually and interactively. We support a feature-rich SQL dialect with correlated subqueries and focus on bag semantics. Our fine-grained provenance analysis derives individual data provenance for table cells and SQL expressions.
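
The distinction between Where- and Why-provenance at the level of table cells can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simplified reading in which Where-provenance records the input cells whose values flow into an output cell, and Why-provenance records the input cells inspected to decide whether the output row exists. The table `EMP`, the type `Cell`, the function `query_with_provenance`, and the hard-coded query `SELECT name FROM emp WHERE salary > 50` are all hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """Identifies one input table cell (hypothetical helper type)."""
    row: int      # index of the input row
    column: str   # column name


# Hypothetical input table; rows 0 and 2 yield duplicate output values,
# which bag semantics keeps as two separate output rows.
EMP = [
    {"name": "Ada", "salary": 70},
    {"name": "Bob", "salary": 40},
    {"name": "Ada", "salary": 90},
]


def query_with_provenance(table):
    """Evaluate SELECT name FROM emp WHERE salary > 50 under bag semantics.

    Returns one (value, where, why) triple per output cell, where
    `where` and `why` are sets of input cells (assumed simplified notions).
    """
    out = []
    for i, row in enumerate(table):
        if row["salary"] > 50:               # predicate inspects salary: Why
            out.append((
                row["name"],                 # value copied from name: Where
                {Cell(i, "name")},
                {Cell(i, "salary")},
            ))
    return out


result = query_with_provenance(EMP)
for value, where, why in result:
    print(value, sorted(where), sorted(why))
```

Both output rows carry the value "Ada", yet each keeps its own provenance sets pointing at a different input row. Tracking provenance per output cell rather than per distinct value is what allows this under bag semantics.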