Database Research Group

WSI – Database Systems Research Group

Advanced SQL


News
  • Sep 11, 2017 — Hallo Zusammen,

    die Wiederholungsprüfungen zur Veranstaltung der ASQL (SS17) werden am Vormittag des Mittwoch, den 11.10.2017 statt finden. Die Prüfungen werden mündlich sein. Bitte meldet euch verbindlich bis spätestens 29.09.2017 bei mir zur Prüfung mit einer kurzen E-Mail an. Die Vergabe konkreter Termine erfolgt im Anschluss. Eine separate Anmeldung im Campus System ist nicht erforderlich.

    Zur Teilnahme zugelassen sind all diejenigen, welche die Hauptklausur nicht bestanden haben oder dieser entschuldigt fern geblieben sind.

    Viele Grüße,

    -- Chris — Christian Duta

  • Apr 17, 2017 — First lecture on Tuesday, April 18, 10:15 in room C215. First tutorial on Thursday, April 27, when we have collected the first batch of interesting material. — Torsten Grust

  • Apr 20, 2017 — The forum is now accessible via this link. We will use it, to communicate details and discussions about the lecture, weekly assignments and other related topics. — Christian Duta

  • Jul 18, 2017 — Exam Details

    Setup

    • The exam will held be at 2017/07/25 10 c.t. in Room A301.
    • Exam duration is 90 minutes.
    • Exam assignments are given in English.
    • Answers can be written in English or German.
    • Please arrive about 15 minutes early, so everyone can be seated ahead of time.

    Requirements

    • You need at least 66% of the total exercise points to participate in the exam.
    • For exercise points beyond the 66% mark we will award bonus points in the exam. With 100% of the total exercise points, you'll receive 25% of the exam points as a bonus.
    • You will receive bonus points only if your exam is graded with a 4.0 or higher. Failing the exam means failing the course, no matter the bonus points.

    Materials

    • Your student identity card.
    • Only pens are allowed as writing tools.
    • You are allowed to take one Din A4 paper with notes (handwritten or otherwise) with you.

    For any further questions, please feel free to post them in the forum.

    Christian Duta


Relational database systems provide efficient storage for large volumes of data. This course highlights that these systems also provide a versatile and expressive data processing language: SQL. There's much more to SQL than the plain SELECT...FROM...WHERE clause and we will see that a surprisingly large number of algorithmic problems can be tackled using SQL. Moving computation close to the data is key to unlock the true potential of database systems.

Selected course topics include

  • common table expressions (WITH),
  • non-standard data types (arrays, geometric data, JSON, XML),
  • table functions,
  • window functions,
  • recursive computation,
  • user-defined SQL procedures (PL/SQL),
  • index design for complex SQL queries,
  • off-beat SQL applications, useful SQL idioms, and fun SQL puzzles.

The course will only provide a brief introduction to the fundamental aspects of relational database systems. We expect you to have basic SQL skills (through prior attendance of Datenbanksysteme I or personal projects, for example) or be willing to acquire such skills.

Join us for a boatload of SQL fun! We will provide more course details once the semester approaches.


Slides
NrChapterDownload
1Welcome
  • Administrativa
  • Fundamentals of the tabular (relational) data model
pdf
2The Core of SQL
  • A tour of core SQL constructs
  • Query conventions in this course

[ updated: April 25, 2017, 13:54 ]

pdf
3Standard and Non-Standard Data Types
  • Type casts (in particular from type text)
  • The variety of types of values that may be stored in table cells:
    • text and numeric data
    • ranges
    • user-defined enumerated types
    • dates, times, timestamps, and intervals
    • bit strings
    • binary large objects (BLOBs)
    • geometric objects
    • JSON and XML documents
    • sequences

[ updated: May 9, 2017, 13:41 ]

pdf
4Arrays and User-Defined Functions
  • The type τ[] (or τ array)
  • Computation over arrays
  • Array unnesting and aggregation
  • Table-generating functions
  • User-defined SQL functions (UDFs)
  • LATERAL (sideways row variable passing)
  • Sample problem (Finding Seats)

[ updated: May 22, 2017, 22:27 ]

pdf
5Window Functions
  • Window Frames
  • ROWS and RANGE frames
  • WINDOW clause
  • Partitioning
  • Sample problem (Weekend Weather)
  • Sample problem (Visibility in the Hills)
  • Scans
  • Window Functions
  • Sessionization
  • Run-Length Encoding
  • Landscape Features
  • Numbering and Ranking Rows
  • Consecutive Ranges
  • Piecewise Linear Approximations

[ updated June 20, 2017, 12:40 ]

pdf
6Recursion
  • Expressive power of SQL and recursion
  • WITH RECURSIVE
  • Self-referential queries
  • Set vs. bag semantics (UNION vs. UNION ALL)
  • Home-made generate_series()
  • Tree traversals
  • (Non-)termination
  • Connected components in a graph
  • Recursive text processing (regular expression matching)
  • Bag semantics and termination
  • Recursive array processing (Sudoku)
  • Loose index scans
  • K-Means clustering
  • Marching squares (control flow to data flow)
  • Cellular automata (Game of Life, Liquid flow)
  • Parsing context-free grammars (CYK algorithm)
pdf
7Procedural SQL
  • PL/SQL = Scripting + SQL
  • Saving turnarounds
  • Blocks, Statements, Expressions
  • Invoking Queries, populating tables
  • Implementation of a spreadsheet core (JSON-based formula representation, dependency extraction, topological sorting, recursive formula evaluation)
pdf
Additional material (code, data)
NrFileDownload
1Handout

Instructions for participating in the "Advanced SQL" exercises.

pdf
2core-of-SQL.sql

A collection of example SQL queries, formulated using the core constructs of SQL.

Usage:

  • Cut & paste individual table definition and queries into your psql PostgreSQL shell (just like in the lecture), or
  • use psql -d ‹database› -f core-of-SQL.sql to execute the entire file at once (probably less useful).
sql
3data-types.sql

Examples that demonstrate the use of a variety of SQL data types.

Usage:

  • Cut & paste individual table definition and queries into your psql PostgreSQL shell (just like in the lecture), or
  • use psql -d ‹database› -f data-types.sql to execute the entire file at once (probably less useful).
sql
4glados.sql

Demonstrates the use of BLOBs (binary large objects) stored in columns of type bytea.

Notes:

  • User-defined function read_blob() opens and reads the specified file (you need to provide an absolute path to the file) and returns a bytea value.

  • Adapt the blob_path variable to point a directory (absolute path) where your multi-media objects (e.g. audio/video/image files) are located.

  • The .wav files with GlaDOS quotes are not included in this file. You'll find countless examples for download in the Portal Wiki.

  • The final query in the file extracts one selected GlaDOS quote and emits it in base64 encoding. Use a base64 decoder to obtain the original multi-media object again.

sql
5scanner.sql

Demonstrates the use of geometric objects and operations on these: scan the contour of a 2D polygon.

Notes:

  • Column shapes of common table expression shapes can contain arbitrary polygons.

  • Variable shape (see \set shape ...) defines the shape id in table shapes whose contour will be output (see below).

  • Adapt variable csv to point to an absolute file path where the contour will be saved in the textual CSV format.

  • If you have GnuPlot installed, run gnuplot scanner.gpl to render the contour in the CSV file in ASCII format. For scanner.gpl, see below.

sql
6scanner.gpl

Companion GnuPlot file to be used with scanner.sql (see above). Adapt the paths in this file to match the variable csv in scanner.sql.

gpl
7arrays.sql
  • A representation of labeled trees and forests using PostgreSQL arrays of type int[] and text[].

sql
8table-functions.sql
  • Table-generating functions (generateseries(), regexpmatches(), ...)

  • User-defined SQL functions (UDFs)

  • LATERAL queries

sql
9finding-seats.sql
  • ACM ICPC task Finding Seats, formulated in SQL
  • "Parses" a string into a rectangular grid using stringtoarray(), unnest() WITH ORDINALITY
  • Uses arrays, table-generating functions, LATERAL to implement a generate-and-test approacch

(Original ACM ICPC task description)

sql
10window-functions.sql
  • SQL Window functions
  • Window frames (ROWS vs RANGE frames, BETWEEN ‹n› PRECEDING AND ‹m› FOLLOWING, ORDER BY, PARTITION BY)
  • WINDOW clause
  • Weekend weather
  • Lexical analysis (nesting depth of expressions)
  • Window functions LAG, LEAD
  • Ascent/descent in the hills
  • Window functions FIRST_VALUE, LAST_VALUE, NTH_VALUE
  • Ranking and numbering inside a partition (ROW_NUMBER, RANK, NTILE, ...)
  • Piece-wise linear approximation of a function

[ updated June 20, 2017, 12:39 ]

sql
11visible.sql

Uses a MAX scan over view angles to determine visibility in a hilly environment.

sql
12visible-left-right.sql

Visibility in a hilly environment while looking left and right (partitioned MAX scan).

sql
13sessionize.sql

Detect streaks of uninterrupted activity (= sessions) in an activity log file.

sql
14run-length-encoding.sql

Uses run-length encoding to compress a pixel map. (Demonstrates row-wise encoding — decoding to be supplied by the students.)

sql
15peaks-valleys.sql

Detect landscape features (peaks, valleys) using patterns of slopes (⭧⭢⭨) in the vicinity. Uses window functions and in-frame row access (FIRST_VALUE, LAST_VALUE, NTTH_VALUE).

sql
16consecutive-ranges.sql

Detect consecutive ranges in an ordered sequence of values (e.g. citations). Uses window function ROW_NUMBER.

sql
17mountain-zones.sql

Classify regions in a mountainous landscape by altitude (vegetation zones, low-/highlands). Uses window functions PERCENT_RANK and NTILE.

sql
18recursion.sql

Demonstrate the use of recursion via WITH RECURSIVE (variants UNION and UNION ALL). Re-implements table function generate_series().

sql
19path-to-root.sql

Given a starting node (e.g., the node named f), find all nodes on the path to the root. Uses array-encoded trees and WITH RECURSIVE.

sql
20connected-components.sql

Find the connected components of an undirected graph. Performs walks from all nodes in the graphs using WITH RECURSIVE.

sql
21regexp.sql

Perform bulk regular expression matching using a deterministic finite state machine. Uses WITH RECURSIVE to recursively process strings.

sql
22sudoku.sql

Finds all solutions for a given incomplete Sudoku puzzle (brute force solver, takes about 500 ms). Uses WITH RECURSIVE to recursively process arrays.

sql
23loose-index-scan.sql

Uses WITH RECURSIVE to implement loose index scans (very efficiently evaluate queries of the form SELECT DISTINCT t.x FROM t).

sql
24k-means.sql

Uses WITH RECURSIVE to implement K-Means clustering via Lloyd's algorithm (with Forgy initialization).

sql
25marching-squares.sql

Uses WITH RECURSIVE to implement the Marching Squares algorithm. Demonstrates the tabular encoding of control flow decisions (control flow → data flow).

sql
26game-of-life.sql

Uses WITH RECURSIVE to implement Conway's Game of Life, a cellular automaton. Execute via

psql -Xq -d ‹database› --set=N=‹N› -f game-of-life.sql

(and uncomment the \set N 40 command inside the file) to flexibly choose the number ‹N› of cell generations to compute).

sql
27fluid-1D.sql

Uses WITH RECURSIVE to imlement a cellular automaton that simulates the flow of water in a 1D rocky landscape. Realizes non-linear recursion through array packing/unpacking in the iterative query.

Writes a CSV file of the simulation data that can be animated by Python script fluid-1D.py (see below).

sql
28fluid-1D.py

Python 3 script that reads the CSV simulation data output by fluid-1D.sql (see above) and visualizes water flow over time. (Press RETURN to start the animation.)

Usage:

psql -d ‹database› -f fluid-1D.sql
python3 fluid-1D.py fluid-1D.csv
py
29cyk.sql

An implementation of the CYK parsing algorithm for context-free grammars in Chomsky normal form. Uses WITH RECURSIVE and employs reinjection of prior iteration results to build a parse tree in layers.

sql
30pl-sql.sql

Demonstrates aspects and constructs of PostgreSQL's PL/pgSQL (a PL/SQL dialect).

sql
31spreadsheet.sql

Implements a minimal spreadsheet core using SQL (including WITH RECURSIVE), SQL UDFs, and PL/SQL.

Uses a JSON-based representation of formulæ and topological sorting to choose a valid cell evaluation order.

sql