Jan is an alumnus of our group.
This page has not been maintained since Jan joined SAP as a developer in September 2011.
Since 2003 I have been working on the XQuery compiler Pathfinder. Pathfinder compiles arbitrary XQuery expressions into a 1NF table algebra while faithfully maintaining XQuery semantics.
In recent years we have also applied the ideas behind Pathfinder's compilation strategy to various other source languages in the context of the Ferry project. A further use case of these ideas is our SQL-to-SQL compiler, which drives the true language-level SQL debugger Habitat.
For an overview of my publications, please visit our publications page, my DBLP entry, or my Google Scholar results.
Research on Pathfinder
During my internship at the CWI in 2004 I built the first back-end-specific code generator for the MonetDB database system. The combination of this Pathfinder code generator and MonetDB has become known as "MonetDB/XQuery". Special bulk-oriented algorithms for XPath step processing (based on the Staircase-Join algorithms) and a syntactic value-join recognition turned MonetDB/XQuery into one of the fastest XQuery processors.
Since 2005 I have been building and refining the table algebra of Pathfinder. Based on the algebraic representation, our group (and others) build specific back-end adapters for SQL:99-speaking systems, MonetDB, kdb+, and Natix.
In parallel I have been building (and extending) a peephole-style optimizer for the Pathfinder algebra. Optimizers in standard SQL systems have real trouble making use of the unoptimized plans; this optimizer instead allows all back-ends to benefit from the performance improvements hiding in the DAG-shaped algebraic plans. A large number of properties (e.g., constantness, key information, column usage analysis, and functional dependencies) together with a large set of rewrite rules lead to "join graph" plans in which order information and duplicate elimination are restricted to the result and no longer occur inside the evaluation plan.
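The idea of a property-driven peephole rewrite can be illustrated with a minimal Python sketch. It is not Pathfinder's actual code: the `Op` class, the three operator kinds, and the single rule shown here are hypothetical simplifications, and the plan is a tree rather than a DAG. The sketch infers key information bottom-up and removes a duplicate elimination whenever its input is already known to be duplicate-free.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Op:
    """Hypothetical algebra operator; not Pathfinder's real data structure."""
    kind: str                      # "table", "project", or "distinct"
    children: tuple = ()           # input operators
    cols: tuple = ()               # output columns
    keys: frozenset = frozenset()  # declared key columns (base tables only)

def infer_keys(op):
    """Return the columns known to hold unique values in op's output."""
    if op.kind == "table":
        return set(op.keys)
    if op.kind == "project":
        # A key survives a projection only if its column is kept.
        return infer_keys(op.children[0]) & set(op.cols)
    if op.kind == "distinct":
        return infer_keys(op.children[0])
    raise ValueError(f"unknown operator kind: {op.kind}")

def rewrite(op):
    """Rewrite bottom-up: drop a distinct whose input already has a key."""
    op = replace(op, children=tuple(rewrite(c) for c in op.children))
    if op.kind == "distinct" and infer_keys(op.children[0]):
        return op.children[0]      # input rows are unique already
    return op

# Assumed example schema: people(id, name) with key column id.
people = Op("table", cols=("id", "name"), keys=frozenset({"id"}))

# distinct(project[id](people)): the key column is kept, distinct is redundant.
redundant = Op("distinct", (Op("project", (people,), ("id",)),), ("id",))

# distinct(project[name](people)): the key is projected away, distinct must stay.
needed = Op("distinct", (Op("project", (people,), ("name",)),), ("name",))

optimized_redundant = rewrite(redundant)
optimized_needed = rewrite(needed)
```

Each rule in such a scheme looks only at one operator and the properties of its immediate inputs, so rules stay small and can be combined freely; the overall effect comes from applying many of them repeatedly.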
The effectiveness of the optimizer can be inspected in the plans at the bottom of the page. The unoptimized plan of XMark query Q8 (left) forces the back-end to choose a serial evaluation technique without a chance to apply value-based joins. The optimized variant of Q8 (right), in contrast, is a lot smaller, works without explicit intermediate ordering operators (look for red operators in the plans), and runs orders of magnitude faster on all back-ends.
Unoptimized and optimized variants of XMark query Q8
let $auction := doc("auction.xml") return
for $p in $auction/site/people/person
let $a :=
for $t in $auction/site/closed_auctions/closed_auction
where $t/buyer/@person = $p/@id