- 2012-11-02: Our DDFP 2013 paper entitled Algebraic Data Types for Language-Integrated Queries has been accepted for publication.
- 2012-11-01: Our PADL 2013 paper entitled Analysing the Entire Wikipedia History with Database Supported Haskell has been accepted for publication.
- 2012-03-02: The slide set of the Bringing Back Monad Comprehensions and Extending Database Supported Haskell talk given at the Functional Programming Laboratory Seminar in Nottingham is available.
- 2012-01-09: The video of the Bringing Back Monad Comprehensions talk given at Haskell Symposium 2011 in Tokyo is now available.
- 2011-11-18: DSH is available for download on Hackage!
- 2011-06-01: Our Haskell Symposium 2011 paper entitled Bringing Back Monad Comprehensions has been accepted for publication.
- 2011-04-18: Our work on Ferry wins the Peter Landin Prize 2011 for the best paper at IFL 2010. Thank you, folks.
DSH — Database-Supported Haskell
Database-Supported Haskell, DSH for short, is a Haskell library for database-supported program execution. Using the DSH library, a relational database management system (RDBMS) can be used as a coprocessor for the Haskell programming language, especially for those program fragments that carry out data-intensive and data-parallel computations. Rather than embedding a relational language into Haskell, DSH turns idiomatic Haskell programs into SQL queries.
DSH in the Real World
We have used DSH for large-scale data analysis. Specifically, in collaboration with researchers working in social and economic sciences, we used DSH to analyse the entire history of Wikipedia (terabytes of data) and a number of online forum discussions (gigabytes of data).
Because of the scale of the data, it would be unthinkable to conduct the data analysis in Haskell without the database-supported program execution technology featured in DSH. We have also formulated several of the DSH queries directly in SQL and found that the equivalent DSH queries were much more concise and easier to write and maintain (mostly due to DSH’s support for nesting, Haskell’s abstraction facilities and the monad comprehension notation, see below).
One long-term goal is to allow researchers who are not necessarily expert programmers or database engineers to conduct large-scale data analysis themselves.
Towards a New Compilation Strategy
As of today, DSH relies on a query compilation strategy coined loop-lifting. Loop-lifting comes with important and desirable properties (e.g., the number of SQL queries issued for a given DSH program only depends on the static type of the program’s result). The strategy, however, relies on a rather complex and monolithic mapping of programs to the relational algebra. To remedy this, we are currently exploring a new strategy based on the flattening transformation as conceived by Guy Blelloch. Flattening was originally designed to implement the data-parallel declarative language NESL; we revisit it in the context of query compilation (which targets database kernels, one particular kind of data-parallel execution environment). Initial results are promising and DSH might switch over in the not too distant future. We hope to further improve query quality and also address the formal correctness of DSH’s program-to-queries mapping.
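The core idea behind flattening can be illustrated with a small Haskell sketch. This is a generic illustration of the data representation only, not DSH's compiler: a nested list is stored as one flat data vector plus a segment descriptor, so that an operation applied to every inner list becomes a single bulk operation over the flat data. All names below (Segmented, flatten, unflatten, mapLifted, sumLifted) are hypothetical and chosen for this example.

```haskell
-- Hypothetical sketch of the flattened representation of a nested list.
-- Not taken from DSH; names and types are assumptions for illustration.
data Segmented a = Segmented
  { segLengths :: [Int]  -- segment descriptor: length of each inner list
  , flatData   :: [a]    -- all elements of all inner lists, concatenated
  } deriving (Eq, Show)

-- Convert a nested list into its flat representation.
flatten :: [[a]] -> Segmented a
flatten xss = Segmented (map length xss) (concat xss)

-- Rebuild the nested list by cutting the flat data at segment boundaries.
unflatten :: Segmented a -> [[a]]
unflatten (Segmented lens xs) = go lens xs
  where
    go []       _  = []
    go (l : ls) ys = let (seg, rest) = splitAt l ys in seg : go ls rest

-- "Lifted" map: mapping f over every inner list is one flat map over the
-- data vector; the segment descriptor is left untouched.
mapLifted :: (a -> b) -> Segmented a -> Segmented b
mapLifted f (Segmented lens xs) = Segmented lens (map f xs)

-- "Lifted" sum: one result per segment, computed in a single pass that
-- follows the segment descriptor.
sumLifted :: Num a => Segmented a -> [a]
sumLifted (Segmented lens xs) = go lens xs
  where
    go []       _  = []
    go (l : ls) ys = let (seg, rest) = splitAt l ys in sum seg : go ls rest
```

For example, flatten [[1,2],[],[3]] yields the descriptor [2,0,1] together with the data [1,2,3], and sumLifted on that value yields [3,0,3]. The appeal for query compilation is that both parts are flat collections, which is the shape of data a database kernel processes in bulk.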
Motivated by DSH, we reintroduced the monad comprehension notation into GHC and also extended it to support parallel and SQL-like comprehensions. The extension is available in GHC 7.2.
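The following sketch shows the three comprehension forms using only GHC and its base library; it does not use the DSH API and runs on ordinary in-memory values. The sample data and the names edits, safeRatio, ranked and editsPerAuthor are hypothetical.

```haskell
{-# LANGUAGE MonadComprehensions, ParallelListComp, TransformListComp #-}

import GHC.Exts (groupWith, sortWith, the)

-- Hypothetical sample data: (author, number of edits).
edits :: [(String, Int)]
edits = [("ann", 3), ("bob", 5), ("ann", 4), ("cat", 2), ("bob", 1)]

-- Monad comprehension over Maybe instead of lists: the result is Nothing
-- if either input is Nothing or the guard fails.
safeRatio :: Maybe Int -> Maybe Int -> Maybe Int
safeRatio mx my = [ x `div` y | x <- mx, y <- my, y /= 0 ]

-- Parallel comprehension: the two branches are zipped, not nested.
ranked :: [(Int, String)]
ranked = [ (i, name) | i <- [1 ..] | (name, _) <- edits ]

-- SQL-like comprehension: group by author, then order by total edits.
-- After grouping, author and n are bound to lists (one per group).
editsPerAuthor :: [(String, Int)]
editsPerAuthor =
  [ (the author, sum n)
  | (author, n) <- edits
  , then group by author using groupWith
  , then sortWith by sum n
  ]
```

Loaded into GHCi, editsPerAuthor evaluates to [("cat",2),("bob",6),("ann",7)] and safeRatio (Just 1) (Just 0) evaluates to Nothing.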
The DSH library and the FerryCore package it uses are available on Hackage (http://hackage.haskell.org/package/DSH). If you have cabal installed on your system, you can also install DSH by typing "cabal install DSH" in your terminal.
- Algebraic Data Types for Language-Integrated Queries.
George Giorgidze, Torsten Grust, Alexander Ulrich, and Jeroen Weijers.
In Proceedings of the 1st international workshop on Data Driven Functional Programming, Rome, Italy. ACM, 2013. To appear.
- Analysing the Entire Wikipedia History with Database Supported Haskell.
George Giorgidze, Torsten Grust, Iassen Halatchliyski, and Michael Kummer.
In Proceedings of the 15th international symposium on Practical Aspects of Declarative Languages, Rome, Italy. Springer, 2013. To appear.
- Bringing Back Monad Comprehensions (also available: Slides and Video).
George Giorgidze, Torsten Grust, Nils Schweinsberg, and Jeroen Weijers.
In Proceedings of the ACM SIGPLAN Haskell Symposium (Haskell 2011), Tokyo, Japan. ACM, 2011.
- Haskell Boards the Ferry: Database-Supported Program Execution for Haskell (also available: Slides).
George Giorgidze, Torsten Grust, Tom Schreiber, and Jeroen Weijers.
In Proceedings of the 22nd Symposium on Implementation and Application of Functional Languages (IFL 2010), Alphen aan den Rijn, Netherlands, September 2010. Springer LNCS, to appear in 2011. Best Paper Award.