COMPASS Talks
The Computing Platforms Seminar Series (COMPASS) features talks by speakers from industry and academia on the general topic of computing platforms.
Bring Your Own Kernel! — Constructing High-Performance Data Management Systems from Components
Abstract:
Modern Data Management Systems increasingly abandon monolithic architectures in favor of compositions of specialized components. Storage layers like Parquet and Arrow are combined with kernels like Velox and Apache DataFusion, optimizers like Apache Calcite and Orca, and other specialized components to build systems optimized for a specific domain, execution environment or even application.
Unfortunately, the architecture of Data Management Systems and the interfaces between components are the same as 30 years ago: highly efficient but rigid. This rigidity obstructs the adoption of novel ideas and techniques such as hardware acceleration, adaptive processing, learned optimization, or serverless execution in real-world systems.
To address this impasse, my group at Imperial has developed a novel approach to data management system composition inspired by two principles stemming from compiler-construction research: a homoiconic representation of data and code, and partial evaluation of queries by components. I will present an implementation of the approach in a new system called BOSS and illustrate how BOSS achieves a fully composable design that effectively combines different data models, hardware platforms and processing engines. I will demonstrate how this design allowed my group to implement features like GPU acceleration of relational queries and generative data imputation in weeks (rather than years) in a system without (measurable) overhead compared to a monolithic design. I will further illustrate how BOSS' design enables some cutting-edge research my group is doing on topics as varied as compute/storage disaggregation, microadaptive processing and compression.
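To make the two principles named in the abstract concrete, here is a minimal Python sketch. It is not BOSS's actual API or implementation: the names `Expr`, `make_engine`, `Load`, `Sum` and `Times`, and the toy `TABLES` data, are all hypothetical and chosen only for illustration. Queries and data share one expression representation (homoiconicity), and each component evaluates only the operators it understands, handing the rest on unchanged (partial evaluation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """A hypothetical expression node: a head symbol plus arguments.

    Arguments are either plain values (data) or nested Expr nodes (code),
    so a partially evaluated query is still an ordinary expression.
    """
    head: str
    args: tuple


def make_engine(ops):
    """Build a component that evaluates only the operators listed in `ops`."""
    def evaluate(e):
        if not isinstance(e, Expr):
            return e  # plain data passes through untouched
        args = tuple(evaluate(a) for a in e.args)
        fn = ops.get(e.head)
        # Evaluate only if this engine knows the operator and every
        # argument has already been reduced to plain data.
        if fn is not None and not any(isinstance(a, Expr) for a in args):
            return fn(*args)
        return Expr(e.head, args)  # leave the rest for another component
    return evaluate


# Toy data and two toy components (both assumptions for this sketch).
TABLES = {"prices": (10, 25, 40)}
storage = make_engine({"Load": lambda name: TABLES[name]})
compute = make_engine({"Sum": sum, "Times": lambda a, b: a * b})

query = Expr("Times", (Expr("Sum", (Expr("Load", ("prices",)),)), 2))

after_storage = storage(query)   # only Load is resolved here
result = compute(after_storage)  # Sum and Times are resolved here
print(after_storage)
print(result)
```

In this sketch the storage component resolves only `Load` and returns an expression that still contains `Sum` and `Times`; the compute component then finishes the evaluation. Running `compute` first simply returns the query unchanged, since its input data is not yet available, which is what lets components be chained in any order.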
Bio:
Holger is an Associate Professor (“Senior Lecturer” in old English terms) in the Large-Scale Data and Systems group at Imperial College London. He is interested in all things data: analytics, transactions, systems, algorithms, data structures, processing models and everything in between. While some of his work targets “traditional” relational databases, the objective is to broaden the applicability of data management techniques. To this end, Holger studies “Composable Database Systems”: systems that are extensible to heterogeneous workloads, data models and hardware. This naturally leads to research at the intersection of data management, compilers and computer architecture, targeting applications like Generative Modeling and Graph Processing as well as “classic” Data Analytics. Before joining Imperial, he was a postdoc in the Database group at MIT CSAIL, a PhD student in the Database Architectures group at CWI in Amsterdam and an undergraduate student in computer science at the Humboldt-Universität zu Berlin. Holger knows how to speak and write, as evidenced, respectively, by a CIDR Gong Show Award and a VLDB Best Paper Award.