Marko Kabić gives talks on Maximus and MaxBench to NVIDIA and IBM

30.06.2025

Marko Kabić gave talks on Maximus and MaxBench to NVIDIA on 27th May and to IBM Research on 10th June.

Title: Maximus + MaxBench: GPU-Accelerated Data Analytics in the Era of Heterogeneous Hardware

Abstract: Several trends are changing the underlying fabric for data processing in fundamental ways. On the hardware side, machines are becoming heterogeneous with smart NICs, TPUs, DPUs, etc., but especially with GPUs taking a more dominant role. On the software side, the diversity in workloads, data sources, and data formats has given rise to the notion of composable data processing, where the data is processed across a variety of engines and platforms. Finally, on the infrastructure side, different storage types, disaggregated storage, disaggregated memory, networking, and interconnects are all rapidly evolving, which demands a degree of customization to optimize data movement well beyond established techniques. To tackle these challenges, in this paper, we present Maximus, a modular data processing engine that embraces heterogeneity from the ground up. Maximus can run queries on CPUs and GPUs, split execution between CPUs and GPUs, import and export data in a variety of formats, interact with a wide range of query engines through Substrait, and efficiently manage the execution of complex data processing pipelines. On top of Maximus, we’ve built MaxBench: a comprehensive framework designed for benchmarking, profiling, and modeling data analytics workloads on GPUs. We provide a methodological approach to exploring the impact of different combinations of GPU models (RTX 3090, A100, H100, and Grace Hopper GH200) and interconnects (PCIe 3.0, PCIe 4.0, PCIe 5.0, and NVLink 4.0) on relational data analytics workloads. The insights from this analysis reveal the trade-offs between GPU and interconnect efficiencies with respect to communication and computation runtimes. This study also examines future trends by investigating how enhancements in interconnect bandwidth or GPU efficiency could affect performance.
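
The trade-off between communication and computation runtimes mentioned in the abstract can be illustrated with a deliberately simple back-of-the-envelope model. The sketch below is not MaxBench's model and uses no results from the talk; it assumes the data is first transferred over the interconnect and then processed, with no overlap between the two phases. The function names (`estimate_runtime`, `bottleneck`), the workload size, the compute time, and the rounded nominal PCIe x16 bandwidths are all assumptions chosen for illustration.

```python
# Rounded nominal per-direction peak bandwidths for x16 links, in GB/s.
# These are illustrative assumptions, not measurements from MaxBench.
NOMINAL_BANDWIDTH_GBPS = {
    "PCIe 3.0": 16.0,
    "PCIe 4.0": 32.0,
    "PCIe 5.0": 64.0,
}


def estimate_runtime(data_gb, bandwidth_gbps, compute_s):
    """Return (transfer_s, total_s) for a transfer-then-compute model.

    Assumes no overlap between data movement and GPU computation.
    """
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth must be positive")
    transfer_s = data_gb / bandwidth_gbps
    return transfer_s, transfer_s + compute_s


def bottleneck(data_gb, bandwidth_gbps, compute_s):
    """Name the phase that dominates the modeled runtime."""
    transfer_s, _ = estimate_runtime(data_gb, bandwidth_gbps, compute_s)
    return "communication" if transfer_s > compute_s else "computation"


if __name__ == "__main__":
    data_gb = 32.0    # hypothetical input size
    compute_s = 1.0   # hypothetical GPU compute time
    for name, bw in NOMINAL_BANDWIDTH_GBPS.items():
        transfer_s, total_s = estimate_runtime(data_gb, bw, compute_s)
        print(f"{name}: transfer={transfer_s:.2f}s total={total_s:.2f}s "
              f"bound by {bottleneck(data_gb, bw, compute_s)}")
```

Under these assumed numbers, a faster interconnect shifts the modeled bottleneck from communication to computation, which is the kind of trade-off the abstract describes; real workloads overlap transfers with compute and rarely reach peak bandwidth.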