SOSP 2025 Tutorial: Open-source SmartNIC abstractions and infrastructure for AI and data analytics

To overcome the performance limitations of CPUs, cloud vendors are increasingly turning to accelerators, such as GPUs. At the same time, emerging big data and ML applications have turned the network into a bottleneck, opening the quest for in- and near-network computing. This trend is being met with the deployment of DPUs (BlueField), SmartNICs (AWS Nitro, Meta FBNIC) and FPGA-based platforms (Microsoft AzureBoost, Alibaba Fidas). Due to their stream-like nature and configurability, FPGAs have proven to be excellent prototyping platforms for next-generation systems. However, integrating them into realistic systems remains challenging in practice. In this tutorial, we introduce open-source infrastructure for implementing SmartNICs on FPGAs as highly versatile, network-enabled acceleration platforms. Building on our established Coyote shell, this new architecture is focused on in-network processing and advanced network management but, akin to commodity DPUs, also supports local acceleration. We will introduce our infrastructure, showcasing how its high-level abstractions can be used to deploy quantized models on an FPGA from Python, offload various functions (encryption, compression, recommender model pre-processing) to the network datapath, prototype congestion control and, finally, interact with accelerators (e.g., GPUs). The goal of our tutorial is to give a live demonstration of accessible prototyping of novel networked computer systems.

Location


The tutorial will be held on 13th October 2025, as part of SOSP 2025 in Seoul, South Korea. All the slides will be made available shortly before the tutorial.

Schedule

The first part of the tutorial will aim to provide a high-level introduction and motivation, as well as live demos of the tooling and infrastructure to build SmartNICs on FPGAs. After the coffee break, the second part of the tutorial will focus on more advanced topics such as ML acceleration and congestion control.

  • Benjamin Ramhorst: Introduction to Coyote v2 (40 min), showing how to seamlessly deploy an accelerated application on an FPGA in a few lines of C++. Additionally, examples and live demos of hybrid computer systems with FPGAs and GPUs will be presented.
  • Maximilian Heer: Introduction to BALBOA (20 min), giving an overview and live demo of our RDMA stack, RoCE BALBOA. It will showcase how to perform 100G RDMA networking in a few lines of C++ code between two FPGAs, as well as between an FPGA and a commodity NIC (Mellanox-5).
  • Coffee break (30 min)
  • Benjamin Ramhorst: Advanced application acceleration (30 min), showcasing how to deploy multi-threaded AES encryption, a standard cryptographic technique for NICs, as well as low-latency quantized neural networks on FPGAs.
  • Maximilian Heer: Advanced networking (40 min), showcasing how to use FPGA-based SmartNICs for traffic sniffing and congestion control research.
  • Q&A, discussion and closing (20 min)

Materials

GitHub repository for the tutorial: link

Coyote documentation: link

Relevant literature:

  • "Coyote v2: Raising the Level of Abstraction for Data Center FPGAs", B. Ramhorst*, D. Korolija* et al., 2025 [link]
  • "Do OS abstractions make sense on FPGAs?", D. Korolija et al., OSDI 2020 [link]
  • "ACCL+: an FPGA-Based Collective Engine for Distributed Applications", Z. He et al., OSDI 2024 [link]
  • "Machine Learning-based Deep Packet Inspection at Line Rate for RDMA on FPGAs", M. Heer et al., EuroMLSys 2025 [link]

Speakers

Prof. Dr. Gustavo Alonso

Gustavo Alonso is a professor in the Department of Computer Science of ETH Zurich, where he is a member of the Systems Group and the head of the Institute of Computing Platforms. He leads the AMD HACC (Heterogeneous Accelerated Compute Cluster) deployment at ETH, a research facility with several hundred users worldwide that supports exploring data center hardware-software co-design. His research interests include data management, cloud computing architecture, and building systems on modern hardware. Gustavo holds degrees in telecommunication from the Madrid Technical University and an MS and a PhD in Computer Science from UC Santa Barbara. Before joining ETH, he was a research scientist at IBM Almaden in San Jose, California. Gustavo has received 4 Test-of-Time Awards for his research in databases, software runtimes, middleware, and mobile computing. He is an ACM Fellow, an IEEE Fellow, a Distinguished Alumnus of the Department of Computer Science of UC Santa Barbara, and has received the Lifetime Achievements Award from the European Chapter of ACM SIGOPS (EuroSys).

 

Benjamin Ramhorst

Benjamin Ramhorst is a second-year doctoral student in the Systems Group at the Department of Computer Science, ETH Zürich. Benjamin obtained his MEng degree from Imperial College London in Electrical and Electronic Engineering, focusing on hardware acceleration for efficient machine learning. During his studies, he completed several internships at AMD, CERN and Arm. Benjamin's main research interests are heterogeneous hardware acceleration and distributed computer systems for data processing. More specifically, he is working on data processing through reconfigurable accelerators, both by raising the level of abstraction for infrastructure, through projects such as Coyote and ACCL, and through custom accelerators for data-intensive tasks, such as neural network inference. Previously, Benjamin published at FPT and OSDI.
 

Maximilian J. Heer

Maximilian J. Heer is a second-year doctoral student at the Systems Group of the Department of Computer Science at ETH Zürich. Before joining ETH, he spent a year as a visiting researcher with the Processor Research Team of the RIKEN Center for Computational Science in Kobe, Japan. Maximilian obtained his MSc degree from both The University of Rhode Island (USA) and the Technical University of Darmstadt in Germany, where he also completed his undergraduate studies in Electrical Engineering. Maximilian's main research interest is in network-attached FPGAs for data processing in heterogeneous environments and large-scale cloud computer systems, through projects such as Coyote. More specifically, he is working on FPGA-based NICs for high-performance networks, with research questions ranging from the support and enhancement of existing transport protocols, through the investigation of advanced congestion control and load balancing schemes, to the exploration of compute offloading onto such NICs. Maximilian has published at conferences such as FCCM, IPDPSW, QCE and GECCO.
 
