COMPASS Talks
The Computing Platforms Seminar Series (COMPASS) features talks from industry and academia on the general topic of computing platforms.
NCCL Communication Offload from GPU SMs
Abstract:
NCCL is a widely used GPU communication library for large-scale AI workloads. A well-known challenge in NCCL is its high utilization of GPU streaming multiprocessors (SMs) during communication, which can limit the compute resources available for AI kernels. This issue becomes increasingly critical as workloads grow more compute-intensive. In this talk, I will first provide a brief introduction to NCCL and its communication model. Then, I will present recent advances that offload communication tasks from the SMs, improving overall GPU resource utilization and enabling better overlap between computation and communication.
Bio:
Zhenhao He is a senior software engineer at NVIDIA, where he has worked on NCCL since November 2024. Before joining NVIDIA, he conducted systems research at ETH Zurich, focusing on hardware acceleration for networking and data processing. He holds both a PhD and a master’s degree from ETH Zurich.