Machine Learning for Computer System Optimization
Many of today’s computer systems use heuristics and hints to make decisions (e.g., which resources to allocate for a task or which data to keep in a cache). As software applications and hardware platforms grow more heterogeneous, designing such heuristics by hand becomes increasingly difficult, while the same heterogeneity makes automated resource and data management more important. One promising approach is to learn resource management strategies by training machine learning models on system data collected while profiling or running applications.
Research topics: How can we leverage machine learning models to make systems-level decisions when such decisions often need to be made at microsecond timescales? How should we design APIs to make replacing or supplementing heuristics with machine learning model inference practical in computer systems?
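To make the API question concrete, here is a minimal Python sketch of one possible shape for such an interface: a decision point that consults a learned policy when one is installed and falls back to the existing heuristic when the model fails or proves too slow. Everything in it is an illustrative assumption rather than an established design: the `DecisionPoint` class, its `budget_us` latency budget, and the rule of disabling the model after a single budget overrun are all hypothetical, and a production system would likely be written in a lower-level language with far tighter timing control.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

F = TypeVar("F")  # features describing the current system state
D = TypeVar("D")  # the decision to be made (e.g., admit to cache or not)


class DecisionPoint(Generic[F, D]):
    """Hypothetical wrapper that lets a learned policy supplement a heuristic."""

    def __init__(
        self,
        heuristic: Callable[[F], D],
        model: Optional[Callable[[F], D]] = None,
        budget_us: float = 50.0,  # assumed per-decision latency budget
    ) -> None:
        self.heuristic = heuristic
        self.model = model
        self.budget_us = budget_us
        self.model_enabled = model is not None
        self.fallbacks = 0  # number of times the model raised an error

    def decide(self, features: F) -> D:
        # With no usable model, behave exactly like the original heuristic.
        if not self.model_enabled or self.model is None:
            return self.heuristic(features)

        start = time.perf_counter_ns()
        try:
            decision = self.model(features)
        except Exception:
            # A failing model must never break the system: use the heuristic.
            self.fallbacks += 1
            return self.heuristic(features)
        elapsed_us = (time.perf_counter_ns() - start) / 1000.0

        if elapsed_us > self.budget_us:
            # Inference overran the budget. Keep this answer, but stop
            # consulting the model so later decisions stay fast.
            self.model_enabled = False
        return decision


# Usage: a cache-admission decision with a size-based heuristic and a
# stand-in "model" (a plain function here; real inference is assumed).
admit = DecisionPoint(
    heuristic=lambda f: f["size"] < 100,
    model=lambda f: f["reuse_score"] > 0.5,
    budget_us=1e9,  # effectively unlimited, so the model stays enabled
)
print(admit.decide({"size": 500, "reuse_score": 0.9}))
```

The point of the sketch is that the heuristic stays in place as a safe default, so the model can be introduced, measured, and withdrawn without changing the calling code. How to set the budget, and whether inference can meet microsecond deadlines at all, remains the open question posed above.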