Computing Platforms

Content

The seminar will cover core concepts and ideas in the general area of computer systems, ranging from software and hardware architectures to system design for operating systems, data processing systems, and distributed systems. The focus will be on fundamental ideas that apply across systems and application areas but with an emphasis on those ideas that apply to cloud platforms and hardware accelerators.

Format

The seminar will consist of student presentations based on a list of papers that will be provided at the beginning of the course. Presentations will be given in teams and arranged in 45-minute slots: a 30-minute talk followed by 15 minutes of questions. Grades will be assigned based on the quality of the presentation, coverage of the topic (including material not in the original papers), participation during the seminar, and the ability to understand, present, and criticize the underlying technology.

Seminar Hours

Mondays, 4-6pm, at CHN D 44. The first seminar (Feb 21) will be online. 

Zoom link

Lecturer

  • Prof. Gustavo Alonso

Teaching Assistants

  • Dr. Michael Giardino
  • Dr. Michal Friedman
  • Dr. Runbin Shi

Papers

You may need to open the links from within the ETH network (e.g., via VPN) to access the full-text papers.

System Design

NVM:

1. Xu, J., & Swanson, S. (2016). NOVA: A Log-structured File System for Hybrid Volatile/Non-volatile Main Memories. In: FAST. [link]

2. Coburn, J., Caulfield, A. M., Akel, A., Grupp, L. M., Gupta, R. K., Jhala, R., & Swanson, S. (2011). NV-Heaps: Making Persistent Objects Fast and Safe with Next-Generation, Non-Volatile Memories. In: SIGARCH Comput. Archit. News. [link]

3. Raybuck, A., Stamler, T., Zhang, W., Erez, M., & Peter, S. (2021). HeMem: Scalable Tiered Memory Management for Big Data Applications and Real NVM. In: SOSP. [link]

4. Bhandari, K., Chakrabarti, D. R., & Boehm, H.-J. (2016). Makalu: Fast Recoverable Allocation of Non-Volatile Memory. In: OOPSLA. [link]

5. Qureshi, M. K., Srinivasan, V., & Rivers, J. A. (2009). Scalable High Performance Main Memory System using Phase-Change Memory Technology. In: ISCA. [link]

Hardware Acceleration:

6. Lee, S., Yu, Y., Tang, Y., Khandelwal, A., Zhong, L., & Bhattacharjee, A. (2021). MIND: In-Network Memory Management for Disaggregated Data Centers. In: SOSP. [link]

7. Kim, J., Jang, I., Reda, W., et al. (2021). LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism. In: SOSP. [link]

8. Jouppi, N. P., Young, C., Patil, N., et al. (2017). In-Datacenter Performance Analysis of a Tensor Processing Unit. In: ISCA. [link]

9. Ranganathan, P., Stodolsky, D., Calow, J., et al. (2021). Warehouse-Scale Video Acceleration: Co-design and Deployment in the Wild. In: ASPLOS. [link]

Latency in Cloud Systems:

10. Barroso, L., Marty, M., Patterson, D., & Ranganathan, P. (2017). Attack of the killer microseconds. In: CACM, 60(4). [link]

11. Primorac, M., Bugnion, E., & Argyraki, K. (2017). How to measure the killer microsecond. In: CCR, 47(5). [link]

12. Delimitrou, C., & Kozyrakis, C. (2018). Amdahl’s law for tail latency: Queueing theoretic models can guide design trade-offs in systems targeting tail latency, not just average performance. In: CACM, 61(8). [link]

13. Klimovic, A., Kozyrakis, C., Thereska, E., John, B., & Kumar, S. (2016). Flash storage disaggregation. In: EuroSys. [link]

14. Marty, M., de Kruijf, M., Adriaens, J., et al. (2019). Snap: a microkernel approach to host networking. In: SOSP. [link]

15. Dalton, M., Schultz, D., Adriaens, J., et al. (2018). Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization. In: NSDI. [link]

Data Processing in the Cloud

16. Firestone, D., Putnam, A., Mundkur, et al. (2018). Azure Accelerated Networking: SmartNICs in the Public Cloud. In: NSDI. [link]

17. Corbett, J. C., Dean, J., Epstein, M., et al. (2012). Spanner: Google’s Globally-Distributed Database. In: OSDI. [link]

18. Bacon, D. F., Bales, N., Bruno, N., et al. (2017). Spanner: Becoming a SQL system. In: SIGMOD. [link]

19. Lakshman, A., & Malik, P. (2010). Cassandra: a decentralized structured storage system. In: SIGOPS Review, 44(2). [link1] [link2]

20. Dageville, B., Huang, J., Lee, A. W., et al. (2016). The Snowflake Elastic Data Warehouse. In: SIGMOD. [link]

21. Ousterhout, K., Rasti, R., Ratnasamy, S., Shenker, S., & Chun, B.-G. (2015). Making Sense of Performance in Data Analytics Frameworks. In: NSDI. [link]

22. Burrows, M. (2006). The Chubby lock service for loosely-coupled distributed systems. In: OSDI. [link]

23. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, et al. (2007). Dynamo: Amazon’s Highly Available Key-value Store. In: SIGOPS. [link]

24. Shafer, J., Rixner, S., & Cox, A. L. (2010). The Hadoop distributed filesystem: Balancing portability and performance. In: ISPASS. [link1] [link2]

25. Beaver, D., Kumar, S., Li, H. C., Sobel, J., & Vajgel, P. (2010). Finding a needle in haystack: Facebook's photo storage. In: OSDI. [link]

26. Armbrust, M., Ghodsi, A., Zaharia, M., et al. (2015). Spark SQL: Relational Data Processing in Spark. In: SIGMOD. [link]

27. Hunt, P., Konar, M., Junqueira, F. P., & Reed, B. (2010). ZooKeeper: Wait-free coordination for internet-scale systems. In: USENIX ATC. [link]

28. Chen, G. J., Wiener, J. L., Iyer, S., Jaiswal, et al. (2016). Realtime Data Processing at Facebook. In: SIGMOD. [link]

29. Hellerstein, J. M., Faleiro, J., Gonzalez, et al. (2019). Serverless Computing: One Step Forward, Two Steps Back. In: CIDR. [link]

30. Shankar, V., Krauth, K., Vodrahalli, K., Pu, Q., et al. (2020). Serverless linear algebra. In: SoCC. [link]

31. Klimovic, A., Wang, Y., Stuedi, P., et al. (2018). Pocket: Elastic Ephemeral Storage for Serverless Analytics. In: OSDI. [link]

32. Müller, I., Marroquín, R., & Alonso, G. (2020). Lambada: Interactive Data Analytics on Cold Data Using Serverless Cloud Infrastructure. In: SIGMOD. [link]

 

Presentation Tips

 
