Computing Platforms
Content
The seminar will cover core concepts and ideas in the general area of computer systems, ranging from software and hardware architectures to system design for operating systems, data processing systems, and distributed systems. The focus will be on fundamental ideas that apply across systems and application areas, with an emphasis on those that apply to cloud platforms and hardware accelerators.
Format
The seminar will consist of student presentations based on a list of papers that will be provided at the beginning of the course. Presentations will be given in teams, in 45-minute slots: a 30-minute talk followed by 15 minutes of questions. Grades will be based on the quality of the presentation, coverage of the topic (including material not in the original papers), participation during the seminar, and the ability to understand, present, and critique the underlying technology.
Seminar Hours
Mondays, 4-6pm, at CHN D 44. The first seminar (Feb 21) will be online.
Lecturer
- Prof. Gustavo Alonso
Teaching Assistants
- Dr. Michael Giardino
- Dr. Michal Friedman
- Dr. Runbin Shi
Schedule
Papers
You may need to click on the links from within the ETH network (via VPN) to get the full-text papers.
System Design
NVM:
1. Xu, J., & Swanson, S. (2016). NOVA: A Log-structured File System for Hybrid Volatile/Non-volatile Main Memories. In: FAST. [external page link]
2. Coburn, J., Caulfield, A. M., Akel, A., Grupp, L. M., Gupta, R. K., Jhala, R., & Swanson, S. (2011). NV-Heaps: Making Persistent Objects Fast and Safe with Next-Generation, Non-Volatile Memories. In: SIGARCH Comput. Archit. News. [external page link]
3. Raybuck, A., Stamler, T., Zhang, W., Erez, M., & Peter, S. (2021). HeMem: Scalable Tiered Memory Management for Big Data Applications and Real NVM. In: SOSP. [external page link]
4. Bhandari, K., Chakrabarti, D. R., & Boehm, Hans-J. (2016). Makalu: Fast Recoverable Allocation of Non-Volatile Memory. In: OOPSLA. [external page link]
5. Qureshi, M. K., Srinivasan, V., & Rivers, J. A. (2009). Scalable High Performance Main Memory System using Phase-Change Memory Technology. In: ISCA. [external page link]
Hardware Acceleration:
6. Lee, S., Yu, Y., Tang, Y., Khandelwal, A., Zhong, L., & Bhattacharjee, A. (2021). MIND: In-Network Memory Management for Disaggregated Data Centers. In: SOSP. [external page link]
7. Kim, J., Jang, I., Reda, W., et al. (2021). LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism. In: SOSP. [external page link]
8. Jouppi, N. P., Young, C., Patil, N., et al. (2017). In-Datacenter Performance Analysis of a Tensor Processing Unit. In: ISCA. [external page link]
9. Ranganathan, P., Stodolsky, D., Calow, J., et al. (2021). Warehouse-Scale Video Acceleration: Co-design and Deployment in the Wild. In: ASPLOS. [external page link]
Latency in Cloud Systems:
10. Barroso, L., Marty, M., Patterson, D., & Ranganathan, P. (2017). Attack of the killer microseconds. In: CACM, 60(4). [external page link]
11. Primorac, M., Bugnion, E., & Argyraki, K. (2017). How to measure the killer microsecond. In: CCR, 47(5). [external page link]
12. Delimitrou, C., & Kozyrakis, C. (2018). Amdahl’s law for tail latency: Queueing theoretic models can guide design trade-offs in systems targeting tail latency, not just average performance. In: CACM, 61(8). [external page link]
13. Klimovic, A., Kozyrakis, C., Thereska, E., John, B., & Kumar, S. (2016). Flash storage disaggregation. In: EuroSys. [external page link]
14. Marty, M., de Kruijf, M., Adriaens, J., et al. (2019). Snap: a microkernel approach to host networking. In: SOSP. [external page link]
15. Dalton, M., Schultz, D., Adriaens, J., et al. (2018). Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization. In: NSDI. [external page link]
Data Processing in the Cloud
16. Firestone, D., Putnam, A., Mundkur, et al. (2018). Azure Accelerated Networking: SmartNICs in the Public Cloud. In: NSDI. [external page link]
17. Corbett, J. C., Dean, J., Epstein, M., et al. (2012). Spanner: Google’s Globally-Distributed Database. In: OSDI. [external page link]
18. Bacon, D. F., Bales, N., Bruno, N., et al. (2017). Spanner: Becoming a SQL system. In: SIGMOD. [external page link]
19. Lakshman, A., & Malik, P. (2010). Cassandra: a decentralized structured storage system. In: SIGOPS Review, 44(2). [external page link1] [external page link2]
20. Dageville, B., Huang, J., Lee, A. W., et al. (2016). The Snowflake Elastic Data Warehouse. In: SIGMOD. [external page link]
21. Ousterhout, K., Rasti, R., Ratnasamy, S., Shenker, S., & Chun, B.-G. (2015). Making Sense of Performance in Data Analytics Frameworks. In: NSDI. [external page link]
22. Burrows, M. (2006). The Chubby lock service for loosely-coupled distributed systems. In: OSDI. [external page link]
23. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, et al. (2007). Dynamo: Amazon’s Highly Available Key-value Store. In: SIGOPS. [external page link]
24. Shafer, J., Rixner, S., & Cox, A. L. (2010). The Hadoop distributed filesystem: Balancing portability and performance. In: ISPASS. [external page link1] [external page link2]
25. Beaver, D., Kumar, S., Li, H. C., Sobel, J., & Vajgel, P. (2010). Finding a needle in haystack: Facebook's photo storage. In: OSDI. [external page link]
26. Armbrust, M., Ghodsi, A., Zaharia, M., et al. (2015). Spark SQL: Relational Data Processing in Spark. In: SIGMOD. [external page link]
27. Hunt, P., Konar, M., Junqueira, F. P., & Reed, B. (2010). ZooKeeper: Wait-free coordination for internet-scale systems. In: USENIX ATC. [external page link]
28. Chen, G. J., Wiener, J. L., Iyer, S., Jaiswal, et al. (2016). Realtime Data Processing at Facebook. In: SIGMOD. [external page link]
29. Hellerstein, J. M., Faleiro, J., Gonzalez, et al. (2019). Serverless Computing: One Step Forward, Two Steps Back. In: CIDR. [external page link]
30. Shankar, V., Krauth, K., Vodrahalli, K., Pu, Q., et al. (2020). Serverless linear algebra. In: SoCC. [external page link]
31. Klimovic, A., Wang, Y., Stuedi, P., et al. (2018). Pocket: Elastic Ephemeral Storage for Serverless Analytics. In: OSDI. [external page link]
32. Müller, I., Marroquín, R., & Alonso, G. (2020). Lambada: Interactive Data Analytics on Cold Data Using Serverless Cloud Infrastructure. In: SIGMOD. [external page link]
Presentation Tips