Data Management Systems

Lecturer: Gustavo Alonso

Teaching Assistants

  • Dario Korolija
  • Dimitris Koutsoukos
  • Fabio Maschi
  • Michal Wawrzoniak

Lectures

  • Wednesday 10:00 - 12:00 CAB G11
  • Friday 08:00 - 09:00 CAB G11

Exercises

  • Friday 09:00 - 10:00 Online

Announcements and exercises will be handled through Moodle.

Lecture schedule

Course contents

The course will cover the implementation aspects of data management systems using relational database engines as a starting point to cover the basic concepts of efficient data processing and then expanding those concepts to modern implementations in data centers and the cloud.

The goal of the course is to convey the fundamental aspects of efficient data management from a systems implementation perspective: storage, access, organization, indexing, consistency, concurrency, transactions, distribution, query compilation vs. interpretation, data representations, etc. Using conventional relational engines as a starting point, the course aims to provide in-depth coverage of the latest technologies used in data centers and the cloud to implement large-scale data processing in various forms.

The course will first cover fundamental concepts in data management: storage, locality, query optimization, declarative interfaces, concurrency control and recovery, buffer managers, and management of the memory hierarchy, presenting them in a system-independent manner. The course will place a special emphasis on understanding these basic principles, as they are key to understanding what problems existing systems try to address. It will then explore their implementation in modern relational engines supporting SQL and expand to the range of systems used in the cloud: key-value stores, geo-replication, query as a service, serverless, large-scale analytics engines, etc.

The main source of information for the course will be articles and research papers describing the architecture of the systems discussed. The list of papers will be provided as the materials for each chapter of the course are released.

Due to the uncertainties created by the Corona virus and the possibility that access to ETH, laboratories, classrooms, etc. might be restricted in the middle of the semester, this edition of the course will have no project or practical component. We will focus on the key architectural aspects and surveying the literature on data management systems architecture. The time that otherwise would have been devoted to programming will be invested instead in looking deeper at how systems are constructed and the algorithms behind many of the optimizations used in real systems. Freed from development work, students are expected to invest the necessary time reading the provided articles and books to gain the necessary understanding of the material.

Reading assignments

Syllabus

Teaching format

Due to the Corona virus situation and the restrictions imposed on room occupancy, the teaching will initially be organized as follows:

  • Lectures on Wednesday and Friday will take place in CAB G11. The lectures will be recorded and streamed live (slides and voice). To ensure the room occupancy constraints are respected, students will receive an e-mail with concrete instructions once the final number of students in the course is known and we can allocate space in the lecture room accordingly. Students who will not be attending the lecture in person will be asked to inform us so that we know the space is available for other students.
  • Exercise sessions will be held online using Zoom. They will be recorded. Occasionally, there could be presentations or sessions that will be held in HG E 1.2. Those sessions will be recorded and streamed live.
  • There will be no project or lab work due to the uncertainties regarding eventual access to labs and computer equipment during the semester. If you are interested in practical work, please contact the TAs and we can suggest a number of projects. These projects will have no bearing on the final grade of the course.
  • Homework will be handled through Moodle.

Exam

The exam will take place on February 1st (Monday) from 15:00 to 18:00 (180 minutes) at ETH Hönggerberg in room HIL C 15. The exam will be a written exam (pen and paper, not through Moodle).

You should have received the information from the office of the rector already but just to be on the safe side, note that there will be a repetition of the exam in the Summer of 2021 (date yet to be decided).

Given the ongoing situation caused by the pandemic, please keep the following instructions in mind:

  • Please read the Directive on the Examination Schedule and the instructions on the Session Examination’s Safety Concept
  • Students have to wear a mask at all times: when arriving at the exam, during the exam, and when leaving the exam. If any student has a medical exception regarding the wearing of masks, please contact us to make adequate arrangements and consult the ETH regulations regarding this matter (medical certificate, negative test less than 48 hours old).
  • Students who are in a risk group should contact us as they will be placed in a special location in the room.
  • Please arrive 30 minutes before the exam, as we need to manage the flow of people into the room, with students entering one by one and being allocated to their seats. This will take time and cannot be done at the last minute.

Further instructions will be given at the exam location regarding what to do at the end of the exam, movement during the exam, etc.
