COMPASS Talks

The Computing Platforms Seminar Series (COMPASS) features talks from industry and academia on the general topic of computing platforms.

Lessons Learned from Building a Data Management System for the Cloud

Abstract:

We observe the impressive capabilities of representation learning and generative models for text, videos, and images on a daily basis. Structured data, such as tables in relational databases, has long been overlooked, however, despite its prevalence in the organizational data landscape and its critical use in high-value applications and decision-making processes. Learned representations, or embeddings, that capture the semantics of structured data can play a key role in making data systems more efficient, robust, and accurate at scale. Models that generalize to real-world databases are critical to making this work. In this context, I will discuss how rather compact and specialized column embeddings can be more effective than using GPT-something for table understanding, and reflect on the importance of capturing the core properties of relational tables in the embedding space. I will close by illustrating the value of embeddings for table retrieval to make LLM-powered query interfaces to structured data truly useful.

Bio:

Madelon Hulsebos is a tenure-track researcher at CWI in Amsterdam. Prior to that, she was a postdoctoral fellow at UC Berkeley, and she received her PhD from the University of Amsterdam, for which she did research at MIT and Sigma Computing. Her general research interest lies at the intersection of machine learning and data management, currently focusing on Table Representation Learning to democratize insights from structured data. Madelon founded the Table Representation Learning workshop at NeurIPS and leads various other efforts in this space. She was awarded a BIDS-Accenture fellowship for her postdoctoral research on retrieval systems for structured data at UC Berkeley, as well as a 5-year AiNed fellowship grant.

