Eugenie Y. Lai

Email: eugenie.y.lai [at] gmail.com
GitHub: eugenieshine
Twitter: @EugenieLai
CV, transcript, timeline

I am a senior undergraduate student in the Combined Major of Business and Computer Science (BUCS) program at the University of British Columbia (UBC) Sauder School of Business. I am currently a full-time research assistant at the UBC Data Management and Mining Lab, supervised by Dr. Rachel Pottinger. Last summer, I was supervised by Dr. Raymond Ng in the UBC Data Science for Social Good (DSSG) program.

My current research focuses on databases, applying concepts from visualization and machine learning to help users interact with and make sense of data. Today, database systems provide vital infrastructure for users to access high volumes of data in a variety of applications. However, a user needs both field-specific and database-related expertise to interact with such database applications. Seeing these barriers between users and databases motivates me to centre my work on facilitating user interaction with databases, especially in knowledge exploration.

My pronouns are she/her/hers.

NEWS: Check out my post on how SIGMOD 2020 changed my view on my research interests and grad studies.

Research Projects

Sequence-Aware Query Recommendation Using Deep Learning

Users interact with database management systems by writing sequences of queries. Those sequences encode important information. Current SQL query recommendation approaches do not take that sequence into consideration. Our work presents a novel sequence-aware approach to query recommendation. We use deep learning prediction models trained on query sequences extracted from large-scale query workloads to build our approach. We present users with contextual (query fragments) and structural (query templates) information that can aid them in formulating their next query. We thoroughly analyze query sequences in two real-world query workloads, the Sloan Digital Sky Survey (SDSS) and the SQLShare workload. Empirical results show that the sequence-aware, deep-learning approach outperforms methods that do not use sequence information. [Submitted to VLDB ’21] [Manuscript]

Pastwatch

Pastwatch helps users understand query answers by summarizing, explaining, and visualizing query provenance. Data provenance is any information about the origin of data and the process that leads to its creation. The provenance of a query over a database is the collection of data in the database that contributed to the query answer. While comprehensive, query provenance remains large and overwhelming. Hence, the burden is on users to query and explore query results via different data manipulation languages. Our system helps users explore and interact with their database by providing novel insights into their query results using query provenance.

Publications

Summarizing Provenance of Aggregation Query Results in Relational Databases [Short Paper]
Omar AlOmeir, Eugenie Y. Lai, Mostafa Milani, and Rachel Pottinger
To appear in IEEE International Conference on Data Engineering, 2021 (ICDE ’21).

Pastwatch: On the Usability of Provenance Data in Relational Databases [Short Paper]
Omar AlOmeir, Eugenie Y. Lai, Mostafa Milani, and Rachel Pottinger
IEEE International Conference on Data Engineering, 2020 (ICDE ’20): 1882-1885.

Blog

I write about things I did on the way to discovering my research interests.

Miscellaneous

This is my favourite way to destress.