Eugenie Y. Lai


Contact: eylai [at]
GitHub: ey-l
Twitter: @EugenieLai
CV, transcript

News [More Updates]
2021.04 Joining the Data Systems Group (DSG) at MIT EECS CSAIL as a PhD student in Fall '21.

I’m a 2nd-year PhD student advised by Prof. Michael Cafarella in the Data Systems Group at MIT CSAIL. I previously worked with Prof. Rachel Pottinger and Prof. Raymond Ng in the Data Management and Mining Lab at the University of British Columbia.

Today, with the explosion of data, more and more people in fields such as social science and healthcare need to access and make use of data. Doing so, however, requires both domain knowledge and programming skills, and not everyone has that background. Motivated by these use cases, my current research focuses on developing methods that help users interact with and make sense of data, using machine learning and programming language techniques.

Ongoing Projects

Automatic Standardization for Data Preprocessing

Data preparation is often seen as “janitor work,” yet it is essential in data-to-insight pipelines. The increasing liberalization of data has been followed by an explosion in the diversity of data consumers. However, the technical and domain expertise required prevents many of them from performing extensive data preparation. Further, many seem to be stuck in a vicious cycle of writing one-off programs to process data. Recently, automating data preparation programs has been shown to improve many aspects of the pipeline, including data quality, research reproducibility, and user productivity. We propose a novel approach to automatically improve data preparation programs in Python.


Workload-Aware Query Recommendation Using Deep Learning. To appear in IEEE EDBT ’23.
Eugenie Y Lai, Zainab Zolaktaf, Mostafa Milani, Omar AlOmeir, Jianhao Cao, Rachel Pottinger

Summarizing Provenance of Aggregation Query Results in Relational Databases. IEEE ICDE ’21: 1955-1960.
Omar AlOmeir, Eugenie Lai, Mostafa Milani, and Rachel Pottinger.

Pastwatch: On the Usability of Provenance Data in Relational Databases. IEEE ICDE ’20: 1882-1885.
Omar AlOmeir, Eugenie Lai, Mostafa Milani, and Rachel Pottinger.


Blog for fun.

Miscellaneous to de-stress.