Da-Inn Erika Lee

PhD Student in Biomedical Data Science

Roy Lab


I am a PhD student in Biomedical Data Science at the University of Wisconsin-Madison, working in Sushmita Roy's lab. I develop and implement machine learning algorithms to study how DNA folds inside the cell. More specifically, I use matrix factorization and multi-task learning to find the structural units of 3D genome organization and to study how they change across cell lineages, diseases, and species.

Interests

  • Dimension Reduction
  • Graph embedding
  • Probabilistic Graphical Models
  • 3D Genome
  • Single-cell omics
  • Microbiome
  • Electronic Health Records

Education

  • PhD in Biomedical Data Science, 2024

    University of Wisconsin, Madison

  • MSc in Computer Sciences, 2017

    University of Wisconsin, Madison

  • BSc in Cellular Molecular Biology, 2011

    University of Michigan, Ann Arbor

Skills

Programming Languages

Python
C/C++
Java
R
MATLAB
Julia

Open Reproducible Science

Git
Conda
Jupyter Notebook
Snakemake
Docker

Database & Systems

Linux
SQL (Oracle, MS)
AWS
Hadoop
Spark

Experience

Sep 2018 – Present
Madison, WI

Research Assistant

Roy Lab @ Wisconsin Institute for Discovery

I developed machine learning algorithms to discover the building blocks of 3D genome organization and to identify changes in the 3D genome across developmental or disease trajectories. In addition to my core research, I acted as:

  • Presenter, workshop host, and panelist at national and international research conferences
  • Mentor for elementary school to graduate students interested in computer science or research
  • Reviewer for various scientific publications and conferences
  • Guest lecturer and grader for a graduate course on computational network biology

Oct 2014 – Aug 2018
Madison, WI

Senior Analytics Consultant

University of Wisconsin Hospital & Clinics

I specialized in creating low-maintenance database objects and intuitive dashboards on key performance metrics for pharmacy, surgery, anesthesia, and emergency departments.

  • Developed 9 QlikView dashboards that consistently ranked among the top in unique user visits
  • Overhauled enterprise database objects to lower technical debt
  • Managed three stakeholder groups and their request queues
  • Led datathon to analyze operating margin decline
  • Designed computer-based internal developer training

Jun 2011 – Jul 2013
Verona, WI

Quality Assurance

Epic

As a lead of the reporting module on Epic’s surgical application QA team, I made the following major contributions:

  • Designed & tested KPI reporting package in a major release cycle
  • Tested flagship data warehouse product for initial release
  • Managed development project for first major UK customer
  • Supported EMR go-live at 12 different health care organizations
  • Taught new-hire application fundamentals & configuration class

Awards & Honors

Predoctoral fellowship in Computation and Informatics in Biology and Medicine (CIBM)

Honorable Mention in Morgridge Institute for Research Ethics Cartooning Contest

Student Research Grants Competition (SRGC) Travel Award

Honorable Mention for Best Talk at GLBIO

Student Research Grants Competition (SRGC) Travel Award

Hopwood Underclassmen Fiction Award

Talks

Linking dynamics of 3D genome organization to cardiac development & disease

Three-dimensional (3D) genome organization, which determines how the DNA is packaged inside the nucleus, has emerged as a key regulatory mechanism of cellular processes. …

Detecting dynamic 3D genome organization with multi-task matrix factorization

The three-dimensional (3D) organization of the genome, which determines how the DNA is packaged inside the nucleus, has emerged as a key regulatory mechanism of cellular function …

Detecting higher-order structural changes in 3D genome organization with multi-task matrix factorization

Three-dimensional (3D) genome organization, which determines how the DNA is packaged inside the nucleus, has emerged as a key regulatory mechanism of cellular processes. …

Detecting higher-order structural changes from 3D genome organization data

Three-dimensional (3D) genome organization, which determines how the DNA is packaged inside the nucleus, has emerged as a key regulatory mechanism of cellular processes. …

Detecting higher-order structural changes in 3D genome organization with multi-task matrix factorization

Three-dimensional (3D) genome organization, or how the DNA is packaged inside the nucleus, has emerged as a key regulatory mechanism of cellular function and malfunction. …

Discovering structural units of chromosomal organization with matrix factorization and graph regularization

The three-dimensional (3D) organization of the genome is an important layer of regulation in developmental, disease, and evolutionary processes. Hi-C is a high-throughput …

Discovering structural units of chromosomal organization with matrix factorization and graph regularization

Three dimensional organization of the genome is emerging as an important determinant of cell-type specific expression and is implicated in many diseases, including cancer (Bouwman …

A graph-regularized non-negative matrix factorization method to discover organizational units of chromosomes

Three dimensional organization of the genome is emerging as an important determinant of cell-type specific expression and is implicated in many diseases, including cancer. Hi-C is …

It takes a village to raise a dashboard

How a distributed stakeholder model empowers self-service analytics

Posters

Detecting higher-order structural changes in 3D genome organization with multi-task matrix factorization

Three-dimensional (3D) genome organization, which determines how the DNA is packaged inside the nucleus, has emerged as a key regulatory mechanism of cellular processes. …

How the DNA folds in the Nucleus: a machine learning approach

A poster introducing 3D genome and matrix factorization to a general audience

GRiNCH: Discovering structural units of chromosomes with graph-regularized matrix factorization

Three dimensional organization of the genome is emerging as an important determinant of cell-type specific expression and is implicated in many diseases, including cancer (Bouwman …

OR KPI self-service reporting

With operating rooms being one of the top revenue sources in the acute-care setting, it is crucial to monitor key performance indicators (KPIs) such as case volume, room turnover, …

Self-service reporting for quantitative provider practice evaluation

A critical component of quality improvement is equipping providers with meaningful information regarding their practices. This requires not only obtaining accurate data but also …

Teaching and Tutorials

Guest Lecture in Special Topics in Computational Network Biology (BMI826/CS838)

Hands-on demonstration and coding exercises for integrating multiple single cell gene expression datasets

Higher Understanding with Lower Dimensions

Tutorial on dimension reduction methods at the Great Lakes Bioinformatics (GLBIO) conference

WACM Explains: Machine Learning

UW Madison's Women in Association for Computing Machinery (WACM) workshop on the basics of machine learning

Outreach and Mentoring

Panelist on 'The Road Forward for Informatics: Future Directions for the Next Five Years in our Field' at the 2022 NLM Informatics Training Conference

Part of a graduate student panel leading an open-ended discussion; answered questions from program director, faculty, and other students about diversity, ethics, professional development, institutional support needs in bioinformatics research and training

Mentor for 2 graduate rotation students in Roy Lab

Provide guidance for literature review; weekly or biweekly meetings to review experiment plan, results, and progress; provide feedback on lab meeting presentations

Mentor for Wisconsin Science and Computing Emerging Research Stars (WISCERS)

Mentor and panelist for undergraduate students underrepresented in computer sciences research

Mentor for Women in Scientific Education and Research (WISER)

Mentor and panelist for undergraduate women interested in graduate school and STEM research

Election official and voter education ambassador for the City of Madison

Due to the pandemic, there is a nation-wide shortage of poll workers and confusion around in-person and mail-in voting logistics. Consider getting educated and involved!

Mentor for Maydm middle school programs

Mentor for students in Maydm STEM camps: STEM Power is Girl Power & Wonderful World of Web Development

Scratch coding club leader at Falk Elementary School

Weekly after-school club to spark interest in computer science in elementary school students

Mentor for undergraduate research project

Defined project scope for applying GRiNCH to Hi-C data from a rat model of breast cancer susceptibility; provided weekly milestone review and feedback on progress/results

Big Data, Big Opportunities

Facebook live event for promoting career opportunities at UW Health Enterprise Analytics

Mentor for WACM undergraduate students

Weekly office hours for career path discussion and review of resumes and/or graduate school applications; weekly newsletter with academic and career-related resources

Projects

Chicago L Train Ridership Pattern

Where are the high-growth, demographically suitable areas for a new spinning club franchise?

Convolutional neural network (CNN) for predicting Hi-C interaction counts

Can a deep learning model predict which regions of our genetic code influence each other?

Greedy approach to maximizing gradient diversity for minibatch SGD

How do we scale distributed gradient descent to a large batch size?

GRiNCH: Graph-Regularized NMF and Clustering for Hi-C

How do we find structural units of DNA from its 3D conformation inside the cell?

Higher Understanding with Lower Dimensions

Dimension Reduction Tutorial at GLBIO 2019

Integrating single-cell gene expression datasets

What are the groups of driver genes in each stage of cell reprogramming?

Multiview NMF (MVNMF)

How do we identify genomic regions responsible for dynamic 3D organization of the DNA during development?

Predicting epidemiological trends with cellular automata

How can we predict flu outbreaks by looking at our own and our neighbor’s history?

Tree-Guided Integrated Factorization (TGIF)

What are the persistent and dynamic structural elements of the 3D genome during a dynamic biological process?

WACM Explains: Machine Learning

Women in Association for Computing Machinery (WACM) workshop on fundamentals of machine learning

Contact