Detecting higher-order structural changes in 3D genome organization with multi-task matrix factorization

Abstract

Three-dimensional (3D) genome organization, which determines how the DNA is packaged inside the nucleus, has emerged as a key regulatory mechanism of cellular processes. High-throughput chromosomal conformation capture (Hi-C) technologies have enabled the study of 3D genome organization by experimentally measuring interactions among genomic regions in 3D space. Analysis of Hi-C data has revealed higher-order organizational units such as topologically associating domains (TADs). Changes or disruptions to such units have been associated with disease, development, and evolution. Therefore, a key problem is to systematically detect higher-order structural changes across Hi-C datasets from multiple conditions. Existing computational methods either do not model higher-order structural units or only compare pairs of Hi-C datasets. We address these limitations with Tree-Guided Integrated Factorization (TGIF), a new multi-task Non-negative Matrix Factorization (NMF) approach. TGIF models complex relationships among multiple Hi-C datasets as a tree such that closely related Hi-C datasets have similar lower-dimensional representation. TGIF provides a statistically significant set of differential TAD boundaries with higher precision than existing approaches. Application to a cardiomyocyte differentiation timecourse dataset identified time-point specific TAD boundaries overlapping a retrotransposon element previously shown to be important for cell fate specification in humans and apes.

Date
Jul 14, 2022
Event
Intelligent Systems for Molecular Biology (ISMB)
Location
Madison, WI

Related