Algorithms for TAD and Chromatin Loop Detection

Keywords: Hi-C; Machine learning; Optimization algorithm; Bioinformatics; DNA looping; Chromatin looping; Topologically Associated Domains (TADs); Chromosome conformation capture
Fall 2015 - present

TAD and Loop Illustration


The development of chromosome conformation capture techniques, particularly Hi-C, has made the study of the spatial conformation of a genome an important topic in bioinformatics and computational biology. Aided by high-throughput next-generation sequencing, Hi-C generates read pairs that indicate which chromosomal locations are in spatial proximity, capturing the large-scale intra- and inter-chromosomal interactions occurring within a genome (Lieberman-Aiden et al., 2009). These read pairs, known as Hi-C data, can be used to reconstruct 3D structures of chromosomes, which in turn can be used to study DNA replication, gene regulation, genome interaction, genome folding, and genome function. Before Hi-C data are used for model construction, they are generally converted to a matrix form known as a contact matrix or contact map: an N x N matrix, extracted from Hi-C data, showing the number of interactions between chromosomal regions. The size of the matrix (N) is the number of equal-size regions (bins) into which a chromosome is divided, and the length of each region (e.g., 1 Mb) is called the resolution. Each entry in the matrix contains the count of read pairs that connect the two corresponding chromosome regions in a Hi-C experiment. The chromosome contact matrix therefore represents all the observed interactions between the regions (or bins) of a chromosome.
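As a rough illustration of the binning described above, the following minimal sketch builds a contact matrix from intra-chromosomal read pairs. The function name `build_contact_matrix`, the 0-based coordinates, and the toy read pairs are assumptions made for this example; it is not the code used in this project, and real pipelines also filter and normalize the counts.

```python
import numpy as np

def build_contact_matrix(read_pairs, chrom_length, resolution):
    """Bin read pairs into a symmetric N x N contact matrix.

    read_pairs: iterable of (pos_a, pos_b) 0-based positions on one chromosome
                (hypothetical input format, assumed for illustration).
    """
    # N = number of equal-size bins, using ceiling division for a partial last bin
    n_bins = -(-chrom_length // resolution)
    matrix = np.zeros((n_bins, n_bins), dtype=np.int64)
    for pos_a, pos_b in read_pairs:
        i = pos_a // resolution  # bin of the first read
        j = pos_b // resolution  # bin of the second read
        matrix[i, j] += 1
        if i != j:
            matrix[j, i] += 1    # keep the matrix symmetric
    return matrix

# Toy example at 1 Mb resolution on a 3 Mb chromosome (made-up positions)
pairs = [(100_000, 2_500_000), (1_200_000, 1_800_000), (2_900_000, 150_000)]
contacts = build_contact_matrix(pairs, chrom_length=3_000_000, resolution=1_000_000)
print(contacts)
```

Here two read pairs connect bins 0 and 2, and one pair falls entirely within bin 1, so the corresponding entries hold those counts.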

This project focuses on the development of algorithms for detecting two structural features read out from Hi-C data: Topologically Associated Domains (TADs) and chromatin loops. TADs are considered to be the structural and functional units (or modules) of a chromosome. According to Dixon et al. (2012), TADs remain unchanged irrespective of cell differentiation, and they contain gene clusters that are co-regulated. In recent years, the detection of topological domains has become an important problem in bioinformatics and computational biology, and as a result, several methods for TAD identification have been developed. A chromatin loop occurs when stretches of genomic sequence that lie on the same chromosome (configured in cis) are in closer physical proximity to each other than to the intervening sequences. These loops are mostly found in the boundary regions of TADs; hence, the two structures are closely related.
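To make the idea of a TAD boundary concrete, here is a minimal sketch of an insulation-style score computed on a contact matrix: for each candidate boundary, it averages the contacts between the bins just upstream and just downstream, and a low value suggests a boundary. This is a generic, simplified technique chosen for illustration; it is not the method of the tools listed on this page, and the function name `insulation_score`, the window size, and the toy matrix are assumptions.

```python
import numpy as np

def insulation_score(matrix, window):
    """Mean contact count across each candidate boundary.

    scores[i] describes the boundary between bin i-1 and bin i: the mean of
    contacts between the `window` bins upstream and the `window` bins
    downstream. Positions too close to the matrix edge stay NaN.
    """
    n = matrix.shape[0]
    scores = np.full(n, np.nan)
    for i in range(window, n - window + 1):
        scores[i] = matrix[i - window:i, i:i + window].mean()
    return scores

# Toy matrix with two 4-bin domains: dense contacts inside each domain,
# sparse contacts between them (made-up values).
toy = np.ones((8, 8))
toy[:4, :4] = 10
toy[4:, 4:] = 10

scores = insulation_score(toy, window=2)
boundary = int(np.nanargmin(scores))  # index with the lowest cross-boundary contact
print(scores, boundary)
```

In this toy case the lowest score falls between bins 3 and 4, where the two domains meet. Real Hi-C data are noisy and distance-dependent, so practical methods normalize the matrix and use more robust criteria.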

This work was supported by UCCS Committee on Research and Creative Works (CRCW) Award: 2020-2022


  1. Higgins, S., Akpokiro, V., Westcott, A. & Oluwadare, O. (2022). TADMaster: a comprehensive web-based tool for the analysis of topologically associated domains. BMC Bioinformatics 23, 463. [@ BMC Bioinformatics]

  2. Trieu, Tuan*, Oluwatosin Oluwadare*, Julia Wopata, and Jianlin Cheng. GenomeFlow: A Comprehensive Graphical Tool for Modeling and Analyzing 3D Genome Structure. Bioinformatics (2018). (* Co-first author) [@ Bioinformatics]

  3. Oluwadare, Oluwatosin, and Jianlin Cheng. ClusterTAD: an unsupervised machine learning approach to detecting topologically associated domains of chromosomes from Hi-C data. BMC Bioinformatics 18.1 (2017): 480. [@ BMC Bioinformatics]


All our algorithms are public, open-source, and freely accessible through our GitHub repository.