Splice Site Prediction using Deep Learning

Keywords: Splice Site; Machine Learning; Deep Learning; Convolutional Neural Network; Ribonucleic Acid; Genome
Spring 2021 - present

EnsembleSplice Model Pipeline


The nucleotide sequence of a precursor messenger ribonucleic acid (pre-mRNA) transcribed from a protein-coding gene is divided into non-coding regions (introns) and coding regions (exons). Before the mRNA carries the genetic information out of the nucleus to be expressed (translated into protein), splicing occurs. This process removes the introns from the transcript through an RNA-protein complex called the spliceosome. A splice site is a point where an exon and an intron meet. The acceptor splice site lies at the intron-exon boundary and is marked by the consensus dinucleotide Adenine-Guanine (AG), read in the 5' to 3' orientation. The donor splice site lies at the exon-intron boundary and is marked by the consensus dinucleotide Guanine-Thymine (GT; GU in RNA), also read in the 5' to 3' orientation.
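As an illustration of the consensus dinucleotides described above, the following minimal Python sketch scans a DNA string for candidate canonical sites. The function name find_canonical_sites is hypothetical and is not part of EnsembleSplice or DeepSplicer; the sketch assumes a plain DNA-alphabet string read 5' to 3'. Most GT and AG occurrences in a genome are not true splice sites, which is why a learned model is needed to separate real sites from decoys.

```python
def find_canonical_sites(seq):
    """Return 0-based start positions of candidate donor (GT) and
    acceptor (AG) dinucleotides in a DNA string read 5' to 3'.

    Illustrative helper only (an assumption, not the published method):
    it lists every occurrence of the consensus dinucleotides, true
    splice sites and decoys alike.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return donors, acceptors


# Usage: a short made-up sequence, not real genomic data.
donors, acceptors = find_canonical_sites("AAGGTCCAGTT")
print("candidate donors:", donors)        # [3, 8]
print("candidate acceptors:", acceptors)  # [1, 7]
```

A predictor would typically take a fixed-length window of sequence around each candidate position and classify it as a true or false splice site.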

Although canonical AG/GT splice sites make up nearly all splice sites, accurate splice site prediction also makes it possible to predict alternative splicing. Alternative splicing is an important feature of eukaryotes: it allows them to produce different proteins from a single gene. Our research focuses on splice site prediction, because precise splice site localization contributes significantly to identifying and analyzing gene structure and function.


  1. Akpokiro, V., Martin, T., & Oluwadare, O. (2022). EnsembleSplice: Ensemble Learning Model for Splice Site Prediction. BMC Bioinformatics, 23, 413. https://doi.org/10.1186/s12859-022-04971-w

  2. Akpokiro, V., Oluwadare, O., & Kalita, J. (2021, December). DeepSplicer: An Improved Method of Splice Sites Prediction using Deep Learning. In 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA) (pp. 606-609). IEEE.


All our algorithms are public, open-source, and freely accessible through our GitHub repository.