Machine Learning for High-Impact Applications: Pattern Recognition in Mammalian DNA Motifs and Human Gene Editing Off-Target Predictions

Speaker:  Ka-Chun Wong – Kowloon Tong, Hong Kong
Topic(s):  Applied Computing

Abstract

In this talk, I will present my research group’s recent advances in machine learning for complex biological systems, focusing on two pattern recognition tasks in genomics. The first task is to elucidate DNA motif patterns from large-scale sequence data; the second is to predict the off-targets of CRISPR-Cas9 gene editing using deep learning.
 
(First Task) In higher eukaryotes, protein-DNA binding interactions are central to gene regulation. In particular, DNA motifs such as transcription factor binding sites are key components of gene transcription. With chromatin interaction data now available, computational methods are needed to systematically identify coupled DNA motif pairs enriched in long-range chromatin-interacting sequence pairs (e.g. promoter-enhancer pairs). To fill this gap, a novel probabilistic model, MotifHyades, is proposed and developed for de novo DNA motif pair discovery on paired sequences. Two expectation maximization algorithms are derived for efficient model training, with linear computational complexity on big sequence data. [1]
 
(Second Task) The prediction of off-target mutations in CRISPR-Cas9 is a hot topic because of its relevance to human gene editing research. Most existing prediction methods simply compute scores based on mismatches to the guide sequence. As a result, they cannot scale or improve their performance as CRISPR-Cas9 experimental data rapidly expands. Moreover, their off-target predictions are not yet precise enough for gene editing at the clinical level. To address these limitations, we design and implement two deep neural network algorithms, a deep convolutional neural network and a deep feedforward neural network, to predict off-target mutations in CRISPR-Cas9 gene editing. [2]
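To make the contrast concrete, the sketch below shows a plain mismatch count of the kind a score-based method might start from, next to one way a guide/site pair could be turned into a fixed-size matrix for a neural network. It is an illustration only, not the method of [2]: the function names, the toy sequences, and the element-wise OR encoding are all assumptions chosen for this example.

```python
from typing import List

BASES = "ACGT"  # column order of the one-hot encoding (an assumption)


def mismatch_count(guide: str, site: str) -> int:
    """Count positions where a candidate site differs from the guide."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(g != s for g, s in zip(guide.upper(), site.upper()))


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA sequence as a len(seq) x 4 matrix over A, C, G, T.

    Characters outside ACGT (e.g. N) become an all-zero row.
    """
    return [[int(base == b) for b in BASES] for base in seq.upper()]


def encode_pair(guide: str, site: str) -> List[List[int]]:
    """Combine two one-hot encodings with an element-wise OR.

    A matching position keeps a single 1 in its row; a mismatching
    position has two 1s, so the matrix records both bases involved.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return [
        [g | s for g, s in zip(g_row, s_row)]
        for g_row, s_row in zip(one_hot(guide), one_hot(site))
    ]


# Hypothetical toy sequences, shorter than a real guide.
guide = "ACGTACGT"
site = "ACGTTCGA"

n_mismatches = mismatch_count(guide, site)  # 2 mismatches (positions 4 and 7)
features = encode_pair(guide, site)         # 8 x 4 matrix of 0s and 1s
```

A mismatch count reduces each pair to a single number, whereas a matrix like `features` keeps the position and identity of every mismatch, which is the kind of input a convolutional or feedforward network can learn from as more experimental data becomes available.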
 
If time permits, I will discuss the latest breakthroughs in machine learning for other high-impact applications, made by my research group and others.
 
References
[1] Ka-Chun Wong* (*sole author). "MotifHyades: expectation maximization for de novo DNA motif pair discovery on paired sequences." Bioinformatics 33.19 (2017): 3028-3035.
 
[2] Jiecong Lin and Ka-Chun Wong* (*corresponding author). "Off-target predictions in CRISPR-Cas9 gene editing using deep learning." Bioinformatics (2018), ECCB 2018 proceedings special issue.

About this Lecture

Number of Slides:  70
Duration:  45 minutes
Languages Available:  English
Last Updated: 
