Ali Sharifi-Zarchi
Using Deep Neural Networks to Understand the Cell Identity by Expression Fingerprints
Halls department, Hall 3
Thursday, 29 December 2016
10:30 - 11:30
Understanding cell identity is critically important in many biomedical areas, such as regenerative medicine and cancer research. The expression patterns of a few marker genes have traditionally been used to assign cells to a limited number of cell types. This approach has two limitations: for many cell types no markers are known that characterize them accurately, and many markers are expressed in more than one cell type. A possible answer is to use whole-genome gene expression profiles (GEPs), but it has been computationally challenging to decide which genes most accurately characterize cell identity. Classical machine learning approaches, such as simple classification or clustering algorithms, have been applied to this problem as well as to many other biological problems. Many aspects of biology, however, are too sophisticated to be modeled accurately by such simple approaches. During the past few years, deep learning methods have produced promising results in learning patterns in games, images, and video. Their application in biology and health, however, has been limited. Here we analyzed a massive number of gene and miRNA expression profiles, measured on both microarray and next-generation sequencing (NGS) platforms, to learn the more sophisticated biological properties of the data. After analyzing different architectures, we identified a specific deep autoencoder architecture that can compress a whole gene expression profile into a small gene expression fingerprint (GEF) of as few as 30 numeric values, from which the expression values of tens of thousands of genes can be reproduced with an accuracy comparable to that of technical replicates of the same experiment. We show that the individual values of the GEFs represent different biological pathways or processes, which are learned in an unsupervised manner. Furthermore, cell identity can be inferred from the GEFs with very high accuracy, comparable to that of state-of-the-art tools that work on the whole GEP.
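For readers unfamiliar with autoencoders, the idea of compressing a profile into a 30-value fingerprint and reconstructing it can be sketched as follows. This is only an illustrative toy, not the architecture presented in the talk: the abstract does not specify the depth, activations, or training procedure, so the single tanh bottleneck layer, the plain gradient-descent loop, the synthetic data, and all function names here are assumptions. Only the fingerprint size of 30 is taken from the abstract.

```python
import numpy as np


def make_synthetic_profiles(n_samples=100, n_genes=100, n_factors=3, seed=0):
    """Hypothetical stand-in for expression data: low-rank signal plus noise."""
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(n_samples, n_factors))
    loadings = rng.normal(size=(n_factors, n_genes))
    x = factors @ loadings + 0.1 * rng.normal(size=(n_samples, n_genes))
    # Standardize each gene (column) to zero mean and unit variance.
    return (x - x.mean(axis=0)) / x.std(axis=0)


def init_params(n_genes, n_code=30, seed=0):
    """Random weights for one encoder layer and one decoder layer."""
    rng = np.random.default_rng(seed)
    return {
        "W_enc": rng.normal(scale=1.0 / np.sqrt(n_genes), size=(n_genes, n_code)),
        "b_enc": np.zeros(n_code),
        "W_dec": rng.normal(scale=1.0 / np.sqrt(n_code), size=(n_code, n_genes)),
        "b_dec": np.zeros(n_genes),
    }


def encode(params, x):
    """Map profiles (samples x genes) to fingerprints (samples x n_code)."""
    return np.tanh(x @ params["W_enc"] + params["b_enc"])


def decode(params, code):
    """Map fingerprints back to reconstructed profiles."""
    return code @ params["W_dec"] + params["b_dec"]


def reconstruction_mse(params, x):
    """Mean squared error between profiles and their reconstructions."""
    return float(np.mean((decode(params, encode(params, x)) - x) ** 2))


def train(params, x, lr=0.001, epochs=500):
    """Full-batch gradient descent on the squared reconstruction error."""
    params = {name: value.copy() for name, value in params.items()}
    n_samples = x.shape[0]
    for _ in range(epochs):
        code = encode(params, x)
        recon = decode(params, code)
        # Gradient of sum((recon - x)**2) / (2 * n_samples) w.r.t. recon.
        d_recon = (recon - x) / n_samples
        # Backpropagate through the decoder, then through tanh.
        d_pre = (d_recon @ params["W_dec"].T) * (1.0 - code ** 2)
        grads = {
            "W_dec": code.T @ d_recon,
            "b_dec": d_recon.sum(axis=0),
            "W_enc": x.T @ d_pre,
            "b_enc": d_pre.sum(axis=0),
        }
        for name in params:
            params[name] -= lr * grads[name]
    return params


if __name__ == "__main__":
    profiles = make_synthetic_profiles()
    untrained = init_params(profiles.shape[1])
    trained = train(untrained, profiles)
    print("fingerprint shape:", encode(trained, profiles).shape)
    print("MSE before training:", reconstruction_mse(untrained, profiles))
    print("MSE after training:", reconstruction_mse(trained, profiles))
```

Each row returned by `encode` plays the role of a fingerprint, and `decode` attempts to recover the full profile from it. A deep autoencoder of the kind described in the abstract would stack several such layers on each side of the bottleneck and would be trained on real expression data rather than synthetic profiles.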
Ali Sharifi-Zarchi received his Bachelor's and Master's degrees in Computer Engineering from the Sharif University of Technology, and his Ph.D. in bioinformatics from the Institute of Biochemistry and Biophysics at the University of Tehran under the supervision of Dr. Mehdi Sadeghi and Dr. Hamid Pezeshk. He now conducts bioinformatics research at Royan Institute and is an assistant professor of bioinformatics in the Department of Computer Engineering at the Sharif University of Technology.