Disease state prediction from single-cell data using graph attention networks

Neal Ravindra, Arijit Sehanobish, Jenna L. Pappalardo, David A. Hafler, David van Dijk

View paper in the ACM journal

Abstract: Single-cell RNA sequencing (scRNA-seq) has revolutionized bio-logical discovery, providing an unbiased picture of cellular heterogeneity in tissues. While scRNA-seq has been used extensively to provide insight into health and disease, it has not been used for disease prediction or diagnostics. Graph Attention Networks have proven to be versatile for a wide range of tasks by learning from both original features and graph structures. Here we present a graph attention model for predicting disease state from single-cell data on a large dataset of Multiple Sclerosis (MS) patients. MS is a disease of the central nervous system that is difficult to diagnose. We train our model on single-cell data obtained from blood and cerebrospinal fluid (CSF) for a cohort of seven MS patients and six healthy adults (HA), resulting in 66,667 individual cells. We achieve 92% accuracy in predicting MS, outperforming other state-of-the-art methods such as a graph convolutional network, random forest, and multi-layer perceptron. Further, we use the learned graph attention model to get insight into the features (cell types and genes) that are important for this prediction. The graph attention model also allow us to infer a new feature space for the cells that emphasizes the difference between the two conditions. Finally we use the attention weights to learn a new low-dimensional embedding which we visualize with PHATE and UMAP. To the best of our knowledge, this is the first effort to use graph attention, and deep learning in general, to predict disease state from single-cell data. We envision applying this method to single-cell data for other diseases.