Interpretable Machine Learning Prediction of All-cause Mortality

Wei Qiu, Hugh Chen, Ayse Berceste Dincer, and Su-In Lee (Paul G. Allen School of Computer Science and Engineering, University of Washington)

Abstract: Explainable artificial intelligence provides an opportunity to improve prediction accuracy over standard linear models using 'black box' machine learning (ML) models while still revealing insights into a complex outcome such as all-cause mortality. We propose the IMPACT (Interpretable Machine learning Prediction of All-Cause morTality) framework, which implements and explains complex, non-linear ML models in epidemiological research by combining a tree ensemble mortality prediction model with an explainability method. We use 133 variables from the NHANES 1999-2014 datasets (number of samples: n = 47,261) to predict all-cause mortality. To explain our model, we extract local (i.e., per-sample) explanations to verify well-studied mortality risk factors and to make new discoveries. We present major factors for predicting 1-, 3-, and 5-year mortality across different age groups and their individualized impact on mortality prediction. Moreover, we highlight interactions between risk factors associated with mortality prediction, which lead to findings that linear models do not reveal. We demonstrate that, compared with traditional linear models, tree-based models have unique strengths such as: (1) improving prediction power, (2) making no distribution assumptions, (3) capturing non-linear relationships and important thresholds, (4) identifying feature interactions, and (5) detecting different non-linear relationships between models. Given the popularity of complex ML models in prognostic research, combining these models with explainability methods has implications for further applications of ML in medical fields. To our knowledge, this is the first study that combines complex ML models and state-of-the-art feature attributions to explain mortality prediction, which enables us to achieve higher prediction accuracy and gain new insights into the effect of risk factors on mortality.