Keynotes from CHIL 2022

Algorithmic fairness and the science of health disparities

Rumi Chunara / New York University

Abstract: It has been shown that equalizing health disparities can avert more deaths than the number of lives saved by medical advances alone in the same time frame. Moreover, without a simultaneous focus on innovations and equity, advances in health for one group can occur at the cost of added challenges for another. In this talk I will introduce the science of health disparities and juxtapose it with the machine learning subfield of algorithmic fairness. Given the key foci and principles of health equity and health disparities within public and population health, I will show examples of how machine learning and principles of public and population health can be synergized for using data to advance the science of health disparities and sustainable health of entire populations.

Bio: Dr. Rumi Chunara is an Associate Professor at New York University, jointly appointed at the Tandon School of Engineering (in Computer Science) and the School of Global Public Health (in Biostatistics/Epidemiology). Her PhD is from the Harvard-MIT Division of Health Sciences and Technology and her BSc from Caltech. Her research group focuses on developing computational and statistical approaches for acquiring, integrating and using data to improve population and public health. She is an MIT TR35, NSF CAREER, Bill & Melinda Gates Foundation Grand Challenges, Facebook Research and Max Planck Sabbatical award winner.

Machine Learning for Human Genetics: A Multi-Scale View on Complex Traits and Disease

Lorin Crawford / Microsoft Research New England; Brown University

Abstract: A common goal in genome-wide association (GWA) studies is to characterize the relationship between genotypic and phenotypic variation. Linear models are widely used tools in GWA analyses, in part because they provide significance measures which detail how individual single nucleotide polymorphisms (SNPs) are statistically associated with a trait or disease of interest. However, traditional linear regression largely ignores non-additive genetic variation, and the univariate SNP-level mapping approach has been shown to be underpowered and challenging to interpret for certain trait architectures. While machine learning (ML) methods such as neural networks are well known to account for complex data structures, these same algorithms have also been criticized as “black box” since they do not naturally carry out statistical hypothesis testing like classic linear models. This limitation has prevented ML approaches from being used for association mapping tasks in GWA applications. In this talk, we present flexible and scalable classes of Bayesian feedforward models which provide interpretable probabilistic summaries, such as posterior inclusion probabilities and credible sets, which allow researchers to simultaneously perform (i) fine-mapping with SNPs and (ii) enrichment analyses with SNP-sets on complex traits. While analyzing real data assayed in diverse self-identified human ancestries from the UK Biobank, the Biobank Japan, and the PAGE consortium, we demonstrate that interpretable ML has the power to increase the return on investment in multi-ancestry biobanks. Furthermore, we highlight that by prioritizing biological mechanism we can identify associations that are robust across ancestries, suggesting that ML can play a key role in making personalized medicine a reality for all.

Bio: Lorin Crawford is a Senior Researcher at Microsoft Research New England. He also holds a position as the RGSS Assistant Professor of Biostatistics at Brown University. His scientific research interests involve the development of novel and efficient computational methodologies to address complex problems in statistical genetics, cancer pharmacology, and radiomics (e.g., cancer imaging). Dr. Crawford has an extensive background in modeling massive data sets of high-throughput molecular information as it pertains to functional genomics and cellular-based biological processes. His most recent work has earned him a place on the Forbes 30 Under 30 list and The Root 100 Most Influential African Americans list, as well as an Alfred P. Sloan Research Fellowship and a David & Lucile Packard Foundation Fellowship for Science and Engineering. Before joining Brown, Dr. Crawford received his PhD from the Department of Statistical Science at Duke University and his Bachelor of Science degree in Mathematics from Clark Atlanta University.

Understanding Heterogeneity as a Route to Understanding Health

Danielle Belgrave / DeepMind

Abstract: Machine learning presents an opportunity to understand the patient journey over high dimensional data in the clinical context. This is aligned to one of the foundational issues of machine learning for healthcare: how do you represent a patient state? Improving state representations allows us to (i) visualise/cluster deteriorating patients; (ii) understand the patient journey and thus heterogeneous pathways to improvement or clinical deterioration which encompasses different data modalities; and thus (iii) more quickly identify situations for intervention. In this talk, I present motivating examples of understanding heterogeneity as a route towards understanding health and personalising healthcare interventions.

Bio: Danielle Belgrave is a Senior Staff Research Scientist at DeepMind. Prior to joining DeepMind she worked in the Healthcare Intelligence group at Microsoft Research and was a tenured research fellow at Imperial College London. Her research focuses on integrating medical domain knowledge, machine learning and causal modelling frameworks to understand health. She obtained a BSc in Mathematics and Statistics from the London School of Economics, an MSc in Statistics from University College London and a PhD in the area of machine learning in health applications from the University of Manchester.

Data Science against COVID-19

Nuria Oliver / ELLIS

Abstract: In my talk, I will describe the work that I have been doing since March 2020, leading a multi-disciplinary team of 20+ volunteer scientists working very closely with the Presidency of the Valencian Government in Spain on four large areas: (1) human mobility modeling; (2) computational epidemiological models (metapopulation, individual and LSTM-based models); (3) predictive models; and (4) a large-scale online citizen survey called the COVID19impactsurvey (https://covid19impactsurvey.org) with over 700,000 answers worldwide. This survey has enabled us to shed light on the impact that the pandemic is having on people's lives. I will present the results obtained in each of these four areas, including winning the 500K XPRIZE Pandemic Response Challenge and obtaining a best paper award at ECML-PKDD 2021. I will share the lessons learned in this very special initiative of collaboration between the civil society at large (through the survey), the scientific community (through the Expert Group) and a public administration (through the Commissioner at the Presidency level). For those interested in knowing more, WIRED magazine published an extensive article describing our story: https://www.wired.co.uk/article/valencia-ai-covid-data.

Bio: Nuria Oliver is Co-founder and Vice-president of ELLIS (The European Laboratory for Learning and Intelligent Systems), Co-founder and Director of the ELLIS Unit Alicante, Chief Data Scientist at Data-Pop Alliance and Chief Scientific Advisor to the Vodafone Institute. Nuria earned her PhD from MIT. She is a Fellow of the ACM, IEEE and EurAI. She is the youngest member (and fourth female) in the Spanish Royal Academy of Engineering. She is also the only Spanish scientist in the SIGCHI Academy. She has over 25 years of research experience in human-centric AI and is the author of over 180 widely cited scientific articles as well as an inventor of 40+ patents and a public speaker. Her work is regularly featured in the media and has received numerous recognitions, including the Spanish National Computer Science Award, the MIT TR100 (today TR35) Young Innovator Award (first Spanish scientist to receive this award), the 2020 Data Scientist of the Year by ESRI, the 2021 King Jaume I award in New Technologies and the 2021 Abie Technology Leadership Award. In March of 2020, she was appointed Commissioner to the President of the Valencian Government on AI Strategy and Data Science against COVID-19. In that role, she has recently co-led ValenciaIA4COVID, the winning team of the 500k XPRIZE Pandemic Response Challenge. Their work was featured in WIRED, among other media.

Machine Learning in Public Health: are we there yet?

Jessica Tenenbaum / North Carolina Department of Health and Human Services; Duke University School of Medicine

Abstract: Spoiler alert: No. And yes, it is much, much further. Public health has not traditionally been a data-driven field. The good news is that this has been changing in recent years, accelerated significantly by the COVID epidemic. But public health and human services organizations have many more fundamental things to worry about before we will have the luxury of considering what machine learning can enable. These fundamentals include data-related facets such as electronic data capture and exchange, data quality, data governance, information technology infrastructure, and data management best practices. In addition, data literacy, workforce development, and compensation that is a fraction of what 'quants' can earn in industry are also major stumbling blocks toward advanced analytics in public health. At the start of the COVID pandemic, many communicable diseases were reported by fax machine and then hand-entered into a database. Although there was significant interest in predictive modeling to project hospital capacity out in the future, even the most sophisticated models were of limited use to policy makers beyond basic trends and observations from the front lines. The most notable exception, where AI is in fact proving useful in public health, is in the use of 'robotic process automation' (RPA) as a band-aid for poorly designed systems that require mindless human intervention. These tools serve as workarounds for systems that lack interoperability by emulating human users to do the grunt work of data entry and wrangling. This talk will be a reality check from the trenches of state government on the heels of the COVID-19 pandemic.

Bio: Dr. Tenenbaum serves as the Chief Data Officer (CDO) for DHHS, where she oversees data strategy across the Department enabling the use of information to inform and evaluate policy and improve the health and well-being of residents of North Carolina. Prior to taking on the role of CDO, Dr. Tenenbaum was a founding faculty member of the Division of Translational Biomedical Informatics within Duke University's Department of Biostatistics and Bioinformatics where her research focused on informatics methods to enable precision medicine, particularly in mental health. She is also interested in ethical, legal, and social issues around big data and precision medicine. Nationally, Dr. Tenenbaum has served as Associate Editor for the Journal of Biomedical Informatics and as an elected member of the Board of Directors for the American Medical Informatics Association (AMIA). She currently serves on the Board of Scientific Counselors for the National Library of Medicine. After earning her bachelor's degree in biology from Harvard, Dr. Tenenbaum was a Program Manager at Microsoft Corporation in Redmond, WA for six years before pursuing a PhD in biomedical informatics at Stanford University. Dr. Tenenbaum is a strong promoter and advocate of young women interested in STEM (science, technology, engineering, and math) careers.

Reducing bias in machine learning systems: Understanding drivers of pain

Jure Leskovec / Stanford University

Abstract: AI systems tend to amplify biases and disparities. When we feed them data that reflects our biases, they mimic them, from antisemitic chatbots to racially biased software. In this talk I am going to discuss two examples of how AI can help us reduce biases and disparities. First I am going to explain how we can use AI to understand why underserved populations experience higher levels of pain. This is true even after controlling for the objective severity of diseases like osteoarthritis, as graded by human physicians using medical images, which raises the possibility that underserved patients’ pain stems from factors external to the knee, such as stress. We develop a deep learning approach to measure the severity of osteoarthritis by using knee X-rays to predict patients’ experienced pain, and show that this approach dramatically reduces unexplained racial disparities in pain.

Bio: Jure Leskovec is an associate professor of Computer Science at Stanford University, the Chief Scientist at Pinterest, and an Investigator at the Chan Zuckerberg Biohub. He co-founded the machine learning startup Kosei, which was later acquired by Pinterest. Leskovec's research area is machine learning and data science for complex, richly-labeled relational structures, graphs, and networks for systems at all scales, from interactions of proteins in a cell to interactions between humans in a society. Applications include commonsense reasoning, recommender systems, social network analysis, computational social science, and computational biology with an emphasis on drug discovery. This research has won several awards including a Lagrange Prize, Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, and numerous best paper and test of time awards. It has also been featured in popular press outlets such as the New York Times and the Wall Street Journal. Leskovec received his bachelor's degree in computer science from the University of Ljubljana, Slovenia, his PhD in machine learning from Carnegie Mellon University, and postdoctoral training at Cornell University. You can follow him on Twitter at @jure.