Explainable ML: Understanding the Limits and Pushing the Boundaries

Hima Lakkaraju

Abstract: As machine learning black boxes are increasingly being deployed in domains such as healthcare and criminal justice, there is growing emphasis on building tools and techniques for explaining these black boxes in a post hoc manner. Such explanations are being leveraged by domain experts to diagnose systematic errors and underlying biases of black boxes. However, recent research has shed light on the vulnerabilities of popular post hoc explanation techniques. In this tutorial, I will provide a brief overview of post hoc explanation methods with special emphasis on feature attribution methods such as LIME and SHAP. I will then discuss recent research which demonstrates that these methods are brittle, unstable, and vulnerable to a variety of adversarial attacks. Lastly, I will present two solutions to address some of the vulnerabilities of these methods: (i) a generic framework based on adversarial training that is designed to make post hoc explanations more stable and robust to shifts in the underlying data, and (ii) a Bayesian framework that captures the uncertainty associated with post hoc explanations and in turn allows us to generate reliable explanations which satisfy user-specified levels of confidence. Overall, this tutorial will provide a bird's-eye view of the state-of-the-art in the burgeoning field of explainable machine learning.
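To make the idea of feature attribution concrete, the sketch below implements a simplified LIME-style local surrogate: perturb the input, query the black box, and fit a weighted linear model whose coefficients serve as per-feature attributions. This is an illustrative simplification written from scratch (function name, kernel choice, and sampling scheme are assumptions for this sketch), not the official LIME implementation.

```python
import numpy as np

def lime_style_attribution(predict_fn, x, n_samples=500, scale=0.1, seed=0):
    """Fit a local linear surrogate to a black-box model around instance x.

    The surrogate's coefficients act as per-feature attributions, in the
    spirit of LIME (illustrative simplification, not the LIME library).
    """
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance with Gaussian noise to sample its neighborhood
    Z = x + rng.normal(scale=scale, size=(n_samples, x.size))
    # 2. Query the black box on the perturbed samples
    y = predict_fn(Z)
    # 3. Weight samples by proximity to x (exponential kernel)
    w = np.exp(-np.linalg.norm(Z - x, axis=1) ** 2 / (2 * scale ** 2))
    # 4. Weighted least squares for the local linear surrogate
    A = np.hstack([Z, np.ones((n_samples, 1))])  # append intercept column
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)
    return coef[:-1]  # drop intercept; per-feature attributions

# Sanity check with a black box whose true feature weights are known
black_box = lambda X: X @ np.array([3.0, -2.0, 0.0])
attr = lime_style_attribution(black_box, np.array([1.0, 1.0, 1.0]))
# attr recovers the black box's true weights, approximately [3, -2, 0]
```

Because the toy black box is exactly linear, the surrogate recovers its weights; for a genuinely nonlinear model the attributions instead describe the local behavior around x, which is precisely where the brittleness discussed in the tutorial arises.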

Bio: Hima Lakkaraju is an Assistant Professor at Harvard University focusing on the explainability, fairness, and robustness of machine learning models. She has also been working with various domain experts in criminal justice and healthcare to understand the real-world implications of explainable and fair ML. Hima has recently been named one of the 35 Innovators Under 35 by MIT Tech Review, and has received best paper awards at the SIAM International Conference on Data Mining (SDM) and INFORMS. She has given invited workshop talks at ICML, NeurIPS, AAAI, and CVPR, and her research has also been covered by various popular media outlets including the New York Times, MIT Tech Review, TIME, and Forbes. For more information, please visit: https://himalakkaraju.github.io