
Biomedical Question Answering Yesterday, Today, and Tomorrow
Abstract: The traditional knowledge-based approaches to question answering might seem irrelevant now that Neural QA, particularly Large Language Models, shows almost human performance in question answering. Knowing what was successful in the past and which elements are essential to getting the right answers, however, is needed to inform further developments in the neural approaches and help address the known shortcomings of LLMs. This talk, therefore, will provide an overview of the approaches to biomedical question answering as they evolved. It will cover information needs of various stakeholders and the resources created to address these information needs through Question Answering.
Bio: Dina Demner-Fushman, MD, PhD is a Tenure Track Investigator in the Computational Health Research Branch at LHNCBC. She specializes in artificial intelligence and natural language processing, with a focus on information extraction and textual data analysis, EMR data analysis, and image and text retrieval for clinical decision support and education. Dr. Demner-Fushman's research aims to improve healthcare through the development of computational methods that can process and analyze clinical data more effectively. Her research led to the current iteration of the MEDLINE resource, which helps people navigate a plethora of NLM resources, as well as Open-i, which helps find biomedical images.

Safe Deployment of Medical Imaging AI
Abstract: Artificial intelligence could fundamentally transform clinical workflows in image-based diagnostics and population screening, promising more objective, accurate and effective analysis of medical images. A major hurdle for using medical imaging AI in clinical practice, however, is assuring that it is safe for patients and remains safe after deployment. Differences in patient populations and changes in the data acquisition pose challenges to today's AI algorithms. In this talk we will discuss AI safeguards from the perspective of robustness, reliability, and fairness. We will explore approaches for automatic failure detection, monitoring of performance, and analysis of bias, aiming to ensure the safe and ethical use of medical imaging AI.
Bio: Ben Glocker is Professor in Machine Learning for Imaging and Kheiron Medical Technologies / Royal Academy of Engineering Research Chair in Safe Deployment of Medical Imaging AI. He co-leads the Biomedical Image Analysis Group, leads the HeartFlow-Imperial Research Team, and is Head of ML Research at Kheiron. His research is at the intersection of medical imaging and artificial intelligence aiming to build safe and ethical computational tools for improving image-based detection and diagnosis of disease.

Biological Sequence Modeling in Research and Applications
Abstract: Biological sequences, like DNA and protein sequences, encode genetic information essential to life. In recent times, deep learning techniques have transformed biomedical research and applications by modeling the intricate patterns in these sequences. Successful models like AlphaFold and Enformer have paved the way for accurate end-to-end prediction of complex molecular phenotypes from sequences. Such models have profound impact on biomedical research and applications, ranging from understanding basic biology to facilitating drug discovery. This talk will provide an overview of the current techniques and status of biological sequences modeling. Additionally, specific applications of such models in genetics and immunology will be discussed.
Bio: Jun Cheng is a Senior Research Scientist at DeepMind. His research focused on developing machine learning methods to better understand the genetic code and disease mechanisms. Before that, he was a scientist at NEC Labs Europe, where he worked on personalized cancer vaccines. His work has been published in venues such as Genome Biology, Bioinformatics, and Nature Biotechnology. He received his PhD in computational biology from the Technical University of Munich.

Bridging Machine Learning and Collaborative Action Research: A Tale Engaging with Diverse Stakeholders in Digital Mental Health
Abstract: Digital traces, such as social media data, supported with advances in the artificial intelligence (AI) and machine learning (ML) fields, are increasingly being used to understand the mental health of individuals, communities, and populations. However, such algorithms do not exist in a vacuum -- there is an intertwined relationship between what an algorithm does and the world it exists in. Consequently, with algorithmic approaches offering promise to change the status quo in mental health for the first time since the mid-20th century, interdisciplinary collaborations are paramount. But what are some paradigms of engagement for AI/ML researchers that augment existing algorithmic capabilities while minimizing the risk of harm? Adopting a social ecological lens, this talk will describe the experiences from working with different stakeholders in research initiatives relating to digital mental health – including with healthcare providers, grassroots advocacy and public health organizations, and people with the lived experience of mental illness. The talk hopes to present some lessons learned by way of these engagements, and to reflect on a path forward that empowers us to go beyond technical innovations to envisioning contributions that center humans’ needs, expectations, values, and voices within those technical artifacts.
Bio: Munmun De Choudhury is an Associate Professor of Interactive Computing at Georgia Tech. Dr. De Choudhury is best known for laying the foundation of a new line of research that develops computational techniques towards understanding and improving mental health outcomes, through ethical analysis of social media data. To do this work, she adopts a highly interdisciplinary approach, combining social computing, machine learning, and natural language analysis with insights and theories from the social, behavioral, and health sciences. Dr. De Choudhury has been recognized with the 2023 SIGCHI Societal Impact Award, the 2022 Web Science Trust Test-of-Time Award, the 2021 ACM-W Rising Star Award, the 2019 Complex Systems Society – Junior Scientific Award, numerous best paper and honorable mention awards from the ACM and AAAI, and features and coverage in popular press like the New York Times, the NPR, and the BBC. Earlier, Dr. De Choudhury was a faculty associate with the Berkman Klein Center for Internet and Society at Harvard, a postdoc at Microsoft Research, and obtained her PhD in Computer Science from Arizona State University.

Title to be Announced
Abstract: TBD
Bio: Dina Katabi is the Thuan and Nicole Pham Professor of Electrical Engineering and Computer Science at MIT. She is also the director of MIT’s Center for Wireless Networks and Mobile Computing, a member of the National Academy of Engineering, and a recipient of the MacArthur Genius Award. Professor Katabi received her PhD and MS from MIT in 2003 and 1999, and her Bachelor of Science from Damascus University in 1995. Katabi's research focuses on innovations in digital health, applied machine learning and wireless sensors and networks. Her research has been recognized with the ACM Prize in Computing, the ACM Grace Murray Hopper Award, two SIGCOMM Test-of-Time Awards, the Faculty Research Innovation Fellowship, a Sloan Fellowship, the NBX Career Development chair, and the NSF CAREER award. Her students received the ACM Best Doctoral Dissertation Award in Computer Science and Engineering twice. Further, her work was recognized by the IEEE William R. Bennett prize, three ACM SIGCOMM Best Paper awards, an NSDI Best Paper award and a TR10 award. Several start-ups have been spun out of Katabi's lab, such as PiCharging and Emerald.

Skin in the Game: The State of AI in Dermatology
Abstract: Artificial intelligence tools have been touted as having performance "on par" with board certified dermatologists. However, these published claims have not translated to real world practice. In this talk, I will discuss the opportunities and challenges for AI in dermatology.
Bio: Dr. Roxana Daneshjou received her undergraduate degree at Rice University in Bioengineering, where she was recognized as a Goldwater Scholar for her research. She completed her MD/PhD at Stanford, where she worked in the lab of Dr. Russ Altman. During this time, she was a Howard Hughes Medical Institute Medical Scholar and a Paul and Daisy Soros Fellowship for New Americans Fellow. She completed dermatology residency at Stanford in the research track and now practices dermatology as a Clinical Scholar in Stanford's Department of Dermatology while also conducting artificial intelligence research with Dr. James Zou as a postdoc in Biomedical Data Science. She is an incoming assistant professor of biomedical data science and dermatology at Stanford in Fall of 2023. Her research interests are in developing diverse datasets and fair algorithms for applications in precision medicine.

Using Machine Learning to Increase Equity in Healthcare and Public Health
Abstract: Our society remains profoundly unequal. This talk discusses how data science and machine learning can be used to combat inequality in health care and public health by presenting several vignettes from domains like medical testing and cancer risk prediction.
Bio: Emma Pierson is an assistant professor of computer science at the Jacobs Technion-Cornell Institute at Cornell Tech and the Technion, and a computer science field member at Cornell University. She holds a secondary joint appointment as an Assistant Professor of Population Health Sciences at Weill Cornell Medical College. She develops data science and machine learning methods to study inequality and healthcare. Her work has been recognized by best paper, poster, and talk awards, an NSF CAREER award, a Rhodes Scholarship, Hertz Fellowship, Rising Star in EECS, MIT Technology Review 35 Innovators Under 35, and Forbes 30 Under 30 in Science. Her research has been published at venues including ICML, KDD, WWW, Nature, and Nature Medicine, and she has also written for The New York Times, FiveThirtyEight, Wired, and various other publications.

Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?
Bio: Isaac “Zak” Kohane, MD, PhD, is the inaugural chair of Harvard Medical School’s Department of Biomedical Informatics, whose mission is to develop the methods, tools, and infrastructure required for a new generation of scientists and care providers to move biomedicine rapidly forward by taking advantage of the insight and precision offered by big data. Kohane develops and applies computational techniques to address disease at multiple scales, from whole health care systems to the functional genomics of neurodevelopment. He also has worked on AI applications in medicine since the 1990’s, including automated ventilator control, pediatric growth monitoring, detection of domestic abuse, diagnosing autism from multimodal data and most recently assisting clinicians using whole genome sequence and clinical histories to diagnose rare or unknown disease patients. His most urgent question is how to enable doctors to be most effective and enjoy their profession when they enter into a substantial symbiosis with machine intelligence. He is a member of the National Academy of Medicine, the American Society for Clinical Investigation and the American College of Medical Informatics.

Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?
Bio: Leo focuses on scaling clinical research to be more inclusive through open access data and software, particularly for limited resource settings; identifying biases in the data to prevent them from being encoded in models and algorithms; and redesigning research using the principles of team science and the hive learning strategy.

Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?
Bio: Jason Fries is a research scientist at the Shah Lab at Stanford University. His work is centered on enabling domain experts to easily construct and modify machine learning models, particularly in the field of medicine where expert-labeled training data are hard to acquire. His research interests include weakly supervised machine learning, foundation models for medicine, and data-centric AI.

Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?
Bio: Lauren Oakden-Rayner is a radiologist and Senior Research Fellow at the Australian Institute for Machine Learning, University of Adelaide. Her research primarily focuses on medical AI safety, specifically addressing the issues of model robustness, generalization, evaluation, and fairness. Lauren is also involved in supervising students and working on various medical AI projects, reviewing MOOCs on her blog, and advocating for diversity in her group and Institute.

Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?
Bio: Maia Hightower, MD, MBA, MPH, is an accomplished healthcare IT executive and internist. She currently serves as the executive Vice President and Chief Digital & Technology Officer at the University of Chicago Medicine and the CEO and co-founder of Equality AI, a startup aimed at achieving health equity through responsible AI and machine-learning operations. Previously, she was the chief medical information officer and associate chief medical officer at University of Utah Health and served in similar roles at University of Iowa Health Care and Stanford Health Care. Dr. Hightower's work has focused on leveraging digital technology to address health inequities and promoting diversity and inclusion within healthcare IT systems. Her leadership in the field has earned her widespread recognition.

Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions
Bio: Marzyeh Ghassemi is an assistant professor and the Hermann L. F. von Helmholtz Professor with appointments in the Department of Electrical Engineering and Computer Science and the Institute for Medical Engineering & Science at MIT. Ghassemi’s research interests span representation learning, behavioral ML, healthcare ML, and healthy ML. One of her focuses is on real-world applications of machine learning, such as turning diverse clinical data into cohesive information with the ability to predict patient needs. Ghassemi received BS degrees in computer science and electrical engineering from New Mexico State University, an MSc degree in biomedical engineering from Oxford University, and a PhD in computer science from MIT.

Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions
Bio: Ziad Obermeyer is Associate Professor and Blue Cross of California Distinguished Professor at UC Berkeley, where he works at the intersection of machine learning and health. He is a Chan Zuckerberg Biohub Investigator, a Faculty Research Fellow at the National Bureau of Economic Research, and was named an Emerging Leader by the National Academy of Medicine. Previously, he was Assistant Professor at Harvard Medical School, and continues to practice emergency medicine in underserved communities.

Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions
Bio: Dr. Halamka is an emergency medicine physician, medical informatics expert and president of the Mayo Clinic Platform, which is focused on transforming health care by leveraging artificial intelligence, connected health care devices and a network of partners. Dr. Halamka has been developing and implementing health care information strategy and policy for more than 25 years. Previously, he was executive director of the Health Technology Exploration Center for Beth Israel Lahey Health, chief information officer at Beth Israel Deaconess Medical Center, and International Healthcare Innovation Professor at Harvard Medical School. He is a member of the National Academy of Medicine.

Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions
Bio: Elaine Nsoesie is an Associate Professor at Boston University's School of Public Health and a leading voice in the use of data and technology to advance health equity. She leads the Racial Data Tracker project at Boston University's Center for Antiracist Research and serves as a Senior Advisor to the Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) program at the National Institutes of Health. Dr. Nsoesie has published extensively on the use of data from social media, search engines, and cell phones for public health surveillance and is dedicated to increasing representation of underrepresented communities in data science. She completed her PhD in Computational Epidemiology from Virginia Tech and has held postdoctoral positions at Harvard Medical School and Boston Children's Hospital.

Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions
Bio: Dr. Khaled El Emam is the Canada Research Chair (Tier 1) in Medical AI at the University of Ottawa, where he is a Professor in the School of Epidemiology and Public Health. He is also a Senior Scientist at the Children’s Hospital of Eastern Ontario Research Institute and Director of the multi-disciplinary Electronic Health Information Laboratory, conducting research on privacy enhancing technologies to enable the sharing of health data for secondary purposes, including synthetic data generation and de-identification methods. Khaled is a co-founder of Replica Analytics, a company that develops synthetic data generation technology, which was recently acquired by Aetion. As an entrepreneur, Khaled founded or co-founded six product and services companies involved with data management and data analytics, with some having successful exits. Prior to his academic roles, he was a Senior Research Officer at the National Research Council of Canada. He also served as the head of the Quantitative Methods Group at the Fraunhofer Institute in Kaiserslautern, Germany. He participates in a number of committees, including the European Medicines Agency Technical Anonymization Group, the Panel on Research Ethics advising on the TCPS, and the Strategic Advisory Council of the Office of the Information and Privacy Commissioner of Ontario, and is also co-editor-in-chief of the JMIR AI journal. In 2003 and 2004, he was ranked as the top systems and software engineering scholar worldwide by the Journal of Systems and Software based on his research on measurement and quality evaluation and improvement. He held the Canada Research Chair in Electronic Health Information at the University of Ottawa from 2005 to 2015. Khaled has a PhD from the Department of Electrical and Electronics.

Invited Talk on Research and Top Recent Papers from 2020-2022
Bio: Suchi Saria, PhD, holds the John C. Malone endowed chair and is the Director of the Machine Learning, AI and Healthcare Lab at Johns Hopkins. She is also the Founder and CEO of Bayesian Health. Her research has pioneered the development of next generation diagnostic and treatment planning tools that use statistical machine learning methods to individualize care. She has written several of the seminal papers in the field of ML and its use for improving patient care and has given over 300 invited keynotes and talks to organizations including the NAM, NAS, and NIH. Dr. Saria has served as an advisor to multiple Fortune 500 companies and her work has been funded by leading organizations including the NIH, FDA, NSF, DARPA and CDC. Dr. Saria has been featured by the Atlantic, Smithsonian Magazine, Bloomberg News, Wall Street Journal, and PBS NOVA, to name a few. She has won several awards for excellence in AI and care delivery. For example, for her academic work, she’s been recognized as IEEE’s “AI’s 10 to Watch”, Sloan Fellow, MIT Tech Review’s “35 Under 35”, National Academy of Medicine’s list of “Emerging Leaders in Health and Medicine”, and DARPA’s Faculty Award. For her work in industry bringing AI to healthcare, she’s been recognized as World Economic Forum’s 100 Brilliant Minds Under 40, Rock Health’s “Top 50 in Digital Health”, Modern Healthcare’s Top 25 Innovators, The Armstrong Award for Excellence in Quality and Safety and Society of Critical Care Medicine’s Annual Scientific Award.

Invited Talk on Recent Deployments and Real-world Impact
Bio: Karandeep Singh, MD, MMSc, is an Assistant Professor of Learning Health Sciences, Internal Medicine, Urology, and Information at the University of Michigan. He directs the Machine Learning for Learning Health Systems (ML4LHS) Lab, which focuses on translational issues related to the implementation of machine learning (ML) models within health systems. He serves as an Associate Chief Medical Information Officer for Artificial Intelligence for Michigan Medicine and is the Associate Director for Implementation for U-M Precision Health, a Presidential Initiative focused on bringing research discoveries to the bedside, with a focus on prediction models and genomics data. He chairs the Michigan Medicine Clinical Intelligence Committee, which oversees the governance of machine learning models across the health system. He teaches a health data science course for graduate and doctoral students, and provides clinical care for people with kidney disease. He completed his internal medicine residency at UCLA Medical Center, where he served as chief resident, and a nephrology fellowship in the combined Brigham and Women’s Hospital/Massachusetts General Hospital program in Boston, MA. He completed his medical education at the University of Michigan Medical School and holds a master’s degree in medical sciences in Biomedical Informatics from Harvard Medical School. He is board certified in internal medicine, nephrology, and clinical informatics.

Invited Talk on Under-explored Research Challenges and Opportunities
Bio: Dr. Nigam Shah is Professor of Medicine at Stanford University, and Chief Data Scientist for Stanford Health Care. His research group analyzes multiple types of health data (EHR, Claims, Wearables, Weblogs, and Patient blogs), to answer clinical questions, generate insights, and build predictive models for the learning health system. At Stanford Healthcare, he leads artificial intelligence and data science efforts for advancing the scientific understanding of disease, improving the practice of clinical medicine and orchestrating the delivery of health care. Dr. Shah is an inventor on eight patents and patent applications, has authored over 200 scientific publications and has co-founded three companies. Dr. Shah was elected into the American College of Medical Informatics (ACMI) in 2015 and was inducted into the American Society for Clinical Investigation (ASCI) in 2016. He holds an MBBS from Baroda Medical College, India, a PhD from Penn State University and completed postdoctoral training at Stanford University.

Machine Learning for Healthcare in the Era of ChatGPT
Bio: Byron Wallace is the Sy and Laurie Sternberg Interdisciplinary Associate Professor and Director of the BS in Data Science program at Northeastern University in the Khoury College of Computer Sciences. His research is primarily in natural language processing (NLP) methods, with an emphasis on their application in healthcare and the challenges inherent to this domain.

Network studies: As many databases as possible or enough to answer the question quickly?
Bio: Dr. Chute is the Bloomberg Distinguished Professor of Health Informatics, Professor of Medicine, Public Health, and Nursing at Johns Hopkins University, and Chief Research Information Officer for Johns Hopkins Medicine. He is also Section Head of Biomedical Informatics and Data Science and Deputy Director of the Institute for Clinical and Translational Research. He received his undergraduate and medical training at Brown University, internal medicine residency at Dartmouth, and doctoral training in Epidemiology and Biostatistics at Harvard. He is Board Certified in Internal Medicine and Clinical Informatics, and an elected Fellow of the American College of Physicians, the American College of Epidemiology, HL7, the American Medical Informatics Association, and the American College of Medical Informatics (ACMI), as well as a Founding Fellow of the International Academy of Health Sciences Informatics; he was president of ACMI 2017-18. He is an elected member of the Association of American Physicians. His career has focused on how we can represent clinical information to support analyses and inferencing, including comparative effectiveness analyses, decision support, best evidence discovery, and translational research. He has had a deep interest in the semantic consistency of health data, harmonized information models, and ontology. His current research focuses on translating basic science information to clinical practice, how we classify dysfunctional phenotypes (disease), and the harmonization and rendering of real-world clinical data including electronic health records to support data inferencing. He became founding Chair of Biomedical Informatics at Mayo Clinic in 1988, retiring from Mayo in 2014, where he remains an emeritus Professor of Biomedical Informatics. He is presently PI on a spectrum of high-profile informatics grants from NIH spanning translational science including co-lead on the National COVID Cohort Collaborative (N3C). 
He has been active on many HIT standards efforts and chaired ISO Technical Committee 215 on Health Informatics and chaired the World Health Organization (WHO) International Classification of Disease Revision (ICD-11).

Network studies: As many databases as possible or enough to answer the question quickly?
Bio: Robert Platt is Professor in the Departments of Epidemiology, Biostatistics, and Occupational Health, and of Pediatrics, at McGill University. He holds the Albert Boehringer I endowed chair in Pharmacoepidemiology, and is Principal Investigator of the Canadian Network for Observational Drug Effect Studies (CNODES). His research focuses on improving statistical methods for the study of medications using administrative data, with a substantive focus on medications in pregnancy. Dr. Platt is an editor-in-chief of Statistics in Medicine and is on the editorial boards of the American Journal of Epidemiology and Pharmacoepidemiology and Drug Safety. He has published over 400 articles, one book and several book chapters on biostatistics and epidemiology.

Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?
Bio: Tianxi Cai is John Rock Professor of Translational Data Science at Harvard, with joint appointments in the Biostatistics Department and the Department of Biomedical Informatics. She directs the Translational Data Science Center for a Learning Health System at Harvard Medical School and co-directs the Applied Bioinformatics Core at VA MAVERIC. She is a major player in developing analytical tools for mining multi-institutional EHR data, real world evidence, and predictive modeling with large scale biomedical data. Tianxi received her Doctor of Science in Biostatistics at Harvard and was an assistant professor at the University of Washington before returning to Harvard as a faculty member in 2002.

Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?
Bio: Dr. Yong Chen is Professor of Biostatistics at the Department of Biostatistics, Epidemiology, and Informatics at the University of Pennsylvania (Penn). He directs a Computing, Inference and Learning Lab at the University of Pennsylvania, which focuses on integrating fundamental principles and wisdoms of statistics into quantitative methods for tackling key challenges in modern biomedical data. Dr. Chen is an expert in synthesis of evidence from multiple data sources, including systematic review and meta-analysis, distributed algorithms, and data integration, with applications to comparative effectiveness studies, health policy, and precision medicine. He has published over 170 peer-reviewed papers in a wide spectrum of methodological and clinical areas. During the pandemic, Dr. Chen is serving as Director of Biostatistics Core for Pediatric PASC of the RECOVER COVID initiative, which is a national multi-center RWD-based study on Post-Acute Sequelae of SARS CoV-2 infection (PASC), involving more than 13 million patients across more than 10 health systems. He is an elected fellow of the American Statistical Association, the American Medical Informatics Association, Elected Member of the International Statistical Institute, and Elected Member of the Society for Research Synthesis Methodology.

Differential Privacy vs. Synthetic Data
Bio: Dr. Khaled El Emam is the Canada Research Chair (Tier 1) in Medical AI at the University of Ottawa, where he is a Professor in the School of Epidemiology and Public Health. He is also a Senior Scientist at the Children’s Hospital of Eastern Ontario Research Institute and Director of the multi-disciplinary Electronic Health Information Laboratory, conducting research on privacy enhancing technologies to enable the sharing of health data for secondary purposes, including synthetic data generation and de-identification methods. Khaled is a co-founder of Replica Analytics, a company that develops synthetic data generation technology, which was recently acquired by Aetion. As an entrepreneur, Khaled founded or co-founded six product and services companies involved with data management and data analytics, with some having successful exits. Prior to his academic roles, he was a Senior Research Officer at the National Research Council of Canada. He also served as the head of the Quantitative Methods Group at the Fraunhofer Institute in Kaiserslautern, Germany. He participates in a number of committees, including the European Medicines Agency Technical Anonymization Group, the Panel on Research Ethics advising on the TCPS, and the Strategic Advisory Council of the Office of the Information and Privacy Commissioner of Ontario, and is also co-editor-in-chief of the JMIR AI journal. In 2003 and 2004, he was ranked as the top systems and software engineering scholar worldwide by the Journal of Systems and Software based on his research on measurement and quality evaluation and improvement. He held the Canada Research Chair in Electronic Health Information at the University of Ottawa from 2005 to 2015. Khaled has a PhD from the Department of Electrical and Electronics.

Differential Privacy vs. Synthetic Data
Bio: Li Xiong is a Samuel Candler Dobbs Professor of Computer Science and Professor of Biomedical Informatics at Emory University. She held a Winship Distinguished Research Professorship from 2015-2018. She has a Ph.D. from Georgia Institute of Technology, an MS from Johns Hopkins University, and a BS from the University of Science and Technology of China. She and her research lab, Assured Information Management and Sharing (AIMS), conduct research on algorithms and methods at the intersection of data management, machine learning, and data privacy and security, with a recent focus on privacy-enhancing and robust machine learning. She has published over 170 papers and received six best paper or runner up awards. She has served and serves as associate editor for IEEE TKDE, IEEE TDSC, and VLDBJ, general co-chair for ACM CIKM 2022, program co-chair for IEEE BigData 2020 and ACM SIGSPATIAL 2018, 2020, program vice-chair for ACM SIGMOD 2024, 2022, and IEEE ICDE 2023, 2020, and VLDB Sponsorship Ambassador. Her research is supported by federal agencies including NSF, NIH, AFOSR, PCORI, and industry awards including Google, IBM, Cisco, AT&T, and Woodrow Wilson Foundation. She is an IEEE fellow.

Changing patient trajectory: A case study exploring implementation and deployment of clinical machine learning models
Abstract: You’ve created an awesome model that predicts with near 100 percent accuracy. Now what? In this tutorial, we will give insight into the implementation, deployment, integration, and evaluation steps that follow the building of a clinical model. Specifically, we will discuss how each step should inform design choices as you build a model. For example, aggressive feature selection is a necessary step toward integration, because real-time data streams for every data point a machine learning model may consume may not be accessible or feasible. We will use our implementation and evaluation of a Covid-19 adverse event model at our institution as a representative case study. This case study will demonstrate the full lifecycle of a clinical model, how we move from a model to affecting patient outcomes, and the socio-technical challenges to success.
Bio: Yindalon Aphinyanaphongs, MD, PhD (Predictive Analytics Team Lead) is a physician scientist in the Center for Healthcare Innovation and Delivery Science in the Department of Population Health at NYU Langone Health in New York City. Academically, he is an assistant professor and his lab focuses on novel applications of machine learning to clinical problems and the science behind successful translation of predictive models into clinical practice to drive value. Operationally, he is the Director of Operational Data Science and Machine Learning at NYU Langone Health. In this role, he leads a Predictive Analytics Unit composed of data scientists and engineers that build, evaluate, benchmark, and deploy predictive algorithms into the clinical enterprise.

Distributed Statistical Learning and Inference with Electronic Health Records Data
Abstract: The growing availability and variety of healthcare data sources has provided unique opportunities for data integration and evidence synthesis, which can potentially accelerate knowledge discovery and enable better clinical decision-making. However, many practical and technical challenges, such as data privacy, high dimensionality, and heterogeneity across different datasets, remain to be addressed. In this talk, I will introduce several methods for the effective and efficient integration of electronic health records and other healthcare datasets. Specifically, we develop communication-efficient distributed algorithms for jointly analyzing multiple datasets without the need to share patient-level data. Our algorithms can account for heterogeneity across different datasets. We provide theoretical guarantees for the performance of our algorithms, and examples of applying the algorithms in real-world clinical research networks.
Bio: Dr. Duan is an Assistant Professor of Biostatistics at the Harvard T.H. Chan School of Public Health. She received her Ph.D. in Biostatistics in May 2020 from the University of Pennsylvania. Her research interests focus on three distinct areas: methods for integrating evidence from different data sources, identifying signals from high dimensional data, and accounting for suboptimality of real-world data, such as missing data and measurement errors.

Challenges in Developing Online Learning and Experimentation Algorithms in Digital Health
Abstract: Digital health technologies offer promising ways to deliver interventions outside of clinical settings. Wearable sensors and mobile phones provide real-time data streams that carry information about an individual’s current health, including both internal (e.g., mood) and external (e.g., location) contexts. This tutorial discusses the algorithms underlying mobile health clinical trials. Specifically, we introduce the micro-randomized trial (MRT), an experimental design for optimizing real-time interventions. We define the causal excursion effect and discuss reasons why this effect is often considered the primary causal effect of interest in MRT analysis. We introduce statistical methods for primary and secondary analyses of MRTs. Attendees will have access to synthetic digital health experimental data to better understand online learning and experimentation algorithms, the systems underlying real-time delivery of treatment, and their evaluation using collected data.
Bio: Walter Dempsey is an Assistant Professor of Biostatistics and an Assistant Research Professor in the d3lab located in the Institute for Social Research. His research focuses on statistical methods for digital and mobile health. His current work involves three complementary research themes: (1) experimental design and data analytic methods to inform multi-stage decision making in health; (2) statistical modeling of complex longitudinal and survival data; and (3) statistical modeling of complex relational structures such as interaction networks.

Causal Inference from Text Data
Abstract: Does increasing the dosage of a drug treatment cause adverse reactions in patients? This is a causal question: did increased drug dosage cause some patients to have an adverse reaction, or would they have had the reaction anyway due to other factors? A classical approach to studying this causal question from observational data involves applying causal inference techniques to observed measurements of all the relevant clinical variables. However, there is a growing recognition that abundant text data, such as medical records, physicians' notes, or even forum posts from online medical communities, provide a rich source of information for causal inference. In this tutorial, I'll introduce causal inference and highlight the unique challenges that high-dimensional and noisy text data pose. Then, I'll use two text applications involving online forums and consumer complaints to motivate recent approaches that extend natural language processing (NLP) methods in service of causal inference. I'll discuss some new assumptions we need to introduce to bridge the gap between noisy text data and valid causal inference. I'll conclude by summarizing open research questions at the intersection of causal inference and text analysis.
Bio: Dhanya Sridhar is an assistant professor at the University of Montreal and a core academic member at Mila - Quebec AI Institute. She holds a Canada CIFAR AI Chair. She was a postdoctoral researcher at Columbia University and completed her PhD at the University of California, Santa Cruz. Her research interests are at the intersection of causality and machine learning, focusing on applications to text and social network data.

'Are log scales endemic yet?' Strategies for visualizing biomedical and public health data
Abstract: Data visualization is essential for analyzing biomedical and public health data and communicating the findings to key stakeholders. However, the presence of a data visualization is not enough; the choices we make when visualizing data are equally important in establishing its understandability and impact. This tutorial will discuss strategies for visualizing data and evaluating its impact with an appropriate target audience. The aim is to build an intuition for developing and assessing visualizations by drawing on visualization theories together with examples from prior research and ongoing attempts to visualize the present pandemic.
Bio: Ana Crisan is currently a senior research scientist at Tableau, a Salesforce company. She conducts interdisciplinary research that integrates techniques and methods from machine learning, human-computer interaction, and data visualization. Her research focuses on the intersection of data science and data visualization, especially on the ways humans can work collaboratively with ML/AI systems through visual interfaces. She completed her Ph.D. in Computer Science at the University of British Columbia, under the joint supervision of Dr. Tamara Munzner and Dr. Jennifer L. Gardy. Prior to that, she was a research scientist at the British Columbia Centre for Disease Control and Decipher Biosciences, where she conducted machine learning and data visualization research with applications in infectious disease and cancer genomics, respectively. Her research has appeared in publications of the ACM (CHI), IEEE (TVCG, CG&A), Bioinformatics, and Nature.

Bridging the gap between the business of value-based care and the research of health AI
Value-Based Care (VBC) is gaining momentum. The Centers for Medicare and Medicaid Services (CMS) is pushing to have all Medicare fee-for-service beneficiaries under a care relationship with accountability for quality and total cost of care by 2030. However, the business of VBC is more complex than other businesses because it must satisfy a three-part aim simultaneously: 1) better care for individuals, 2) better health for populations, and 3) lower cost. Meeting all three aims is challenging, and the details and implications of these aims are not well known to healthcare machine learning researchers. Therefore, we want to pick a few papers from this and past years' CHIL proceedings. Then, we would like to brainstorm and discuss how the ideas in those papers can be deployed in practice, what the barriers to deployment and sales are, what hidden or visible incentives exist for adopting such ideas, and how the government and policymakers should create incentives to achieve the three-part aim of CMS while encouraging the adoption of such technologies.

Auditing Algorithm Performance and Equity
Machine learning algorithms should be easy to evaluate for performance and equity: they generate quantitative predictions that can be compared to their intended target, both in the general population and in under-served groups. But the scarcity of data means that, for most algorithms, we have no idea how they perform, and how much bias they contain. Concretely, there is no way for algorithm developers or potential users to answer the simple question: does this algorithm do what it’s supposed to do? This roundtable will focus on the opportunities and challenges of auditing algorithm performance and equity.