Dataset Bias in Diagnostic AI Systems: Guidelines for Dataset Collection and Usage

Julie R Vaughn, Avital Baral, Mayukha Vadari, William Boag

Abstract: In the last few years, the FDA has begun to recognize De Novo pathways (new approval processes) for approving AI systems as medical devices. A major concern is that the review process does not adequately test these models for bias. Bias can arise at many stages, including data collection, training, and model deployment. In this paper, we adopt a framework that categorizes the types of bias in datasets in a fine-grained way, which enables informed, targeted interventions for each issue. From there, we propose policy recommendations to the FDA and NIH to promote the deployment of more equitable AI diagnostic systems.