Cost-Sensitive Feature Selection Using Bayesian Optimization

Lucca G. Zenobio, Thiago N. C. Cardoso, Andrea Kauffmann, Augusto Antunes

Abstract: In many machine learning applications, it is important to reduce the set of features used in training. This matters especially when attributes have different acquisition costs, such as different blood tests. Cost-sensitive feature selection methods aim to select a subset of attributes that yields a well-performing model while keeping the total cost low. In this paper, we propose a Bayesian optimization approach to this task. We explore subsets of the available features by optimizing an evaluation function that weighs the model's performance against the total feature cost. We evaluate the proposed method on several UCI datasets, as well as a real-world one, and compare it to a range of feature selection approaches. Our results demonstrate that Bayesian optimization cost-sensitive feature selection (BOCFS) can select a low-cost subset of informative features, yielding highly effective classifiers and achieving state-of-the-art performance on some datasets.
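
To make the idea of an evaluation function that trades model performance against feature cost concrete, the following is a minimal sketch and not the paper's actual formulation. The linear form of the trade-off, the names `evaluate_subset`, `toy_performance`, and `cost_weight`, and all cost and gain values are assumptions for illustration. An exhaustive search over feature masks stands in for the Bayesian optimizer, which is only practical here because the toy problem has four features.

```python
from itertools import product

# Hypothetical acquisition costs and per-feature performance gains (illustrative only).
COSTS = [1.0, 5.0, 2.0, 10.0]
GAINS = [0.10, 0.25, 0.02, 0.30]


def toy_performance(mask):
    """Stand-in for a cross-validated model score on the selected features."""
    return 0.3 + sum(g for g, m in zip(GAINS, mask) if m)


def evaluate_subset(mask, costs, performance_fn, cost_weight=0.5):
    """Assumed trade-off: performance minus a weighted, normalized feature cost."""
    total_cost = sum(costs)
    selected_cost = sum(c for c, m in zip(costs, mask) if m)
    normalized_cost = selected_cost / total_cost if total_cost > 0 else 0.0
    return performance_fn(mask) - cost_weight * normalized_cost


# Exhaustive search over all binary masks; a Bayesian optimizer would instead
# propose masks sequentially, which matters when there are many features.
best_mask = max(
    product([0, 1], repeat=len(COSTS)),
    key=lambda mask: evaluate_subset(mask, COSTS, toy_performance),
)
print(best_mask, round(evaluate_subset(best_mask, COSTS, toy_performance), 4))
```

With these made-up values, the search drops the third feature, whose small gain does not justify its cost, and keeps the other three. Raising `cost_weight` would push the selection toward cheaper subsets.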