Mutual information and sensitivity analysis for feature selection in customer targeting: A comparative study

Néstor Barraza; Sérgio Moro; Marcelo Ferreyra; Adolfo de la Peña

Journal Article, Open Access

Journal of Information Science (2019) 45(1) 53-67

DOI: 10.1177/0165551518770967

39 Citations

73 Readers

Abstract

Feature selection is a highly relevant task in any data-driven knowledge discovery project. The present research analyses the advantages and disadvantages of using mutual information (MI) and data-based sensitivity analysis (DSA) for feature selection in classification problems, by applying both to a bank telemarketing case. A logistic regression model is built on the tuned set of features that each technique identifies as most influential on the success of a telemarketing contact: 13 features for MI and 9 for DSA. DSA performs better at low false-positive rates, while MI is slightly better at higher false-positive rates. MI is therefore the better choice when the aim is to reduce the cost of contacts slightly without risking the loss of a high number of successes. However, DSA achieved good prediction results with fewer features.
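
To make the MI ranking step concrete, the following is a minimal sketch of scoring discrete features by their empirical mutual information with a binary outcome, I(X;Y) = sum over (x, y) of p(x,y) log2[p(x,y) / (p(x) p(y))]. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the estimator or the tuning procedure, and the function names, the column names ("contact", "weekday") and the toy data are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) p(y)) written with raw counts: c * n / (px * py)
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(rows, target, names):
    """Rank the named columns of `rows` (a list of dicts) by MI with `target`."""
    scores = {
        name: mutual_information([row[name] for row in rows], target)
        for name in names
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy, made-up data: hypothetical columns, not the bank telemarketing dataset.
rows = [
    {"contact": "cellular", "weekday": "mon"},
    {"contact": "cellular", "weekday": "tue"},
    {"contact": "landline", "weekday": "mon"},
    {"contact": "landline", "weekday": "tue"},
]
success = [1, 1, 0, 0]  # 1 = successful contact (hypothetical labels)

ranking = rank_features(rows, success, ["contact", "weekday"])
print(ranking)  # [('contact', 1.0), ('weekday', 0.0)]
```

In this toy case "contact" fully determines the outcome (1 bit of information) while "weekday" is independent of it (0 bits). A feature-selection pass in this spirit would keep the top-ranked features and fit a classifier such as logistic regression on them; continuous features would first need discretising or a different MI estimator.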

Cite

Barraza, N., Moro, S., Ferreyra, M., & de la Peña, A. (2019). Mutual information and sensitivity analysis for feature selection in customer targeting: A comparative study. Journal of Information Science, 45(1), 53–67. https://doi.org/10.1177/0165551518770967
