Why the naive Bayesian classifier for clinical diagnostics or monitoring can dominate the proper one even for massive data sets

Hans J. Lenz

Conference Proceedings

Frontiers in Statistical Quality Control 10 (2015), vol. 11, pp. 385-393

DOI: 10.1007/978-3-319-12355-4_23

Abstract

We explain why the naive Bayesian classifier may dominate the proper one, as has happened in clinical studies; cf. Gammerman and Thatcher (Methods of Information in Medicine, 30, 15-22, 1991). Today this effect may be of concern for real-time health care monitoring or surveillance. The dominance results from a combination of three factors: a dimension of the state space (symptom space) given a disease that is not fixed a priori, the feature selection procedure, and the parameter estimation. Estimating conditional probabilities in high dimensions with a proper Bayesian model can lead to overfitting and a missing-value problem and, consequently, to a loss of classification accuracy. Because of the "curse of dimensionality," even big data sets may not compensate for this degradation.
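The dimensionality argument in the abstract can be illustrated with a small counting sketch. It is not taken from the paper: it assumes binary (present/absent) symptoms, and the function names are hypothetical. Under that assumption, a proper model needs a full joint table of symptom patterns per disease, whereas the naive model needs only one probability per symptom per disease.

```python
def proper_param_count(n_symptoms: int, n_diseases: int) -> int:
    """Free parameters of a full joint table over binary symptoms, per disease."""
    # Each disease has 2**n_symptoms pattern probabilities that sum to 1.
    return n_diseases * (2 ** n_symptoms - 1)


def naive_param_count(n_symptoms: int, n_diseases: int) -> int:
    """Free parameters when symptoms are assumed independent given the disease."""
    # One Bernoulli probability per symptom and disease.
    return n_diseases * n_symptoms


def min_unobserved_cell_fraction(n_symptoms: int, n_cases: int) -> float:
    """Lower bound on the share of symptom patterns never observed for one disease.

    n_cases records can cover at most n_cases distinct patterns
    out of 2**n_symptoms possible ones.
    """
    n_cells = 2 ** n_symptoms
    return max(0.0, 1.0 - n_cases / n_cells)


if __name__ == "__main__":
    # Illustrative values only: 3 diseases, one million cases per disease.
    for d in (5, 10, 20, 40):
        print(
            d,
            proper_param_count(d, 3),
            naive_param_count(d, 3),
            min_unobserved_cell_fraction(d, 1_000_000),
        )
```

In this simplified setting the proper model's table grows exponentially with the number of symptoms, so with 40 symptoms even a million cases per disease leave almost every pattern unobserved, while the naive model's parameter count grows only linearly.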

Citation (APA)

Lenz, H. J. (2015). Why the naive bayesian classifier for clinical diagnostics or monitoring can dominate the proper one even for massive data sets. In Frontiers in Statistical Quality Control 10 (Vol. 11, pp. 385–393). Kluwer Academic Publishers. https://doi.org/10.1007/978-3-319-12355-4_23
