Smoothing categorical data

Arno Siebes; René Kersten

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012), 7523 LNAI (Part 1), 42–57

DOI: 10.1007/978-3-642-33460-3_8

Abstract

Global models of a dataset reflect not only the large-scale structure of the data distribution but also smaller-scale structure. Hence, to see the large-scale structure, one should somehow subtract this smaller-scale structure from the model. While for some kinds of models, such as boosted classifiers, it is easy to see the "important" components, for many kinds of models this is far harder, if possible at all. In such cases one might try an implicit approach: simplify the data distribution without changing the large-scale structure. That is, one might first smooth the local structure out of the dataset and then induce a new model from the smoothed dataset. This new model should then reflect the large-scale structure of the original dataset. In this paper we propose such a smoothing for categorical data and for one particular type of model, viz., code tables. Experiments show that our approach preserves the large-scale structure of a dataset well: the smoothed dataset is simpler, while the original and smoothed datasets share the same large-scale structure. © 2012 Springer-Verlag.

Cite

Siebes, A., & Kersten, R. (2012). Smoothing categorical data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7523 LNAI, pp. 42–57). https://doi.org/10.1007/978-3-642-33460-3_8
