Utilizing local outlier factor for open-set classification in high-dimensional data - Case study applied for text documents

Tomasz Walkowiak; Szymon Datko; Henryk Maciejewski

Conference Proceedings

Advances in Intelligent Systems and Computing (2020) 1037 408-418

DOI: 10.1007/978-3-030-29516-5_33

Abstract

In this paper, we study the use of the Local Outlier Factor (LOF) algorithm for open-set classification of high-dimensional data. As a case study on text documents, we investigate the fastText method for extracting feature vectors. We then build a classifier and evaluate its accuracy (precision and recall) on prepared test data containing both the subject categories known during training and entirely new categories. Next, we attempt to identify incorrect outcomes in which documents from new categories are assigned to one of the trained classes; for this we use the LOF algorithm. We show how the decision function threshold in LOF influences the precision and recall of this open-set classification procedure.
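The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, replaces fastText document vectors with synthetic Gaussian clusters, uses logistic regression as a placeholder closed-set classifier, and fits a single LOF model on all training vectors. The dimensionality, class counts, `n_neighbors` value, and the helper names `make_class` and `open_set_predict` are hypothetical choices made for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
DIM = 100  # assumed stand-in for the embedding dimensionality


def make_class(center, n):
    """Draw n synthetic 'document vectors' around a class center (assumption:
    real inputs would be fastText feature vectors)."""
    return center + 0.5 * rng.standard_normal((n, DIM))


# Three known classes are used for training; a fourth appears only at test time.
centers = 3.0 * rng.standard_normal((4, DIM))
X_train = np.vstack([make_class(c, 200) for c in centers[:3]])
y_train = np.repeat([0, 1, 2], 200)

X_known = np.vstack([make_class(c, 50) for c in centers[:3]])
X_unknown = make_class(centers[3], 50)
X_test = np.vstack([X_known, X_unknown])
is_unknown = np.r_[np.zeros(len(X_known)), np.ones(len(X_unknown))].astype(int)

# Closed-set classifier (placeholder choice) and LOF in novelty-detection mode.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X_train)

# decision_function: larger values are more inlier-like; values below 0
# are treated as outliers by LOF's default rule.
scores = lof.decision_function(X_test)


def open_set_predict(threshold):
    """Return classifier labels, with -1 for samples whose LOF score is below threshold."""
    labels = clf.predict(X_test)
    labels[scores < threshold] = -1
    return labels


# Sweep the threshold and measure how well rejection identifies the unseen class.
results = []
for t in np.linspace(scores.min(), scores.max(), 5):
    rejected = (open_set_predict(t) == -1).astype(int)
    p = precision_score(is_unknown, rejected, zero_division=0)
    r = recall_score(is_unknown, rejected, zero_division=0)
    results.append((t, p, r))
    print(f"threshold={t:8.3f}  precision={p:.2f}  recall={r:.2f}")
```

Raising the threshold rejects more test documents, so recall on the unseen category can only stay the same or grow, while precision depends on how many known-category documents are rejected along with it. This is the kind of trade-off the abstract refers to when it mentions the influence of the decision function threshold.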

Citation (APA)

Walkowiak, T., Datko, S., & Maciejewski, H. (2020). Utilizing local outlier factor for open-set classification in high-dimensional data - Case study applied for text documents. In Advances in Intelligent Systems and Computing (Vol. 1037, pp. 408–418). Springer Verlag. https://doi.org/10.1007/978-3-030-29516-5_33
