Abstract
This work addresses the challenging task of text categorization. The main goal is to compare two approaches: Vector Space Model and ontology-based solutions. The authors compare and contrast them with respect to accuracy and processing flow, both of which affect the classification results. The ontology-based method outperforms its counterpart in category resolution, i.e. the number of categories that can be processed. On the other hand, the SVM-based module is much faster and performs well when trained on an appropriately structured learning set. The authors performed a series of tests to compare the methods and, as expected, the ontology-based solution outperformed the SVM classifier. It reached a micro-averaged F1-score of 0.90 on 2.8 million Wikipedia articles, whereas the SVM-based module did not exceed 0.86 on the same data set. For both solutions the macro-averaged F1-score was lower than the micro-averaged one, reaching 0.75 for the ontology-based solution and 0.57 for the SVM-based one.
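The gap between the micro- and macro-averaged scores reported above is typical when categories differ widely in size: micro-averaging pools the counts of all categories, so frequent categories dominate, while macro-averaging gives every category equal weight. The following is a minimal sketch of the two averages, assuming single-label classification; the function names and the toy labels are hypothetical and are not taken from the paper or its data.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro F1, macro F1) for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted category gains a false positive
            fn[t] += 1  # true category gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    # Micro: pool the counts over all categories, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per category, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Toy data (invented for illustration): one large and one small category.
y_true = ["sports"] * 7 + ["law"] * 3
y_pred = ["sports"] * 7 + ["sports", "sports", "law"]
micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro F1 = {micro:.4f}, macro F1 = {macro:.4f}")
# prints: micro F1 = 0.8000, macro F1 = 0.6875
```

In this toy case the small category is classified poorly, which barely moves the micro average but pulls the macro average down, the same direction of difference as in the reported results.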
Cite
Wróbel, K., Wielgosz, M., Smywiński-Pohl, A., & Pietron, M. (2016). Comparison of SVM and ontology-based text classification methods. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9692, pp. 667–680). Springer Verlag. https://doi.org/10.1007/978-3-319-39378-0_57