Analysis of named-entity effect on text classification of traffic accident data using machine learning

Abstract

With the rising number of accidents in Indonesia, it is still necessary to evaluate and analyze accident data. Traffic accident data has previously been categorized using word embedding, but additional work is needed to achieve better results. A few informative named entities are often sufficient to determine whether a text contains information about a traffic accident. Named entities are informational features that offer details about a text. This paper examines the influence of named entities on thematic text categorization. The data were collected by crawling Twitter. Preprocessing is performed first to clean the text, remove uninformative content, and label the specified entities. Using a support vector machine (SVM), three schemes were compared: i) word embedding, ii) the number of occurrences of named entities, and iii) a hybrid that combines the two. In tests on 1,885 records, consisting of 788 accident records and 1,067 non-accident records, the hybrid scheme improved classification accuracy to 90.27% compared with the word embedding scheme and the named-entity occurrence scheme.
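The three feature schemes named in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy embedding table, the entity tag set (`LOCATION`, `VEHICLE`, `TIME`), the use of mean pooling over word vectors, and the function names. The abstract does not say how the embedding and entity counts are combined, so the hybrid scheme is shown as simple concatenation.

```python
from collections import Counter

# Hypothetical toy embedding table; a real system would load trained word vectors.
TOY_EMBEDDINGS = {
    "kecelakaan": [1.0, 0.0],  # "accident"
    "truk": [0.5, 0.5],        # "truck"
    "macet": [0.0, 1.0],       # "traffic jam"
}
EMBEDDING_DIM = 2

# Hypothetical entity tag set; the abstract does not list the labels actually used.
ENTITY_TYPES = ["LOCATION", "VEHICLE", "TIME"]


def embedding_features(tokens):
    """Scheme (i): mean of the word vectors of tokens found in the table."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * EMBEDDING_DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def entity_count_features(entity_tags):
    """Scheme (ii): number of occurrences of each entity type in the text."""
    counts = Counter(entity_tags)
    return [float(counts[tag]) for tag in ENTITY_TYPES]


def hybrid_features(tokens, entity_tags):
    """Scheme (iii): embedding features concatenated with entity counts."""
    return embedding_features(tokens) + entity_count_features(entity_tags)


# Example: one preprocessed tweet with its (assumed) entity labels.
tokens = ["kecelakaan", "truk", "di", "tol"]
entity_tags = ["VEHICLE", "LOCATION"]
features = hybrid_features(tokens, entity_tags)
print(features)
```

Each resulting vector could then be passed, together with its accident or non-accident label, to any SVM implementation for training and evaluation.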

Cite

Putra, A. D., & Girsang, A. S. (2022). Analysis of named-entity effect on text classification of traffic accident data using machine learning. Indonesian Journal of Electrical Engineering and Computer Science, 25(3), 1672–1678. https://doi.org/10.11591/ijeecs.v25.i3.pp1672-1678
