IndoAcro is an Indonesian acronym and expansion repository created using machine learning and big data technology. The repository is publicly accessible at www.indoacro.cs.unsyiah.ac.id. Six important steps of IndoAcro have been developed and implemented: (1) data crawling, (2) data cleaning, (3) generating candidate pairs of acronym and expansion, (4) generating numerical features, (5) classifying the candidate pairs, and (6) filtering the classification results. Since it was developed, IndoAcro has held 2,232 pairs of acronym and expansion, collected from more than 50 thousand online news articles. Because no auto-update approach had been implemented previously, the number of acronym and expansion pairs in the database has remained static. In this study, we introduce and analyze the implementation of data auto-update for IndoAcro. We analyzed and evaluated the data auto-update process over 180 days, with each process using a 2-day interval. We found that the data auto-update approach was successfully implemented and updated the data for IndoAcro. We collected 1,639 pairs of acronym and expansion in the first run, and 343 and 224 pairs in the second and third runs, respectively.
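As a rough illustration of how an auto-update run could add only previously unseen pairs to the repository, the following minimal Python sketch may help. It is not the authors' implementation: the regular expression, the function names `extract_candidates` and `auto_update`, and the in-memory set standing in for the database are all assumptions made for this example, and the feature generation, classification, and filtering steps are omitted.

```python
import re

# Hypothetical candidate pattern (an assumption, not taken from the paper):
# one to five capitalized words followed by an acronym in parentheses,
# e.g. "Badan Pusat Statistik (BPS)".
CANDIDATE_RE = re.compile(r"((?:[A-Z][a-z]+ ){1,5})\(([A-Z]{2,10})\)")


def extract_candidates(text):
    """Return (acronym, expansion) candidate pairs found in the text."""
    return [(m.group(2), m.group(1).strip()) for m in CANDIDATE_RE.finditer(text)]


def auto_update(repository, articles):
    """Add unseen pairs from newly crawled articles to the repository.

    `repository` is a set of (acronym, expansion) tuples standing in for
    the real database. Returns the number of pairs added in this run.
    """
    added = 0
    for text in articles:
        for pair in extract_candidates(text):
            if pair not in repository:
                repository.add(pair)
                added += 1
    return added


if __name__ == "__main__":
    repo = set()
    news = ["Data dari Badan Pusat Statistik (BPS) menunjukkan kenaikan."]
    print(auto_update(repo, news), "new pair(s) added")
    print(auto_update(repo, news), "new pair(s) added on the repeated run")
```

Because pairs already in the repository are skipped, running the update again on articles that have been seen adds nothing. That would be consistent with the decreasing counts reported across successive runs, although the source does not state the cause.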
Abidin, T. F., Ferdhiana, R., Iqbal, M., Syaputra, D., Putera, T. W. A., & Aksana, M. Z. (2020). IndoAcro: An Indonesian Acronym and Expansion Repository with Data Auto-Update Implementation. In Journal of Physics: Conference Series (Vol. 1566). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/1566/1/012100