This data-driven project contributes to improving the identification of conflict/violence-related or disaster-related displacement within an internationally recognized state border, namely internal displacement. Using a training set with pre-defined categories, the project addresses document classification and information retrieval through supervised machine learning. The research has three core objectives. First, to filter out non-relevant documents, meaning documents that are not in English or that provide no information on human mobility related to internal displacement. Second, to tag documents according to the themes the Internal Displacement Monitoring Centre (IDMC) uses to monitor the causes of internal displacement, notably conflict/violence or disasters. Third, to extract vital displacement information reported in online sources, such as location and displacement figures. Because the training set is composed mainly of textual documents, pre-processing relies chiefly on natural language processing annotators. Models are then trained on the documents, with a Support Vector Machine for tagging and Multinomial Naïve Bayes for information extraction. Finally, after the parameters were adjusted and the models trained on the training set, the performance of the Support Vector Machine and Multinomial Naïve Bayes models was measured on two different test sets, one for tagging and the other for information retrieval. On the provided dataset, the reported results were 95.83% for classification and 81% for information retrieval.
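As an illustration of the kind of supervised text classification the abstract describes, the following is a minimal sketch of a Multinomial Naïve Bayes classifier with Laplace smoothing, written in plain Python. It is not the authors' implementation: the abstract does not give their features, parameters, or data, and it reports using Multinomial Naïve Bayes for information extraction rather than for theme tagging. The toy sentences, the `disaster`/`conflict` labels, the whitespace tokenizer, and the class name `SimpleMultinomialNB` are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class SimpleMultinomialNB:
    """Hypothetical, minimal Multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            # log prior + sum of smoothed log likelihoods
            score = math.log(count / n_docs)
            for tok in tokens:
                num = self.word_counts[label][tok] + self.alpha
                den = total + self.alpha * vocab_size
                score += math.log(num / den)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy training data, for illustration only.
train_docs = [
    "floods displaced thousands of families in the coastal district",
    "earthquake destroyed homes and forced residents to evacuate",
    "armed clashes forced villagers to flee their homes",
    "fighting between armed groups displaced civilians",
]
train_labels = ["disaster", "disaster", "conflict", "conflict"]

model = SimpleMultinomialNB().fit(train_docs, train_labels)
print(model.predict("cyclone floods forced families to evacuate"))
print(model.predict("armed groups clashes civilians flee"))
```

In practice a library implementation with proper tokenization and a held-out test set would be preferable; the sketch only shows how class priors and smoothed word likelihoods combine into a prediction.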
CITATION STYLE
Mahamoud, H. F., Ponnusamy, R. R., Kang, H. M., & You, J. S. T. (2018). Improving the process of identifying internally displaced persons using big data technologies. International Journal of Innovative Technology and Exploring Engineering, 8(2 Special Issue 2), 386–391.