This paper assesses the performance of frequency-based and concept-based text representations in Mixed Script Information Retrieval and classification tasks. In text analytics, text representation remains an open research problem that must be addressed to make progress in different applications. The paper presents observations from applying different text representation methods to text classification and information retrieval. The experiments use the data set from the Mixed Script Information Retrieval shared task, and the final submitted model was evaluated by the task organizers. Distributional representation is observed to perform better than the frequency-based text representation methods. The final system attained first place in task 2 and scored 3.89% below the top-scoring system in task 1.
CITATION STYLE
Barathi Ganesh, H. B., Anand Kumar, M., & Soman, K. P. (2018). From Vector Space Models to Vector Space Models of Semantics. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10478 LNCS, pp. 50–60). Springer Verlag. https://doi.org/10.1007/978-3-319-73606-8_4