Because so many documents are available on the Internet, manually extracting their important information is not feasible. In this paper, we propose a system that automatically extracts important information from a huge volume of documents using word correlation analysis. Our system analyzes the occurrence and co-occurrence frequencies of words at several levels: sentence, paragraph, and document. It then performs three analysis steps (occurrence frequency, adjacent correlation, and importance score analysis) to calculate the importance score of each word. Finally, it extracts keywords and stores them in a graph structure. The graph structure has two benefits: it lets us manage the keywords and their connections effectively, and it assists with the retrieval of relevant documents. Our preliminary experiment shows that the technique works well for analyzing large sets of documents. © 2014 Springer-Verlag Berlin Heidelberg.
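The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: the function name `build_keyword_graph`, the `top_k` parameter, the regex tokenizer, and the importance score (occurrence count plus the sum of adjacent co-occurrence counts) are hypothetical and are not the authors' method. Only sentence-level adjacency is shown; the paragraph and document levels are omitted.

```python
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic words."""
    return re.findall(r"[a-z]+", sentence.lower())


def build_keyword_graph(documents, top_k=3):
    """Return an adjacency dict {keyword: {neighbor: co-occurrence count}}.

    Assumed scoring (not from the paper):
    importance(word) = occurrence count + sum of its adjacent-pair counts.
    """
    occurrence = Counter()  # step 1: occurrence frequency
    adjacency = Counter()   # step 2: adjacent correlation (unordered pairs)

    for doc in documents:
        for sentence in re.split(r"[.!?]+", doc):
            words = tokenize(sentence)
            occurrence.update(words)
            for left, right in zip(words, words[1:]):
                if left != right:
                    adjacency[frozenset((left, right))] += 1

    # Step 3: importance score under the assumed formula.
    score = defaultdict(float)
    for word, count in occurrence.items():
        score[word] += count
    for pair, count in adjacency.items():
        for word in pair:
            score[word] += count

    # Keep the top_k words; ties are broken alphabetically.
    keywords = sorted(score, key=lambda w: (-score[w], w))[:top_k]

    # Store keywords as nodes and adjacent co-occurrences as weighted edges.
    graph = {keyword: {} for keyword in keywords}
    for pair, count in adjacency.items():
        a, b = tuple(pair)
        if a in graph and b in graph:
            graph[a][b] = count
            graph[b][a] = count
    return graph
```

Calling `build_keyword_graph(["graph keyword graph keyword", "graph summary"], top_k=2)` keeps `graph` and `keyword` as nodes joined by an edge of weight 3. A plain dictionary is used here only to keep the sketch dependency-free; a real system would likely add stop-word filtering and a dedicated graph store.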
CITATION STYLE
Kusmawan, P. Y., & Kwon, J. (2014). Graph summarization using word correlation analysis on large set of documents. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8505 LNCS, pp. 61–74). Springer Verlag. https://doi.org/10.1007/978-3-662-43984-5_5