Keyword Extraction Based Summarization of Categorized Kannada Text Documents

Journal Article, Open Access

Jayashree
Srikanta Murthy
Sunny K

International Journal on Soft Computing (2011) 2(4) 81-93

DOI: 10.5121/ijsc.2011.2408

Abstract

The internet has caused enormous growth in the number of documents available online. Summaries help readers find the right information and are particularly useful when the document base is very large. Keywords are closely associated with a document: they reflect its content and act as indices for it. In this work, we present a method for producing extractive summaries of Kannada-language documents, subject to a limit on the number of sentences. The algorithm extracts keywords from pre-categorized Kannada documents collected from online resources. We use two feature selection techniques to obtain features from documents, combining scores from GSS (Galavotti, Sebastiani, Simi) coefficients and IDF (Inverse Document Frequency) with TF (Term Frequency) to extract keywords, which are then used to rank sentences for summarization. In the current implementation, a document from a given category is selected from our database, and a summary is generated with the number of sentences specified by the user.
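The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how the GSS, IDF, and TF scores are combined or how sentences are ranked, so the combination `tf * (idf + gss)` and the sum-of-keyword-scores sentence rank are assumptions, as are the function names and the data layout (`docs` as a list of `(category, token_set)` pairs, `sentences` as lists of tokens). The GSS coefficient uses its standard definition, P(t,c)P(¬t,¬c) − P(t,¬c)P(¬t,c), estimated from document counts.

```python
import math
from collections import Counter


def gss(term, category, docs):
    """GSS coefficient of a term for a category, estimated from document counts.

    docs is a list of (category, token_set) pairs (assumed layout).
    """
    n = len(docs)
    t_c = sum(1 for cat, toks in docs if cat == category and term in toks)
    t_nc = sum(1 for cat, toks in docs if cat != category and term in toks)
    nt_c = sum(1 for cat, toks in docs if cat == category and term not in toks)
    nt_nc = n - t_c - t_nc - nt_c
    # P(t,c)P(~t,~c) - P(t,~c)P(~t,c), each probability being count / n
    return (t_c * nt_nc - t_nc * nt_c) / (n * n)


def idf(term, docs):
    """Inverse document frequency; 0.0 for terms absent from the corpus."""
    df = sum(1 for _, toks in docs if term in toks)
    return math.log(len(docs) / df) if df else 0.0


def summarize(sentences, category, docs, max_sentences):
    """Return the top-ranked sentences, kept in their original order.

    The keyword score tf * (idf + gss) is an assumed combination,
    chosen only for illustration.
    """
    tf = Counter(tok for sent in sentences for tok in sent)
    score = {t: tf[t] * (idf(t, docs) + gss(t, category, docs)) for t in tf}
    # Rank sentences by the summed keyword scores of their tokens (assumption).
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sum(score[t] for t in sentences[i]),
        reverse=True,
    )
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]
```

Calling `summarize(sentences, "sports", docs, 2)` would return the two highest-scoring sentences of the document in their original order. Tokenization of Kannada text, stop-word removal, and any stemming are outside this sketch.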

Cite

Jayashree, Srikanta Murthy, & Sunny, K. (2011). Keyword Extraction Based Summarization of Categorized Kannada Text Documents. International Journal on Soft Computing, 2(4), 81–93. https://doi.org/10.5121/ijsc.2011.2408
