Evolving an algorithm to generate sparse inverted index using hadoop and pig

Sonam Sharma; Shailendra Singh

Conference Proceedings

Smart Innovation, Systems and Technologies (2016) 51 499-508

DOI: 10.1007/978-3-319-30927-9_49

Abstract

Nowadays, users mostly rely on keyword search to access data amid the explosion of information. Inverted indexing plays a very important role in efficient search over large data sets. Two problems exist in current keyword-based search techniques. First, large data sets are mostly unstructured and do not suit existing database systems. Second, the storage required by an inverted index is usually very large, and the compression techniques used so far are not very efficient because they increase processing time. To overcome these problems, Hadoop, a distributed framework for large data sets, is needed, where the required resources can be shared and accessed easily. In our proposed work, we join lists of consecutive document IDs in the inverted index into intervals to save memory space. For this, we have developed UDFs (User Defined Functions) for stemming and stop-word removal for the sparse inverted index in Pig Latin. The results show that our proposed method is more efficient than existing techniques.
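The interval idea described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than Pig Latin and is not the authors' implementation: the function names `to_intervals`, `build_index`, and `contains`, the tiny stop-word list, and the whitespace tokenizer are all assumptions made for illustration, and stemming is left out entirely.

```python
from collections import defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def to_intervals(doc_ids):
    """Collapse document IDs into inclusive (start, end) runs of consecutive IDs."""
    intervals = []
    for doc_id in sorted(set(doc_ids)):
        if intervals and doc_id == intervals[-1][1] + 1:
            # The ID extends the current run, so widen the last interval.
            intervals[-1] = (intervals[-1][0], doc_id)
        else:
            # A gap was found, so start a new single-document interval.
            intervals.append((doc_id, doc_id))
    return intervals


def build_index(docs):
    """Build term -> list of intervals from a dict of doc_id -> text."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            if token not in STOP_WORDS:
                postings[token].add(doc_id)
    return {term: to_intervals(ids) for term, ids in postings.items()}


def contains(intervals, doc_id):
    """Check whether a document ID falls inside any stored interval."""
    return any(start <= doc_id <= end for start, end in intervals)
```

With this representation, a term that occurs in documents 1, 2, 3, and 5 is stored as two intervals instead of four IDs; the saving grows with the length of the consecutive runs and disappears when IDs are scattered.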

Citation (APA)

Sharma, S., & Singh, S. (2016). Evolving an algorithm to generate sparse inverted index using hadoop and pig. In Smart Innovation, Systems and Technologies (Vol. 51, pp. 499–508). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-319-30927-9_49
