A proposal of a big web data application and archive for the distributed data processing with Apache Hadoop

Martin Lnenicka; Jan Hovad; Jitka Komarkova

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015), vol. 9330 LNCS, pp. 285–294

DOI: 10.1007/978-3-319-24306-1_28

Abstract

In recent years, research on big data, data storage and other topics that represent innovations in analytics has become very popular. This paper proposes a big web data application and archive for distributed data processing with Apache Hadoop, including a framework of selected methods that can be used with this platform. It proposes a workflow for creating a web content mining application and a big data archive that use modern technologies such as Python, PHP, JavaScript, MySQL and cloud services. It also gives an overview of the architecture, methods and data structures used in the context of web mining, distributed processing and big data analytics.

Citation (APA)

Lnenicka, M., Hovad, J., & Komarkova, J. (2015). A proposal of a big web data application and archive for the distributed data processing with Apache Hadoop. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9330 LNCS, pp. 285–294). Springer Verlag. https://doi.org/10.1007/978-3-319-24306-1_28
