Mining and YouTube Data Analysis using Hadoop

  • B. U. Maheswari*
  • N. Mythili

Abstract

Analysis of structured and consistent data has seen remarkable success in past decades, whereas the analysis of unstructured data in multimedia formats remains a challenging task. YouTube is one of the most popular and widely used social media platforms. It reveals community feedback through the comments on published videos, the number of likes and dislikes, and the number of subscribers to a particular channel. The main objective of this work is to demonstrate, using Apache Hadoop framework concepts, how data generated from YouTube can be mined and used to make targeted, real-time, and informed decisions. In this paper, we analyze the data to identify the top categories, that is, the categories in which the largest number of videos are uploaded. The YouTube data is publicly available, and the data set is described under the heading Data Set Description. The data set is fetched from Google using the YouTube API (Application Programming Interface) and stored in the Hadoop Distributed File System (HDFS). MapReduce is then used to analyze the data set and identify the video categories with the most uploads.
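
The category count described in the abstract can be sketched as a map step that emits one (category, 1) pair per video record and a reduce step that sums the pairs for each category. The following is a minimal illustration under stated assumptions, not the authors' implementation: it runs locally in plain Python rather than on Hadoop, and the tab-separated record layout, the `CATEGORY_COLUMN` index, and all function names are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator

# Assumption: records are tab-separated and the category is the fourth field.
# The real layout depends on the data set and is not given in the abstract.
CATEGORY_COLUMN = 3


def map_record(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit (category, 1) for each record that has a category."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) > CATEGORY_COLUMN and fields[CATEGORY_COLUMN].strip():
        yield fields[CATEGORY_COLUMN].strip(), 1


def reduce_counts(pairs: Iterable[tuple[str, int]]) -> dict[str, int]:
    """Reduce step: sum the emitted counts for each category key."""
    totals: dict[str, int] = defaultdict(int)
    for category, count in pairs:
        totals[category] += count
    return dict(totals)


def top_categories(lines: Iterable[str], n: int = 5) -> list[tuple[str, int]]:
    """Run map then reduce, and return the n categories with the most videos."""
    pairs = (pair for line in lines for pair in map_record(line))
    totals = reduce_counts(pairs)
    # Sort by descending count, breaking ties alphabetically by category.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

On a cluster, the same two steps would typically be written as a Hadoop Mapper and Reducer reading their input from HDFS; the local version only shows the shape of the computation.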

Cite

Maheswari*, B. U., & Mythili, N. (2020). Mining and YouTube Data Analysis using Hadoop. International Journal of Innovative Technology and Exploring Engineering, 9(3), 1461–1465. https://doi.org/10.35940/ijitee.b7922.019320
