Audio tagging system using deep learning model


Abstract

Deep learning has been attracting increasing attention from researchers for its ability to transform input data into effective representations through various learning algorithms. It requires large and varied datasets to ensure good performance and generalization, but manually labeling a dataset is a time-consuming and expensive process, which limits dataset size. Websites such as YouTube and Freesound provide large volumes of audio data along with metadata. General-purpose audio tagging is one of the newly proposed tasks in DCASE and can give valuable insights into the classification of various acoustic sound events. The proposed work analyzes a large-scale imbalanced audio dataset for an audio tagging system. The baseline of the proposed system is a Convolutional Neural Network operating on Mel Frequency Cepstral Coefficients. The system was developed in Google Colaboratory on a free Tesla K80 GPU using Keras, TensorFlow, and PyTorch. Experimental results show that the proposed audio tagging system achieves a mean average precision of 0.92.

Citation (APA)

Sophiya, E., & Jothilakshmi, S. (2019). Audio tagging system using deep learning model. International Journal of Innovative Technology and Exploring Engineering, 8(10), 1949–1957. https://doi.org/10.35940/ijitee.J9281.0881019
