Eating sound dataset for 20 food types and sound classification using convolutional neural networks

Abstract

Food identification technology can benefit both the food and media industries and enrich human-computer interaction. We assembled a food classification dataset of 11,141 clips extracted from YouTube videos covering 20 food types. The dataset is freely available on Kaggle. We propose a grouped holdout protocol for assessing model performance. As a first approach, we applied convolutional neural networks to this dataset. Under the grouped holdout protocol, the model obtained an accuracy of 18.5%, whereas under a uniform holdout protocol it obtained an accuracy of 37.58%. When the task was framed as binary classification, the model performed well for most food pairs. In both settings, the method clearly outperformed reasonable baselines. We found that, besides texture properties, differences in eating actions are an important consideration for data-driven eating sound research. Protocols based solely on biting sounds are limited to textural classification and offer less guidance when assembling datasets that capture food differences.
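The gap between the two reported accuracies reflects how the dataset is split. To illustrate the distinction, the following sketch contrasts a grouped holdout (all clips from the same source video go entirely into train or test, preventing same-recording leakage) with a uniform holdout (clips split at random, ignoring their source video). This is an illustrative implementation of the general idea, not the authors' code; the function and variable names are our own.

```python
import random


def grouped_holdout(clips, groups, test_frac=0.2, seed=0):
    """Split clips so that all clips sharing a group (e.g. the same
    source YouTube video) land entirely in train or entirely in test."""
    rng = random.Random(seed)
    unique_groups = sorted(set(groups))
    rng.shuffle(unique_groups)
    n_test = max(1, int(len(unique_groups) * test_frac))
    test_groups = set(unique_groups[:n_test])
    train = [c for c, g in zip(clips, groups) if g not in test_groups]
    test = [c for c, g in zip(clips, groups) if g in test_groups]
    return train, test


def uniform_holdout(clips, test_frac=0.2, seed=0):
    """Split clips uniformly at random, ignoring their source video,
    so clips from one recording can appear in both train and test."""
    rng = random.Random(seed)
    indices = list(range(len(clips)))
    rng.shuffle(indices)
    n_test = max(1, int(len(clips) * test_frac))
    test_idx = set(indices[:n_test])
    train = [c for i, c in enumerate(clips) if i not in test_idx]
    test = [c for i, c in enumerate(clips) if i in test_idx]
    return train, test
```

Because uniform holdout lets acoustically near-identical clips from the same recording appear on both sides of the split, it tends to inflate accuracy relative to the stricter grouped protocol, which is consistent with the 37.58% vs. 18.5% gap reported above.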

CITATION STYLE: APA

Ma, J. S., Gómez Maureira, M. A., & Van Rijn, J. N. (2020). Eating sound dataset for 20 food types and sound classification using convolutional neural networks. In ICMI 2020 Companion - Companion Publication of the 2020 International Conference on Multimodal Interaction (pp. 348–351). Association for Computing Machinery, Inc. https://doi.org/10.1145/3395035.3425656
