If we did not have ImageNet: Comparison of Fisher encodings and convolutional neural networks on limited training data

Abstract

This work compares two competing approaches to image classification: Bag-of-Visual-Words (BoVW) and Convolutional Neural Networks (CNNs). Recent work has shown that CNNs surpass hand-crafted feature extraction techniques on image classification problems. Their success is partly attributed to benchmarking initiatives such as ImageNet, which gathered, in a massive crowdsourcing effort, the data needed to train deep neural networks with a very large number of model parameters. Manually annotated training datasets on a similar scale cannot be provided in every classification scenario, because of the resources and time they require. In this paper, we therefore analyze and compare the performance of BoVW- and CNN-based approaches to image classification as a function of the available training data. We show that CNNs benefit from growing datasets, while BoVW-based classifiers outperform CNNs when only limited data is available. Evidence is given by experiments with gradually increasing training data and by visualizations of the classification models.
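
The experimental design described in the abstract, measuring classification accuracy as the training set grows, can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the names `nested_subsets`, `learning_curve`, `fit_nearest_centroid`, and `make_toy_data` are hypothetical, and the nearest-centroid classifier on synthetic 2-D points is only a stand-in. In the paper's setting, the `fit` argument would presumably wrap a BoVW/Fisher-encoding pipeline or a CNN.

```python
import random


def nested_subsets(samples, sizes, seed=0):
    """Return training subsets of increasing size; each is a prefix of one
    shuffled copy, so every larger subset contains the smaller ones."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return [shuffled[:n] for n in sizes]


def learning_curve(fit, train, test, sizes, seed=0):
    """Train on each subset and report (subset size, accuracy) on a fixed test set."""
    curve = []
    for subset in nested_subsets(train, sizes, seed):
        predict = fit(subset)
        correct = sum(predict(features) == label for features, label in test)
        curve.append((len(subset), correct / len(test)))
    return curve


def fit_nearest_centroid(samples):
    """Toy stand-in classifier (an assumption, not a method from the paper)."""
    sums = {}
    for (x, y), label in samples:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    centroids = {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}

    def predict(point):
        px, py = point
        return min(
            centroids,
            key=lambda lab: (centroids[lab][0] - px) ** 2
            + (centroids[lab][1] - py) ** 2,
        )

    return predict


def make_toy_data(n, seed):
    """Synthetic two-class data: Gaussian blobs centred at (0, 0) and (2, 2)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


train = make_toy_data(200, seed=1)
test = make_toy_data(100, seed=2)
curve = learning_curve(fit_nearest_centroid, train, test, sizes=[10, 50, 200])
for size, accuracy in curve:
    print(size, round(accuracy, 3))
```

Running the same `learning_curve` call with two different `fit` functions and comparing the resulting curves gives the kind of accuracy-versus-training-size comparison the abstract describes. Keeping the subsets nested and the test set fixed means that differences between points on a curve reflect the added training data rather than a different sample.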

Cite

Hentschel, C., Wiradarma, T. P., & Sack, H. (2015). If we did not have ImageNet: Comparison of Fisher encodings and convolutional neural networks on limited training data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9475, pp. 400–409). Springer Verlag. https://doi.org/10.1007/978-3-319-27863-6_37
