Visual search target inference using bag of deep visual words

Sven Stauden; Michael Barz; Daniel Sonntag

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11117 LNAI 297-304

DOI: 10.1007/978-3-030-00111-7_25

Abstract

Visual search target inference subsumes methods for predicting the target object through eye tracking. A person intends to find an object in a visual scene, and we predict that object from the person's fixation behavior. Knowing the search target can improve intelligent user interaction. In this work, we implement a new feature encoding, the Bag of Deep Visual Words, for search target inference using a pre-trained convolutional neural network (CNN). Our work is based on a recent approach from the literature that uses Bag of Visual Words, an encoding common in computer vision applications. We evaluate our method using a gold standard dataset. The results show that our new feature encoding outperforms the baseline from the literature, in particular when fixations on the target are excluded.
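
The abstract does not spell out the encoding step, so the following is only a minimal sketch of the general bag-of-words idea it builds on: each feature vector is assigned to its nearest codeword, and the assignments are counted into a normalized histogram. Everything here is an assumption made for illustration. The function names, the toy 2-D vectors, and the three-entry codebook are hypothetical; in the paper the features come from a pre-trained CNN, and the classifier that maps a histogram to a search target is not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_bag_of_words(features, codebook):
    """Return a normalized histogram of nearest-codeword assignments.

    features: list of feature vectors (hypothetical stand-ins for CNN
              features extracted around fixations).
    codebook: list of codeword vectors (hypothetical; typically obtained
              by clustering training features).
    """
    counts = [0] * len(codebook)
    for feature in features:
        # Index of the codeword closest to this feature vector.
        nearest = min(
            range(len(codebook)),
            key=lambda i: squared_distance(feature, codebook[i]),
        )
        counts[nearest] += 1
    total = sum(counts)
    # An empty fixation sequence yields an all-zero histogram.
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Toy data, assumed for illustration only.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
fixation_features = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]]

histogram = encode_bag_of_words(fixation_features, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

The histogram has a fixed length regardless of how many fixations were recorded, which is what makes this kind of encoding usable as input to a standard classifier.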

Cite (APA)

Stauden, S., Barz, M., & Sonntag, D. (2018). Visual search target inference using bag of deep visual words. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11117 LNAI, pp. 297–304). Springer Verlag. https://doi.org/10.1007/978-3-030-00111-7_25
