This paper presents a unified multimedia classification approach that effectively integrates visual and textual features. It combines the Bag of Visual Words (BoVW) model with a generalized Bag of Colors (BoC) model and textual information at an early stage for modality detection of images in the medical domain. Our contribution is twofold. First, we generalize the BoC model by incorporating spatial information derived from a quad-tree decomposition of the images. Second, we propose a weighted linear combination of word embeddings for the textual representation of the images. Experiments on the ImageCLEF contest data for 2011, 2012, 2013 and 2016 demonstrate the effectiveness and robustness of our framework: its classification accuracy exceeds all previously published results on these datasets.
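To make the three ingredients named in the abstract concrete, here is a minimal sketch and not the authors' implementation. It assumes the image has already been quantized to palette indices, uses a fixed-depth quad-tree (the paper's decomposition may differ, for example by being adaptive), and takes the per-word weights as given (an IDF-like scheme would be one option; the abstract does not specify one). The function names `quadtree_color_histogram`, `weighted_text_embedding`, and `early_fusion` are hypothetical, and L2 normalization before concatenation is likewise an assumption.

```python
import numpy as np

def quadtree_color_histogram(palette_idx, n_colors, levels=2):
    """Concatenate per-cell color histograms over a fixed-depth quad-tree.

    palette_idx: 2D array of non-negative integer palette indices in
    [0, n_colors). Level l splits the image into 2**l x 2**l cells.
    (Assumption: fixed depth; the paper's scheme may differ.)
    """
    feats = []
    for level in range(levels):
        cells = 2 ** level
        for rows in np.array_split(palette_idx, cells, axis=0):
            for cell in np.array_split(rows, cells, axis=1):
                hist = np.bincount(cell.ravel(), minlength=n_colors).astype(float)
                feats.append(hist / max(cell.size, 1))  # relative frequencies
    return np.concatenate(feats)

def weighted_text_embedding(tokens, embeddings, weights):
    """Weighted linear combination of word vectors, normalized by total weight.

    embeddings: dict word -> vector; weights: dict word -> float
    (Assumption: missing weights default to 1.0, unknown words are skipped.)
    """
    dim = len(next(iter(embeddings.values())))
    total = np.zeros(dim)
    weight_sum = 0.0
    for tok in tokens:
        if tok in embeddings:
            w = weights.get(tok, 1.0)
            total += w * np.asarray(embeddings[tok], dtype=float)
            weight_sum += w
    return total / weight_sum if weight_sum > 0 else total

def early_fusion(*features):
    """L2-normalize each modality vector, then concatenate into one vector."""
    normed = []
    for f in features:
        f = np.asarray(f, dtype=float)
        norm = np.linalg.norm(f)
        normed.append(f / norm if norm > 0 else f)
    return np.concatenate(normed)

# Toy example: a 4x4 "image" with four palette colors, one per quadrant.
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 3, 3],
                [2, 2, 3, 3]])
boc = quadtree_color_histogram(img, n_colors=4, levels=2)

emb = {"ct": [1.0, 0.0], "scan": [0.0, 1.0]}
wts = {"ct": 3.0, "scan": 1.0}
txt = weighted_text_embedding(["ct", "scan", "unknown"], emb, wts)

fused = early_fusion(boc, txt)  # single vector fed to one classifier
```

A BoVW histogram would enter `early_fusion` as one more argument; the fused vector is then passed to a single classifier, which is what distinguishes fusion at an early stage from combining per-modality classifier outputs.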
Valavanis, L., Stathopoulos, S., & Kalamboukis, T. (2017). Fusion of bag-of-words models for image classification in the medical domain. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10193 LNCS, pp. 134–145). Springer Verlag. https://doi.org/10.1007/978-3-319-56608-5_11