Automatically understanding and modeling a user's liking for an image is a challenging problem, because the relationship between an image's features (even semantic ones extracted by existing tools, viz. faces, objects, etc.) and a user's 'likes' is non-linear and influenced by several subtle factors. This work presents a deep bi-modal knowledge representation of images based on their visual content and associated tags (text). A mapping step between the different levels of visual and textual representations allows semantic knowledge to be transferred between the two modalities. The approach also applies feature selection before learning the deep representation, to identify the features that matter for a user to like an image. The proposed representation is then shown to be effective for learning a model of a user's image 'likes' from a collection of images that user has 'liked'. On a collection of images 'liked' by users (from Flickr), the proposed deep representation outperforms state-of-the-art low-level features used for modeling user 'likes' by around 15–20 %.
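The pipeline the abstract describes (per-modality feature selection, a mapping step between modalities, a joint representation for modeling 'likes') can be sketched as below. This is a minimal illustrative sketch only: the paper learns deep non-linear representations, whereas this version uses variance-based selection and a linear least-squares mapping, and all names, dimensions, and the random data are assumptions, not the authors' method.

```python
import numpy as np

# Hypothetical sketch of the bi-modal data flow from the abstract:
# (1) select informative features per modality, (2) learn a mapping from
# the textual to the visual space so semantic knowledge can transfer,
# (3) concatenate both views into a joint representation per image.

rng = np.random.default_rng(0)

n_images = 200
visual = rng.normal(size=(n_images, 64))   # stand-in for visual features
textual = rng.normal(size=(n_images, 32))  # stand-in for tag (text) features

def select_by_variance(X, k):
    """Keep the k highest-variance columns: a simple stand-in for the
    feature-selection step applied before learning the representation."""
    idx = np.argsort(X.var(axis=0))[::-1][:k]
    return X[:, idx]

vis_sel = select_by_variance(visual, 32)
txt_sel = select_by_variance(textual, 16)

# Least-squares mapping textual -> visual space (the "mapping step"
# between modality representations, linear here purely for brevity).
W, *_ = np.linalg.lstsq(txt_sel, vis_sel, rcond=None)
txt_mapped = txt_sel @ W                   # tags expressed in visual space

# Joint bi-modal representation on which a per-user 'likes' model
# (e.g. a classifier over liked vs. non-liked images) could be trained.
joint = np.concatenate([vis_sel, txt_mapped], axis=1)
print(joint.shape)                         # -> (200, 64)
```

In the paper the two modalities are encoded with deep representations rather than raw feature matrices, but the overall flow (select, map across modalities, combine, then fit a per-user model) follows this shape.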
CITATION STYLE
Guntuku, S. C., Zhou, J. T., Roy, S., Weisi, L., & Tsang, I. W. (2015). Deep representations to model user ‘Likes.’ In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9003, pp. 3–18). Springer Verlag. https://doi.org/10.1007/978-3-319-16865-4_1