In this paper we address the problem of saliency detection on RGB-D image pairs with a multi-stream late fusion network. With the prevalence of RGB-D sensors, leveraging additional depth information to facilitate the saliency detection task has drawn increasing attention. However, the key challenge of how to fuse RGB data and depth data in an optimal manner remains under-studied. Conventional wisdom simply treats depth information as an undifferentiated extra channel and models RGB-D saliency detection by applying existing RGB saliency detection models directly. This paradigm, however, can neither capture representations specific to the depth modality nor effectively fuse multi-modal information. We address this problem by proposing a simple yet principled late fusion strategy carried out in conjunction with convolutional neural networks (CNNs). The proposed network learns discriminative representations and exploits the complementarity between the RGB and depth modalities. Comprehensive experiments on two public datasets demonstrate the benefits of the proposed RGB-D saliency detection network.
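The core idea of late fusion, as described in the abstract, is that each modality is processed by its own stream up to the feature level, and only then are the per-modality representations combined to predict saliency. The following minimal NumPy sketch illustrates this structure; the stream functions, weight shapes, and image sizes are hypothetical stand-ins for the paper's CNN streams, not its actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the per-modality CNN streams: each maps its
# input to a 64-dimensional feature vector via a fixed random projection.
W_rgb = rng.standard_normal((64, 3 * 32 * 32))
W_depth = rng.standard_normal((64, 1 * 32 * 32))
W_fuse = rng.standard_normal((1, 128))  # fusion layer over concatenated features


def relu(x):
    return np.maximum(x, 0.0)


def rgb_stream(rgb):
    """RGB stream: rgb has shape (3, 32, 32)."""
    return relu(W_rgb @ rgb.reshape(-1))


def depth_stream(depth):
    """Depth stream: depth has shape (1, 32, 32)."""
    return relu(W_depth @ depth.reshape(-1))


def late_fusion_saliency(rgb, depth):
    # Late fusion: the two modality-specific feature vectors are combined
    # only at the end, after each stream has run independently.
    feats = np.concatenate([rgb_stream(rgb), depth_stream(depth)])
    logit = W_fuse @ feats
    return float(1.0 / (1.0 + np.exp(-logit)))  # saliency score in (0, 1)


rgb = rng.random((3, 32, 32))
depth = rng.random((1, 32, 32))
score = late_fusion_saliency(rgb, depth)
```

The contrast with the "undifferentiated channel" baseline criticized in the abstract is that early fusion would stack depth onto RGB as a fourth input channel before any convolution, forcing both modalities through shared filters rather than learning modality-specific representations first.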
CITATION STYLE
Chen, H., Li, Y., & Su, D. (2017). RGB-D saliency detection by multi-stream late fusion network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10528 LNCS, pp. 459–468). Springer Verlag. https://doi.org/10.1007/978-3-319-68345-4_41