Classification is one of the most fundamental tasks in data mining and machine learning. It is applied in a growing number of areas, e.g., filtering, identification, information retrieval, information extraction, and similarity detection. A basic and necessary condition for the success of a classification task is a proper representation of the information to be classified. Classification is needed in domains based on uni-modal representations, such as text, images, audio, and speech, as well as in domains based on multi-modal representations. This paper provides a short review of the developing area of multi-modal representations for classification, with emphasis on state-of-the-art systems. First, the fundamentals of uni-modal representations are given. Second, an overview of multi-modal representations is presented. Third, various related systems that use multi-modal representations, and the datasets they use, are briefly described and compared.
Wiesen, A., & HaCohen-Kerner, Y. (2018). Overview of uni-modal and multi-modal representations for classification tasks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10859 LNCS, pp. 397–404). Springer Verlag. https://doi.org/10.1007/978-3-319-91947-8_41