We propose a hierarchical network model based on predictive coding and reservoir computing as a model of multi-modal sensory integration in the brain. The network is composed of visual, auditory, and integration areas. In each area, a dynamical reservoir acts as a generative model that reproduces the time-varying sensory signal. The states of the visual and auditory reservoirs are spatially compressed and sent to the integration area. We evaluate the model on a dataset of time courses consisting of paired visual (hand-written characters) and auditory (spoken utterances) signals. We show that the model learns associations across the sensory modalities and can reconstruct a visual signal from the corresponding auditory signal. Our approach presents a novel dynamical mechanism for multi-modal information processing in the brain and a foundational technique for brain-like artificial intelligence systems.
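The architecture described above can be illustrated with a minimal sketch: two standard echo state reservoirs (one per modality) whose states are spatially compressed by fixed linear projections and fed as input to a higher-level integration reservoir. All names, dimensions, and the choice of a random projection for compression are illustrative assumptions, not details taken from the paper (which builds predictive-coding generative models on top of such reservoirs).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_in, n_res, spectral_radius=0.9):
    """Random ESN weights, recurrent matrix rescaled to the target spectral radius."""
    W = rng.standard_normal((n_res, n_res))
    W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    return W_in, W

def step(x, u, W_in, W, leak=0.3):
    """Leaky-integrator reservoir state update."""
    return (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)

# Modality-specific reservoirs (input/state sizes are assumptions)
vis_in, vis_W = make_reservoir(n_in=64, n_res=200)   # visual stream
aud_in, aud_W = make_reservoir(n_in=13, n_res=200)   # auditory stream

# Spatial compression of reservoir states; a fixed random projection here
C_vis = rng.standard_normal((20, 200)) / np.sqrt(200)
C_aud = rng.standard_normal((20, 200)) / np.sqrt(200)

# Integration reservoir receives the compressed states of both modalities
int_in, int_W = make_reservoir(n_in=40, n_res=300)

x_vis = np.zeros(200)
x_aud = np.zeros(200)
x_int = np.zeros(300)
for t in range(100):                       # dummy input streams for illustration
    u_vis = rng.standard_normal(64)        # e.g. a frame of a written character
    u_aud = rng.standard_normal(13)        # e.g. spectral features of an utterance
    x_vis = step(x_vis, u_vis, vis_in, vis_W)
    x_aud = step(x_aud, u_aud, aud_in, aud_W)
    z = np.concatenate([C_vis @ x_vis, C_aud @ x_aud])
    x_int = step(x_int, z, int_in, int_W)

print(x_int.shape)
```

In a full model, linear readouts trained on the integration state would drive top-down predictions back to each modality, which is how the reconstruction of a visual signal from an auditory one would be obtained.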
Yonemura, Y., & Katori, Y. (2021). Network model of predictive coding based on reservoir computing for multi-modal processing of visual and auditory signals. Nonlinear Theory and Its Applications, IEICE, 12(2), 143–156. https://doi.org/10.1587/nolta.12.143