Abstract
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models. Diffusion-based methods have recently shown promise in analyzing functional magnetic resonance imaging (fMRI) data, including the reconstruction of high-quality images consistent with the original visual stimuli. Nonetheless, effectively harnessing the semantic and silhouette information extracted from brain signals remains a critical challenge. In this paper, we propose a novel approach, termed the Controllable Mind Visual Diffusion Model (CMVDM). Specifically, CMVDM first extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks. Then, a control model is introduced in conjunction with a residual block to fully exploit the extracted information for image synthesis, generating high-quality images that closely resemble the original visual stimuli in both semantic content and silhouette characteristics. Through extensive experimentation, we demonstrate that CMVDM outperforms existing state-of-the-art methods both qualitatively and quantitatively. Our code is available at https://github.com/zengbohan0217/CMVDM.
Cite
Zeng, B., Li, S., Liu, X., Gao, S., Jiang, X., Tang, X., … Zhang, B. (2024). Controllable Mind Visual Diffusion Model. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 38, pp. 6935–6943). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v38i7.28519