A Model for Multimodal Representation and Inference

  • Pineda L
  • Garza G

Abstract

This paper presents some applications of a theory of representation and inference in multimodal scenarios. The theory focuses on the relation between natural language and graphical expressions. A basic assumption is that graphical expressions belong to a language with well-defined syntax and semantics: a graphical language. A second assumption is that the relation between expressions of different modalities is similar to the translation relation that holds between expressions of different natural languages. The paper describes a multimodal system of representation and inference based on this view of modality. First, the representational structures of the multimodal system are briefly introduced. Then, a number of multimodal inferences supported by the system are illustrated. These examples show how the system can support the definition and use of graphical languages, perceptual inferences for problem solving, and the interpretation of multimodal messages. Finally, the intuitive notion of modality underlying this research is discussed.

1. Multimodal Representation

The system of multimodal representation summarized in this paper is illustrated in Figure 1. The notion of modality on which the system is based is a representational one: information conveyed in a particular modality is expressed in a representational language associated with that modality. Each modality in the system is captured through a particular language, and relations between expressions of different modalities are captured by translation functions that map basic and composite expressions of the source modality into expressions of the object modality. This view of multimodal representation and reasoning has been developed in [13], [17], [9], [18] and [19], and it closely follows the spirit of Montague's general semiotic programme [5].
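As a rough illustration of what a translation function over basic and composite expressions might look like, the following Python sketch maps a small natural-language expression tree into a graphical one. It is not the authors' implementation: the expression classes, the lexicon entries (`paris`, `dot_1`, and so on) and the operator names are all hypothetical, chosen only to show basic expressions being translated by lookup and composite expressions being translated part by part.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Basic:
    """A basic (atomic) expression of some modality's language."""
    name: str


@dataclass(frozen=True)
class Composite:
    """A composite expression: an operator applied to sub-expressions."""
    operator: str
    parts: Tuple["Expr", ...]


Expr = Union[Basic, Composite]

# Hypothetical lexicon: basic natural-language terms -> basic graphical symbols.
BASIC_TRANSLATION = {
    "paris": "dot_1",
    "france": "region_1",
    "is_in": "inside",
}

# Hypothetical rule table: each source-language operation is paired
# with one operation of the object (graphical) language.
OPERATOR_TRANSLATION = {
    "predicate": "g_predicate",
}


def translate(expr: Expr) -> Expr:
    """Translate a source expression into the object modality.

    Basic expressions are looked up in the lexicon; composite expressions
    are translated by translating the operator and each part recursively.
    """
    if isinstance(expr, Basic):
        return Basic(BASIC_TRANSLATION[expr.name])
    return Composite(
        OPERATOR_TRANSLATION[expr.operator],
        tuple(translate(part) for part in expr.parts),
    )


# "Paris is in France" as a source-language expression tree.
sentence = Composite("predicate", (Basic("is_in"), Basic("paris"), Basic("france")))
drawing = translate(sentence)
```

Under these assumptions, `drawing` is a graphical expression with the same structure as `sentence`, which is the sense in which the translation preserves composition. An expression containing a term that is missing from the lexicon raises a `KeyError` in this sketch.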
The theory is aimed at defining interactive natural language and graphical computer systems and, as a consequence, the model focuses on these two modalities. However, the system is also used to express conceptual information in a logical fashion and, depending on the application, the circle labeled L might stand for first-order logic or any other symbolic language, as long as its syntax is well-defined and the language is given a model-theoretic semantic interpretation. The circles labeled L and G in Figure 1 stand for sets of expressions of the natural and graphical languages respectively, and the circle labeled P stands for the set of graphical symbols constituting the

1 To be published also in "Visual Representations and Interpretations", Springer-Verlag, 1998.

Pineda, L., & Garza, G. (1999). A Model for Multimodal Representation and Inference. In Visual Representations and Interpretations (pp. 375–386). Springer London. https://doi.org/10.1007/978-1-4471-0563-3_42
