Multimodal interactive systems that enable the combination of natural modalities such as speech, touch, and gesture make it easier and more effective for users to interact with applications and services, whether on mobile devices, in smart homes, or in cars. However, building such systems remains a complex and highly specialized task, in part because of the need to integrate multiple disparate and distributed system components. The task is further complicated by the proprietary input and output representations used by different modality processing components, such as speech recognizers, gesture recognizers, natural language understanding components, and dialog managers. The W3C EMMA standard addresses this challenge and simplifies multimodal application authoring by providing a common representation language for capturing the interpretation of user inputs and system outputs, along with associated metadata. In this chapter, we describe the EMMA markup language and demonstrate its capabilities through a series of illustrative examples.
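As a brief illustration of the kind of representation the abstract describes (this example is a sketch based on the EMMA 1.0 specification, not drawn from the chapter itself), an EMMA document wraps a modality processor's interpretation of a user input in standard annotations for mode, medium, confidence, and recognized tokens; the application-specific payload elements (origin, destination) are hypothetical:

```xml
<!-- Hypothetical EMMA 1.0 result for the spoken input
     "flights from boston to denver" -->
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="interp1"
      emma:medium="acoustic"
      emma:mode="voice"
      emma:confidence="0.85"
      emma:tokens="flights from boston to denver">
    <!-- Application payload: illustrative element names -->
    <origin>Boston</origin>
    <destination>Denver</destination>
  </emma:interpretation>
</emma:emma>
```

The `emma:` annotations are defined by the standard and are shared across modality components, while the payload inside `emma:interpretation` is left to the application, which is what allows disparate recognizers and understanding components to exchange results in a common format.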
Citation:
Johnston, M. (2016). Extensible multimodal annotation for intelligent interactive systems. In Multimodal Interaction with W3C Standards: Toward Natural User Interfaces to Everything (pp. 37–64). Springer International Publishing. https://doi.org/10.1007/978-3-319-42816-1_3