Multimodal Human-Robot Interaction Framework for a Personal Robot

by Javi Gorostiza, Ramon Barber, Alaa Khamis, Maria Malfaz, Rakel Pacheco, Rafael Rivas, Ana Corrales, Elena Delgado, Miguel Salichs
ROMAN 2006 The 15th IEEE International Symposium on Robot and Human Interactive Communication (2006)

Abstract

This paper presents a framework for multimodal human-robot interaction. The proposed framework is being implemented in a personal robot called Maggie, developed at RoboticsLab of the University Carlos III of Madrid for social interaction research. The control architecture of this personal robot is a hybrid control architecture called AD (automatic-deliberative) that incorporates an emotion control system (ECS). Maggie's main goal is to interact in a natural way and establish a peer-to-peer relationship with humans. To achieve this goal, a set of human-robot interaction skills is developed based on the proposed framework. The human-robot interaction skills involve tactile, visual, remote, voice and sound modes. Multimodal fusion and synchronization are also presented in this paper.

Available from ieeexplore.ieee.org

Javi F. Gorostiza, Ramón Barber, Alaa M. Khamis, María Malfaz,
Rakel Pacheco, Rafael Rivas, Ana Corrales, Elena Delgado and Miguel A. Salichs
Abstract—This paper presents a framework for multimodal
human-robot interaction. The proposed framework is being
implemented in a personal robot called Maggie, developed
at RoboticsLab of the University Carlos III of Madrid for
social interaction research. The control architecture of this
personal robot is a hybrid control architecture called AD
(Automatic-Deliberative) that incorporates an Emotion Control
System (ECS). Maggie’s main goal is to interact in a natural
way and establish a peer-to-peer relationship with humans. To
achieve this goal, a set of human-robot interaction skills is
developed based on the proposed framework. The human-robot
interaction skills involve tactile, visual, remote, voice and sound
modes. Multimodal fusion and synchronization are also
presented in this paper.
I. INTRODUCTION
In recent years, human-robot social interaction has attracted
considerable attention from the academic and research
communities. A social robot [1] has attitudes or
behaviors that take the interests, intentions or needs of
humans into account. This robot must be able to interact
with humans by following the social rules attached to its role.
Bartneck and Forlizzi define a social robot as an autonomous
or semiautonomous robot that interacts and communicates
with humans by following the behavioral norms expected
by the people with whom the robot is intended to interact
[2], page 2. Multimodality is considered a central and
unquestionable feature of human-robot social interaction.
Multimodality means providing the user with more than a
single mode of interaction. Multimodal interfaces allow users
to move seamlessly between different modes of interaction,
from visual to voice to touch, according to changes in context
or user preference. These interfaces have the advantage of
increased usability and accessibility. Usability determines the
overall utility of the system. It also determines the extent
to which an interface supports its users in completing their
tasks efficiently, effectively, and satisfactorily. In multimodal
interfaces, the weaknesses of one modality can be offset by
the strengths of another. For example, a person can command
a personal assistant robot through the gesture-based part of its
multimodal interface in a noisy environment where
verbal communication does not work efficiently. Accessibility
determines how easy it is for people to interact
with the robot. Multimodal interfaces provide increased
accessibility. For example, visually impaired users can rely
on the voice modality while hearing-impaired users can use
the visual modality. Multimodality raises the problem of
integration and synchronization of different communication
modalities both in perception and expression. Many robotic
platforms have been built with different design considerations,
control architectures and capabilities to study human-robot
social interaction. Kismet [3] and Leonardo [4], developed
at MIT, have an emotional reactive control architecture
that integrates the visual and audio modes. While Kismet is
able to react to human voice and movements, simulating
infant behavior, Leonardo can detect gaze and non-verbal
signals such as turn-taking in collaborative tasks. HERMES is
an experimental robot of anthropomorphic size and shape
developed at the Bundeswehr University of Munich. This robot
has a hybrid control architecture (deliberative and reactive)
and integrates a remote mode (via the Internet), dialog handling
with Natural Language Processing (NLP), and a combination of
vision and touch during the tasks of giving and taking objects
[5]. Robonaut is a joint DARPA-NASA project designed
to create a humanoid robot equivalent to humans during
spacewalk activities. This robot is equipped with human-like
hands and television camera eyes and has the option
of rolling around Earth. It has been designed to
assist astronauts in extra-vehicular activities. To do so, the
robot is capable of handling natural language dialogs. Its
control architecture allows enhanced skills such as perspective
taking. Two models of human perspective taking are used in
this architecture: jACT-R/S, based on human representation
models, and Polyscheme, based on human reasoning processes
[6]. Robovie is a humanoid robot that can communicate with
humans and is designed to participate in human society as a
partner [7]. This robot works by means of a behavior-based
architecture. It responds to tactile events with predefined
simple behaviors. Sparky is a social robot that uses both facial
expressions and movements to interact with humans [8].
Rubi is another anthropomorphic robot with a head and arms
designed for research on real-time social interaction between
robots and humans [9]. Robota is a sophisticated educational
toy robot designed to build human-robot social interactions
with children with motor and cognitive disabilities [10]. In
the Lino project, a robot head with a nice, cute appearance
and emotional feedback can be configured in such a way that
the human user enjoys the interaction and will more easily
accept possible misunderstandings [11]. Other social robot
designs rely on computer graphics and animation techniques.
Vikia [12], Valerie Roboceptionist [13], Grace (Graduate
robot Attending a ConferencE) and George [14] are some
examples of computer-graphics-based social robots. They all
handle natural language dialogs and merge image animation
with speech. All these projects aim to develop robots that
function more naturally and can be considered partners
for humans, not mere tools. These robots need to
interact with humans (and perhaps with each other) through
