Abstract
This paper presents a new interactive learning method for spoken-word acquisition through a human-machine audio-visual interface. During learning, the machine decides, using both speech and visual cues, whether an orally input word belongs to the lexicon it has learned so far. Learning is carried out on-line and incrementally, based on a combination of active and unsupervised learning principles. If the machine judges with high confidence that its decision is correct, it learns the statistical models of the word and a corresponding image category as its meaning in an unsupervised way; otherwise, it actively asks the user a question. The function used to estimate the degree of confidence is itself learned adaptively on-line. Experimental results show that combining active and unsupervised learning principles enables the machine and the user to adapt to each other, making the learning process more efficient. Copyright © 2008 The Institute of Electronics, Information and Communication Engineers.
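The decision rule sketched in the abstract can be illustrated in outline: combine speech and visual evidence into a confidence value, then either learn unsupervised or query the user. The following Python sketch is purely illustrative; the function names, the linear cue combination, and the fixed threshold are assumptions for exposition, not the paper's actual formulation (the paper learns its confidence function adaptively on-line).

```python
# Hypothetical sketch of the confidence-gated learning loop described in the
# abstract. All names and the threshold value are illustrative assumptions.

def confidence(speech_score: float, visual_score: float) -> float:
    """Combine speech and visual cues into one confidence value.

    The paper learns this function adaptively on-line; a fixed linear
    combination is used here only as a stand-in.
    """
    return 0.5 * speech_score + 0.5 * visual_score

def learning_step(speech_score: float, visual_score: float,
                  threshold: float = 0.8) -> str:
    """Decide between unsupervised learning and asking the user."""
    if confidence(speech_score, visual_score) >= threshold:
        # High confidence: learn word and image-category models
        # from the input alone (unsupervised).
        return "learn_unsupervised"
    # Low confidence: fall back to active learning and query the user.
    return "ask_user"

print(learning_step(0.9, 0.9))  # high-confidence input
print(learning_step(0.3, 0.4))  # ambiguous input
```

The threshold here is static for simplicity; adapting the confidence estimator itself, as the paper does, is what lets the machine and user co-adapt over time.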
Iwahashi, N. (2008). Interactive learning of spoken words and their meanings through an audio-visual interface. IEICE Transactions on Information and Systems, E91-D(2), 312–321. https://doi.org/10.1093/ietisy/e91-d.2.312