A method for photograph indexing using speech annotation

Jiayi Chen; Tele Tan; Philippe Mulhem

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001), vol. 2195, pp. 867-872

DOI: 10.1007/3-540-45453-5_113

Abstract

We explore the feasibility of using speech input to index a large volume of digital photographs. As a natural medium for communicating about images, speech can complement existing content-based techniques, thereby improving the reliability and usability of image retrieval systems. We introduce a methodology for image indexing based on speech annotation. Speech recognition tools such as Dragon NaturallySpeaking can be adapted to perform the main task of speech-to-text transcription. Using structured speech, as opposed to free-form speech, in a limited system can further boost transcription accuracy. We also introduce the idea of using N-best lists from the speech recognition output to improve recognition performance. The transcribed text is used to populate the metadata of the corresponding photograph. A photo query strategy is implemented to assess the performance of the proposed technique for photo indexing and retrieval.
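
The pipeline outlined in the abstract (structured speech, N-best rescoring, metadata population, query) can be illustrated with a minimal sketch. This is not the authors' implementation: the field names, the vocabularies, the "field value" annotation grammar, and the selection rule (prefer the hypothesis with the most valid fields) are all assumptions made for illustration, and the N-best lists are hard-coded strings rather than real recognizer output.

```python
from dataclasses import dataclass, field

# Hypothetical annotation fields and closed vocabularies (assumed, not from the paper).
VOCAB = {
    "who": {"alice", "bob"},
    "where": {"paris", "beach", "home"},
    "event": {"birthday", "wedding", "holiday"},
}


@dataclass
class Photo:
    filename: str
    metadata: dict = field(default_factory=dict)


def parse_structured(hypothesis):
    """Parse a 'field value field value ...' string, skipping tokens that do not fit."""
    tokens = hypothesis.lower().split()
    parsed = {}
    i = 0
    while i < len(tokens) - 1:
        key, value = tokens[i], tokens[i + 1]
        if key in VOCAB and value in VOCAB[key]:
            parsed[key] = value
            i += 2  # consume the field name and its value
        else:
            i += 1  # skip one token and try again
    return parsed


def pick_from_nbest(nbest):
    """Return the parsed fields of the hypothesis with the most valid fields.

    Ties go to the earlier (higher-ranked) hypothesis in the N-best list.
    """
    best = {}
    for hypothesis in nbest:
        parsed = parse_structured(hypothesis)
        if len(parsed) > len(best):
            best = parsed
    return best


def annotate(photo, nbest):
    """Populate the photo's metadata from the selected hypothesis."""
    photo.metadata.update(pick_from_nbest(nbest))


def query(photos, **criteria):
    """Return filenames of photos whose metadata matches every criterion."""
    return [
        p.filename
        for p in photos
        if all(p.metadata.get(k) == v for k, v in criteria.items())
    ]


photos = [Photo("img1.jpg"), Photo("img2.jpg")]
# The top hypothesis misrecognizes the name; the second one fits the grammar fully.
annotate(photos[0], ["who alas where paris event birthday",
                     "who alice where paris event birthday"])
annotate(photos[1], ["who bob where beach event holiday"])
print(query(photos, who="alice"))
```

In this toy run the second-ranked hypothesis is selected for the first photo because it yields three valid fields instead of two, which is one simple way an N-best list combined with a constrained grammar might recover from a top-1 recognition error.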

Cite

Chen, J., Tan, T., & Mulhem, P. (2001). A method for photograph indexing using speech annotation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2195, pp. 867–872). Springer Verlag. https://doi.org/10.1007/3-540-45453-5_113
