It has been suggested that combining content-based indexing with automatically generated temporal metadata might improve search and browsing of recordings of computer-mediated collaborative activities, such as on-line meetings, which are characterised by extensive multimodal communication. This paper presents an analytical evaluation of the effectiveness of these techniques as implemented through automatic speech recognition and temporal mapping. In particular, it assesses the extent to which this strategy can uncover contextual relationships between audio and text segments in recorded remote meetings. Results show that even simple temporal mapping can effectively support retrieval of recorded audio segments, improve retrieval performance in situations where speech recognition alone would exhibit prohibitively high word error rates, and provide a basic form of semantic adaptation.