Abstract
Typically, the output of Automatic Speech Recognition (ASR) is a mere sequence of words. This view may be sufficient for some tasks, whereas others require a more structured approach. This thesis presents a framework for identifying deep, underlying structure in report dictations. Identifying structural elements, such as headings, sections, and enumerations, is an important step towards automatic post-processing of dictated speech. The contributions of this thesis include a generic approach that can be integrated seamlessly with existing ASR solutions and provides structured output, as well as a freely available Conditional Random Field (CRF) toolkit that forms the basis of this approach and may also be applicable to numerous other problems.
Jancsary, J. (2008). Recognizing Structure in Report Transcripts.