Parse decoration of the word sequence in the speech-to-text machine-translation pipeline

  • Kahn J


Parsing, or the extraction of syntactic structure from text, is appealing to natural language processing (NLP) engineers and researchers. Parsing provides an opportunity to consider information about word sequence and relatedness beyond simple adjacency. This dissertation uses automatically derived syntactic structure (parse decoration) to improve the performance and evaluation of large-scale NLP systems that have (in general) used only word-sequence level measures to quantify success. In particular, this work focuses on parse structure in the context of large-vocabulary automatic speech recognition (ASR) and statistical machine translation (SMT) in English and (in translation) Mandarin Chinese. The research here explores three characteristics of statistical syntactic parsing: dependency structure, constituent structure, and parse uncertainty, the last making use of the parser's ability to generate an M-best list of parse hypotheses. Parse structure predictions are applied to ASR to improve word-error rate over a baseline non-syntactic (sequence-only) language model (achieving 6-13% of possible error reduction). Critical to this success is the joint reranking of an N×M-best list of N ASR hypothesis transcripts and M-best parse hypotheses (for each transcript). Jointly reranking the N×M lists is also demonstrated to be useful in choosing a high-quality parse from these transcriptions. In SMT, this work demonstrates expected dependency pair match (EDPM), a new mechanism for evaluating the quality of SMT translation hypotheses by comparing them to reference translations. EDPM, which makes direct use of parse dependency structure in its measurement, is demonstrated to correlate better with human measurements of translation quality than the competing (and widely used) evaluation metrics BLEU4 and translation edit rate.
Finally, this work explores how syntactic constituents may predict or improve the behavior of unsupervised word-aligners, a core component of SMT systems, over a collection of Chinese-English parallel text with reference alignment labels. Statistical word-alignment is improved over several machine-generated alignments by exploiting the coherence of certain parse constituent structures to identify source-language regions where a high-recall aligner may be trusted. Taken together, these diverse results across ASR and SMT point to the utility of incorporating parse information into large-scale (and generally word-sequence oriented) NLP systems and demonstrate several approaches for doing so.
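
The joint reranking of N transcripts and their M-best parses can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Candidate` fields, the log-scores, the weights, and the simple weighted-sum scoring are hypothetical, and the abstract does not describe the dissertation's actual features or learning method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One cell of the N x M list: a transcript paired with one of its parses."""
    transcript: str     # one of the N ASR hypothesis transcripts
    parse: str          # one of the M parse hypotheses for that transcript
    asr_score: float    # hypothetical log-score from the recognizer
    parse_score: float  # hypothetical log-score from the parser


def joint_rerank(candidates, asr_weight=1.0, parse_weight=1.0):
    """Sort (transcript, parse) pairs best-first by a weighted sum of scores.

    The weighted sum is an illustrative stand-in for a trained reranker.
    """
    def combined(c):
        return asr_weight * c.asr_score + parse_weight * c.parse_score

    return sorted(candidates, key=combined, reverse=True)


# N = 2 transcripts, M = 2 parses each (all values invented for illustration).
candidates = [
    Candidate("i saw her duck", "(S (NP i) (VP saw (NP her duck)))", -4.0, -2.5),
    Candidate("i saw her duck", "(S (NP i) (VP saw (S (NP her) (VP duck))))", -4.0, -3.0),
    Candidate("eye saw her duck", "(S (NP eye) (VP saw (NP her duck)))", -3.8, -6.0),
    Candidate("eye saw her duck", "(FRAG eye saw her duck)", -3.8, -7.5),
]

best = joint_rerank(candidates)[0]
print(best.transcript)
print(best.parse)
```

In this toy data the recognizer score alone would prefer "eye saw her duck", while the joint score prefers "i saw her duck" together with its higher-scoring parse, so a single pass selects both a transcript and a parse.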


  • Jeremy Gillmor Kahn
