NLP on spoken documents without ASR

Dredze M, Jansen A, Coppersmith G, et al.


There is considerable interest in interdisciplinary combinations of automatic speech recognition (ASR), machine learning, natural language processing, text classification and information retrieval. Many of these components, especially ASR, are often based on considerable linguistic resources. We would like to be able to process spoken documents with few (if any) resources. Moreover, connecting black boxes in series tends to multiply errors, especially when the key terms are out-of-vocabulary (OOV). The proposed alternative applies text processing directly to the speech without a dependency on ASR. The method finds long (~1 sec) repetitions in speech, and clusters them into pseudo-terms (roughly phrases). Document clustering and classification work surprisingly well on pseudo-terms; performance on a Switchboard task approaches a baseline using gold standard manual transcriptions.
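As a rough illustration of the last step only, the sketch below treats each spoken document as a bag of pseudo-term IDs and compares documents by cosine similarity. It is not the authors' implementation: the discovery of repeated audio segments and their clustering into pseudo-terms is assumed to have happened already, and the IDs (`pt7`, `pt12`, ...), document names, and helper functions are hypothetical.

```python
from collections import Counter
from math import sqrt


def bag_of_pseudo_terms(term_ids):
    """Count how often each pseudo-term ID occurs in one spoken document."""
    return Counter(term_ids)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data: each document is the sequence of pseudo-term IDs
# that an (assumed) upstream repetition-discovery step found in its audio.
docs = {
    "call_1": ["pt7", "pt7", "pt12", "pt3"],
    "call_2": ["pt7", "pt12", "pt12"],
    "call_3": ["pt40", "pt41", "pt40"],
}
vectors = {name: bag_of_pseudo_terms(ids) for name, ids in docs.items()}


def nearest(name):
    """Return the other document most similar to `name`."""
    others = (other for other in vectors if other != name)
    return max(others, key=lambda other: cosine(vectors[name], vectors[other]))
```

Here `nearest("call_1")` picks `call_2`, because the two share pseudo-terms, while `call_3` shares none. The same count vectors could be fed to any standard text clustering or classification method in place of word counts.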


Find this document

  • ISBN: 1932432868
  • SCOPUS: 2-s2.0-80053245896
  • PUI: 362643522
  • SGR: 80053245896


  • Mark Dredze
  • Aren Jansen
  • Glen Coppersmith
  • Ken Church
