A Theory of Unsupervised Speech Recognition

Liming Wang; Mark Hasegawa-Johnson; Chang D. Yoo

Conference Proceedings, Open Access

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2023), Vol. 1, pp. 1192–1215

DOI: 10.18653/v1/2023.acl-long.67

Abstract

Unsupervised speech recognition (ASR-U) is the problem of learning automatic speech recognition (ASR) systems from unpaired speech-only and text-only corpora. While various algorithms exist to solve this problem, a theoretical framework for studying their properties and for addressing issues such as sensitivity to hyperparameters and training instability is missing. In this paper, we propose a general theoretical framework for studying the properties of ASR-U systems, based on random matrix theory and the theory of neural tangent kernels. This framework allows us to prove various learnability conditions and sample complexity bounds for ASR-U. Extensive ASR-U experiments on synthetic languages with three classes of transition graphs provide strong empirical evidence for our theory (code available at cactuswiththoughts/UnsupASRTheory.git).

Citation (APA)

Wang, L., Hasegawa-Johnson, M., & Yoo, C. D. (2023). A Theory of Unsupervised Speech Recognition. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 1192–1215). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-long.67
