Robust speaker modeling based on constrained nonnegative tensor factorization

Abstract

Nonnegative tensor factorization is an extension of nonnegative matrix factorization (NMF) to the multilinear case, where nonnegativity constraints are imposed on the PARAFAC/Tucker model. In this paper, to identify speakers in noisy environments, we propose a new method based on the PARAFAC model called constrained Nonnegative Tensor Factorization (cNTF). The speech signal is encoded as a general higher-order tensor in order to learn the basis functions from multiple interrelated feature subspaces. We simulate a cochlear-like peripheral auditory stage, motivated by the human auditory perception mechanism. A sparse speech feature representation is extracted by cNTF and used for robust speaker modeling. Orthogonality and nonsmooth sparseness control constraints are further imposed on the PARAFAC model in order to preserve the useful information of each feature subspace in the higher-order tensor. An alternating projection algorithm is applied to obtain a stable solution. Experimental results demonstrate that our method can improve recognition accuracy, particularly in noisy environments. © 2008 Springer-Verlag Berlin Heidelberg.
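To make the PARAFAC idea in the abstract concrete, the following is a minimal sketch of a plain nonnegative PARAFAC factorization of a three-way tensor using multiplicative updates. It is not the authors' cNTF: the orthogonality and nonsmooth sparseness constraints and the alternating projection algorithm mentioned above are omitted, and the function names (`nonneg_parafac`, `reconstruct`), the tensor shape, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def reconstruct(A, B, C):
    """Rebuild the tensor sum_r a_r (outer) b_r (outer) c_r from the factors."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def nonneg_parafac(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Unconstrained nonnegative PARAFAC via multiplicative updates.

    Hypothetical helper for illustration only; it minimizes the
    Frobenius reconstruction error and keeps all factors nonnegative.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Strictly positive random initialization of the three factor matrices.
    A = rng.random((I, rank)) + eps
    B = rng.random((J, rank)) + eps
    C = rng.random((K, rank)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update multiplies a factor by a ratio of nonnegative terms,
        # so nonnegativity is preserved without an explicit projection.
        A *= np.einsum("ijk,jr,kr->ir", X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum("ijk,ir,kr->jr", X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum("ijk,ir,jr->kr", X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
        # Relative reconstruction error after this sweep.
        errors.append(np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X))
    return A, B, C, errors


# Synthetic nonnegative rank-3 tensor standing in for a speech feature tensor.
rng = np.random.default_rng(1)
X = reconstruct(rng.random((6, 3)), rng.random((5, 3)), rng.random((4, 3)))
A, B, C, errors = nonneg_parafac(X, rank=3)
print(f"relative error: first sweep {errors[0]:.4f}, last sweep {errors[-1]:.4f}")
```

In a speaker modeling setting, the rows of one factor matrix would play the role of learned basis functions for one feature subspace; which tensor mode corresponds to which subspace depends on how the speech signal is encoded, which the abstract does not specify.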

Cite

Wu, Q., Zhang, L., & Shi, G. (2008). Robust speaker modeling based on constrained nonnegative tensor factorization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5263 LNCS, pp. 11–20). Springer Verlag. https://doi.org/10.1007/978-3-540-87732-5_2
