Independent Components in Text

  • Kolenda T
  • Hansen L
  • Sigurdsson S

Abstract

In this communication we analyze the feasibility of independent component analysis (ICA) for dimensional reduction and representation of word histograms. The analysis is carried out in a likelihood framework which allows estimates of the loadings (source signals), the mixing matrix and the noise level. In the face of noisy signals, the estimated sources are non-linear functionals of the observed signals, in contrast to the linear noise-free case. We also discuss the generalizability of the estimated models and show that an empirical test error estimate may be used to optimize model dimensionality, in particular the optimal number of sources. When applied to word histograms, ICA is shown to produce representations that are better aligned with the group structure in the text data than the LSA.
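
The contrast between the noisy and the noise-free case can be made concrete with a generic noisy linear mixing model. The sketch below is an illustration under stated assumptions, not the chapter's own formulation: the symbols X (word histograms, terms by documents), A (mixing matrix), S (sources), the isotropic Gaussian noise with variance sigma^2, and the factorized source prior p(S) are all notation assumed here.

```latex
% Assumed generative model (notation is illustrative, not taken from the chapter):
% X: T x N word-histogram matrix, A: T x K mixing matrix, S: K x N sources.
\[
  X = A S + \Gamma, \qquad \Gamma_{tn} \sim \mathcal{N}(0, \sigma^2)
\]

% Likelihood of the mixing matrix and noise level, with the sources integrated out:
\[
  p(X \mid A, \sigma^2) = \int p(X \mid A, S, \sigma^2)\, p(S)\, dS
\]

% Noise-free case with square, invertible A: the sources are linear in X.
\[
  \hat{S} = A^{-1} X
\]

% Noisy case: a MAP source estimate balances reconstruction error against the prior.
% For a non-Gaussian p(S) the maximizer is in general a non-linear function of X.
\[
  \hat{S} = \arg\max_{S} \left[ -\frac{1}{2\sigma^2} \lVert X - A S \rVert_F^2 + \log p(S) \right]
\]
```

Under these assumptions, the number of sources K could be chosen by comparing the negative log-likelihood on held-out documents across candidate values of K, which is one way to read the abstract's "empirical test error estimate".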

Cite

Kolenda, T., Hansen, L. K., & Sigurdsson, S. (2000). Independent Components in Text (pp. 235–256). https://doi.org/10.1007/978-1-4471-0443-8_13
