Sparse additive generative models of text

  • Eisenstein J
  • Ahmed A
  • Xing E

Abstract

Generative models of text typically associate a multinomial with every class label or topic. Even in simple models this requires the estimation of thousands of parameters; in multifaceted latent variable models, standard approaches require additional latent “switching” variables for every token, complicating inference. In this paper, we propose an alternative generative model for text. The central idea is that each class label or latent topic is endowed with a model of the deviation in log-frequency from a constant background distribution. This approach has two key advantages: we can enforce sparsity to prevent overfitting, and we can combine generative facets through simple addition in log space, avoiding the need for latent switching variables. We demonstrate the applicability of this idea to a range of scenarios: classification, topic modeling, and more complex multifaceted generative models.
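The additive idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `word_distribution`, the toy vocabulary, and all numeric values are assumptions made up for illustration. Each class or facet is stored as a sparse map of log-frequency deviations, the deviations are added to a shared background in log space, and the result is normalized into a distribution over words.

```python
import math

def word_distribution(background_log_freq, *deviations):
    """Add sparse log-frequency deviations to a background, then normalize.

    background_log_freq: list of log-frequencies, one per vocabulary word.
    deviations: any number of dicts {word_index: deviation}; words that are
                absent from a dict have deviation 0 (this is the sparsity).
    """
    scores = list(background_log_freq)
    for dev in deviations:
        for idx, delta in dev.items():
            scores[idx] += delta          # facets combine by simple addition
    peak = max(scores)                    # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical toy data (not from the paper).
vocab = ["the", "of", "game", "vote", "win"]
background = [math.log(p) for p in (0.5, 0.3, 0.08, 0.07, 0.05)]
sports = {2: 1.5, 4: 1.0}      # class deviation: boosts "game" and "win"
enthusiastic = {4: 0.8}        # second facet: boosts "win" further

p_background = word_distribution(background)
p_sports = word_distribution(background, sports)
p_sports_enth = word_distribution(background, sports, enthusiastic)

for word, p0, p1, p2 in zip(vocab, p_background, p_sports, p_sports_enth):
    print(f"{word:5s} {p0:.3f} {p1:.3f} {p2:.3f}")
```

In this sketch, words with no deviation keep their relative background frequencies, and a second facet is added without any per-token variable choosing between facets.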

Authors

  • Jacob Eisenstein

  • Amr Ahmed

  • Eric P. Xing
