PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts

  • Liu C
  • Dahlmeier D
  • Ng H

  • 31 Mendeley users have this article in their library.
  • 16 citations of this article.


We present PEM, the first fully automatic metric to evaluate the quality of paraphrases, and consequently, that of paraphrase generation systems. Our metric is based on three criteria: adequacy, fluency, and lexical dissimilarity. The key component in our metric is a robust and shallow semantic similarity measure based on pivot language N-grams that allows us to approximate adequacy independently of lexical similarity. Human evaluation shows that PEM achieves high correlation with human judgments.

Find this document

  • SCOPUS: 2-s2.0-80053289942
  • SGR: 80053289942
  • ISBN: 1932432868
  • PUI: 362643567


  • C Liu
  • Daniel Dahlmeier
  • HT Ng
