Predicting Long-Term Citations from Short-Term Linguistic Influence

Abstract

A standard measure of the influence of a research paper is the number of times it is cited. However, papers may be cited for many reasons, and citation count offers limited information about the extent to which a paper affected the content of subsequent publications. We therefore propose a novel method to quantify linguistic influence in timestamped document collections. There are two main steps: first, identify lexical and semantic changes using contextual embeddings and word frequencies; second, aggregate information about these changes into per-document influence scores by estimating a high-dimensional Hawkes process with a low-rank parameter matrix. We show that this measure of linguistic influence is predictive of future citations: the estimate of linguistic influence from the two years after a paper's publication is correlated with and predictive of its citation count in the following three years. This is demonstrated using an online evaluation with incremental temporal training/test splits, in comparison with a strong baseline that includes predictors for initial citation counts, topics, and lexical features.
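The second step can be illustrated with a minimal sketch of a multivariate Hawkes intensity whose excitation matrix is constrained to be low rank. This is not the authors' implementation: the exponential kernel, the factorization A = U Vᵀ, the column-sum influence score, and all names (`low_rank_excitation`, `intensity`, `influence_scores`, `decay`) are assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def low_rank_excitation(U, V):
    """Build A = U @ V.T (rank <= number of columns of U and V).

    A[i, j] is how strongly an event from source j raises the rate of
    events in dimension i. Non-negative U and V keep A non-negative.
    """
    return U @ V.T

def intensity(t, events, mu, A, decay=1.0):
    """Hawkes intensity with an exponential kernel (an assumed kernel choice).

    lambda_i(t) = mu_i + sum over past events (t_k, j_k) of
                  A[i, j_k] * decay * exp(-decay * (t - t_k))
    `events` is a list of (time, source_index) pairs.
    """
    lam = np.asarray(mu, dtype=float).copy()
    for t_k, j_k in events:
        if t_k < t:  # only events strictly before t contribute
            lam += A[:, j_k] * decay * np.exp(-decay * (t - t_k))
    return lam

def influence_scores(A):
    """Hypothetical per-source score: total excitation a source exerts on all dimensions."""
    return A.sum(axis=0)

# Toy example: 3 dimensions, rank-2 factors (values are illustrative, not fitted).
U = np.array([[0.5, 0.0], [0.0, 0.5], [0.5, 0.5]])
V = np.array([[1.0, 0.0], [0.0, 1.0], [0.2, 0.2]])
A = low_rank_excitation(U, V)
mu = np.full(3, 0.1)
events = [(0.0, 0), (0.5, 1)]
lam_now = intensity(1.0, events, mu, A)
scores = influence_scores(A)
```

The low-rank factorization reduces the number of free parameters from n² to 2nr for n dimensions and rank r, which is what makes a high-dimensional process feasible to estimate; in practice the factors would be fitted by maximizing the Hawkes likelihood rather than set by hand.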

Soni, S., Bamman, D., & Eisenstein, J. (2022). Predicting Long-Term Citations from Short-Term Linguistic Influence. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 5729–5745). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.418
