Accurate prediction of scientific impact is important for scientists, academic recommender systems, and granting organizations alike. Existing approaches rely on many years of leading citation values to predict a scientific paper's citations (a proxy for impact), even though most papers make their largest contributions in the first few years after they are published. In this paper, we tackle a new problem: predicting a new paper's citation time series from the date of publication (i.e., without leading values). We propose HINTS, a novel end-to-end deep learning framework that converts citation signals from dynamic heterogeneous information networks (DHIN) into citation time series. HINTS imputes pseudo-leading values for a paper in the years before it is published from DHIN embeddings, and then transforms these embeddings into the parameters of a formal model that can predict citation counts immediately after publication. Empirical analysis on two real-world datasets from Computer Science and Physics shows that HINTS is competitive with baseline citation prediction models. While we focus on citations, our approach generalizes to other "cold start" time series prediction tasks where relational data is available and accurate prediction in early timestamps is crucial.
CITATION STYLE
Jiang, S., Koch, B., & Sun, Y. (2021). HINTS: Citation time series prediction for new publications via dynamic heterogeneous information network embedding. In The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021 (pp. 3158–3167). Association for Computing Machinery, Inc. https://doi.org/10.1145/3442381.3450107