Author name disambiguation (AND) is one of the most important problems in scientometrics, and it has become increasingly challenging with the rapid growth of academic digital libraries. Existing approaches to this task rely heavily on complex clustering-like architectures, and they usually either assume that the number of clusters is known beforehand or predict it with a separate model, which makes the architectures increasingly complex and time-consuming. In this paper, we combine simple neural networks with two sets of heuristic rules to explore strong baselines for the author name disambiguation problem without any prior knowledge or estimate of cluster size, which frees the model from unnecessary complexity. On the popular benchmark dataset AMiner, our solution significantly outperforms several state-of-the-art methods in both performance and efficiency, and it still achieves performance comparable to that of many complex models when using only a group of rules. Experimental results also indicate that the gains from sophisticated deep learning techniques are quite modest for the author name disambiguation problem.
CITATION STYLE
Zhang, Z., Yu, B., Liu, T., & Wang, D. (2020). Strong Baselines for Author Name Disambiguation with and Without Neural Networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12084 LNAI, pp. 369–381). Springer. https://doi.org/10.1007/978-3-030-47426-3_29