Top-N item recommendation from implicit feedback has been a widely studied task. Although much progress has been made with neural methods, there is growing concern about the appropriate evaluation of recommendation algorithms. In this paper, we revisit alternative experimental settings for evaluating top-N recommendation algorithms, considering three important factors: dataset splitting, sampled metrics, and domain selection. We select eight representative recommendation algorithms (covering both traditional and neural methods) and conduct extensive experiments on a very large dataset. By carefully revisiting the different options, we make several important findings on the three factors, which directly yield useful suggestions on how to appropriately set up experiments for top-N item recommendation.
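To make the "sampled metrics" factor concrete, the following is a minimal, hypothetical Python sketch (not the authors' code): it contrasts full-ranking HR@10, where a held-out item is ranked against the entire item catalog, with the sampled setting, where the item is ranked against only 99 randomly drawn negatives. The function name `hit_rate_at_k`, the random scores, and the sample size of 99 are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def hit_rate_at_k(scores, target, candidates, k=10):
    # Rank candidates by descending model score; hit if target is in the top-k.
    ranked = sorted(candidates, key=lambda i: -scores[i])
    return int(target in ranked[:k])

n_items = 10_000
scores = rng.random(n_items)            # stand-in for one user's item scores
target = int(rng.integers(n_items))     # hypothetical held-out positive item

# Full ranking: the target competes with every item in the catalog.
hr_full = hit_rate_at_k(scores, target, range(n_items))

# Sampled metric: the target competes with only 99 random negatives,
# a common shortcut whose reliability is one of the factors under study.
negatives = rng.choice(
    [i for i in range(n_items) if i != target], size=99, replace=False
)
hr_sampled = hit_rate_at_k(scores, target, list(negatives) + [target])

print(f"HR@10, full ranking: {hr_full}; sampled (99 negatives): {hr_sampled}")
```

Because the sampled candidate set is so much smaller than the full catalog, the two settings can report different hit rates for the same model, which is precisely the kind of discrepancy an evaluation study of this sort examines.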
Zhao, W. X., Chen, J., Wang, P., Gu, Q., & Wen, J. R. (2020). Revisiting Alternative Experimental Settings for Evaluating Top-N Item Recommendation Algorithms. In International Conference on Information and Knowledge Management, Proceedings (pp. 2329–2332). Association for Computing Machinery. https://doi.org/10.1145/3340531.3412095