Abstract
We argue that the current setting of semi-supervised learning on graphs can lead to unfair comparisons, because it carries a risk of over-tuning hyper-parameters on the validation set. In this paper, we highlight the significant influence of hyper-parameter tuning, which leverages label information in the validation set to improve performance. To explore the limit of over-tuning, we propose ValidUtil, an approach that fully exploits the validation labels through an extra group of hyper-parameters. With ValidUtil, even GCN can easily reach 85.8% accuracy on Cora. To avoid over-tuning, we merge the training and validation sets and construct an i.i.d. graph benchmark (IGB) consisting of 4 datasets; each dataset contains 100 i.i.d. graphs sampled from a large graph to reduce evaluation variance. Our experiments suggest that IGB is a more stable benchmark than previous datasets for semi-supervised learning on graphs. Our code and data are released at https://github.com/THUDM/IGB/.
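The core idea behind ValidUtil, as the abstract describes it, is that hyper-parameter search itself can leak validation labels into the model. A minimal illustrative sketch of this mechanism (not the authors' exact implementation; `tune_pseudo_labels` and the coordinate-wise search are assumptions for illustration): introduce one pseudo-label per validation node as a "hyper-parameter" and tune it against validation accuracy, which trivially recovers the true validation labels.

```python
# Illustrative sketch only: one pseudo-label per validation node is treated
# as a hyper-parameter and "tuned" by maximizing validation accuracy.
# Because the tuning objective IS the validation labels, the search
# recovers them exactly -- validation label information leaks into training.

def tune_pseudo_labels(val_nodes, num_classes, val_accuracy):
    """Coordinate-wise search: for each validation node, keep the
    pseudo-label that maximizes validation accuracy."""
    pseudo = {v: 0 for v in val_nodes}
    for v in val_nodes:
        best_c, best_acc = 0, -1.0
        for c in range(num_classes):
            pseudo[v] = c
            acc = val_accuracy(pseudo)
            if acc > best_acc:
                best_c, best_acc = c, acc
        pseudo[v] = best_c
    return pseudo

# Toy demo: hidden validation labels that the "tuning" will recover.
hidden = {0: 2, 1: 0, 2: 1}

def val_accuracy(pseudo):
    # Fraction of pseudo-labels matching the hidden validation labels --
    # exactly what hyper-parameter search on the validation set optimizes.
    return sum(pseudo[v] == hidden[v] for v in hidden) / len(hidden)

recovered = tune_pseudo_labels(list(hidden), num_classes=3,
                               val_accuracy=val_accuracy)
print(recovered)  # -> {0: 2, 1: 0, 2: 1}: validation labels fully recovered
```

The recovered pseudo-labels could then be fed back as extra supervision, which is why merging the training and validation sets (as in IGB) removes this loophole.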
Li, Z., Ding, M., Li, W., Wang, Z., Zeng, Z., Cen, Y., & Tang, J. (2022). Rethinking the Setting of Semi-supervised Learning on Graphs. In IJCAI International Joint Conference on Artificial Intelligence (pp. 3243–3249). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2022/450