Augmentations in Graph Contrastive Learning: Current Methodological Flaws & Towards Better Practices

Abstract

Graph classification has a wide range of applications in bioinformatics, the social sciences, automated fake-news detection, web document classification, and more. In many practical scenarios, including web-scale applications, labels are scarce or expensive to obtain, making unsupervised learning a natural paradigm; its performance, however, often lags behind that of supervised learning. Recently, contrastive learning (CL) has enabled unsupervised computer-vision models to perform comparably to supervised ones. Theoretical and empirical analyses of visual CL frameworks find that leveraging large datasets and task-relevant augmentations is essential to their success. Interestingly, graph CL frameworks report high performance while using orders-of-magnitude smaller datasets and employing domain-agnostic graph augmentations (DAGAs) that can corrupt task-relevant information. Motivated by these discrepancies, we seek to determine why existing graph CL frameworks continue to perform well, and we identify flawed practices in graph data augmentation and in popular graph CL evaluation protocols. We find that DAGAs can destroy task-relevant information and harm the model's ability to learn discriminative representations. We also show that on small benchmark datasets, the inductive bias of graph neural networks can significantly compensate for these limitations, whereas on larger graph classification tasks commonly used DAGAs perform poorly. Based on our findings, we propose better practices and sanity checks for future research and applications, including adhering to principles established in visual CL when designing context-aware graph augmentations. For example, in graph-based document classification, which can be used for better web search, we show that task-relevant augmentations improve accuracy by up to 20%.
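
To make the notion of a domain-agnostic graph augmentation concrete, below is a minimal sketch of one commonly used DAGA, uniform random edge dropping (popularized by frameworks such as GraphCL). The function name drop_edges and the edge-list representation are illustrative assumptions, not the paper's implementation; the point is that uniform dropping is blind to which edges carry label-relevant structure.

```python
import random

def drop_edges(edges, drop_prob=0.2, seed=None):
    """Uniformly drop each edge with probability `drop_prob`.

    A typical domain-agnostic graph augmentation (DAGA): because edges
    are removed at random, it can sever task-relevant substructure
    (e.g., a functional group in a molecule) as readily as noise.
    """
    rng = random.Random(seed)
    return [e for e in edges if rng.random() >= drop_prob]

# Two stochastic "views" of the same graph, as paired in contrastive learning.
graph = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
view_a = drop_edges(graph, drop_prob=0.2, seed=1)
view_b = drop_edges(graph, drop_prob=0.2, seed=2)
print(view_a)
print(view_b)
```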

Cite (APA)

Trivedi, P., Lubana, E. S., Yan, Y., Yang, Y., & Koutra, D. (2022). Augmentations in Graph Contrastive Learning: Current Methodological Flaws & Towards Better Practices. In WWW 2022 - Proceedings of the ACM Web Conference 2022 (pp. 1538–1549). Association for Computing Machinery, Inc. https://doi.org/10.1145/3485447.3512200
