Transformers have shown great potential for modeling long-term dependencies in natural language processing and computer vision. However, few studies have applied transformers to graphs, which is challenging due to the poor scalability of the attention mechanism and the under-exploration of graph inductive bias. To bridge this gap, we propose a Lite Graph Transformer (LiteGT) that learns on arbitrary graphs efficiently. First, a node sampling strategy is proposed to sparsify the nodes considered in self-attention, requiring only O(N log N) time. Second, we devise two kernelization approaches to form two-branch attention blocks, which not only leverage graph-specific topology information but also further reduce computation to O((1/2)N log N). Third, nodes are updated with different attention schemes during training, largely mitigating the over-smoothing problem as model layers deepen. Extensive experiments demonstrate that LiteGT achieves competitive performance on both node classification and link prediction on datasets with millions of nodes. Specifically, the Jaccard + Sampling + Dim. reducing setting reduces computation by more than 100x and halves the model size without performance degradation.
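The abstract attributes the O(N log N) cost to restricting self-attention to a sampled subset of nodes. Below is a minimal, illustrative sketch of that general idea, not LiteGT's actual sampling strategy or kernelization: it uses uniform random sampling as a placeholder, and all names (e.g. sampled_self_attention, num_samples) are hypothetical.

```python
# Sketch: attending to ~log N sampled key nodes instead of all N nodes
# drops the score matrix from (N, N) to (N, log N), i.e. ~O(N log N) work.
import math
import torch
import torch.nn.functional as F

def sampled_self_attention(x, w_q, w_k, w_v, num_samples=None):
    """x: (N, d) node features; w_q/w_k/w_v: (d, d) projection weights."""
    n, _ = x.shape
    if num_samples is None:
        num_samples = max(1, int(math.ceil(math.log2(n))))  # ~log N keys
    # Uniform sampling is a placeholder; the paper proposes a dedicated strategy.
    idx = torch.randperm(n)[:num_samples]
    q = x @ w_q                                    # (N, d)
    k = x[idx] @ w_k                               # (log N, d)
    v = x[idx] @ w_v                               # (log N, d)
    scores = (q @ k.t()) / math.sqrt(q.shape[-1])  # (N, log N), not (N, N)
    attn = F.softmax(scores, dim=-1)
    return attn @ v                                # (N, d) updated features

# Usage: 1000 nodes with 64-dimensional features.
x = torch.randn(1000, 64)
w = [torch.randn(64, 64) * 0.1 for _ in range(3)]
out = sampled_self_attention(x, *w)
print(out.shape)  # torch.Size([1000, 64])
```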
CITATION STYLE
Chen, C., Tao, C., & Wong, N. (2021). LiteGT: Efficient and Lightweight Graph Transformers. In International Conference on Information and Knowledge Management, Proceedings (pp. 161–170). Association for Computing Machinery. https://doi.org/10.1145/3459637.3482272