Table representation learning using heterogeneous graph embedding

5Citations
Citations of this article
12Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Tables, especially when having complex layouts, contain rich semantic information. However, effectively learning from tables to uncover such semantic information remains challenging. The rapid progress in natural language processing does not necessarily correspond to equivalent advancements in table parsing, which often requires joint visual and language modeling. Indeed, humans can quickly derive semantic meaning from table entries by associating them with corresponding column and/or row headers. Motivated by this observation, we propose a new heterogeneous Graph-based Table Representation Learning (GTRL) framework. GTRL combines graph-based visual modeling with sequence-based language modeling to learn granular per-cell embeddings that are sensitive to the semantic meaning of cells within their corresponding table context. We systematically evaluate the proposed GTRL framework using two datasets: a new adhesive table benchmark comprising complex tables extracted from industrial documents for learning per-entry semantics, and a publicly available large-scale dataset that enables learning header semantics from column tables. Experimental results demonstrate the competitive performance of the proposed GTRL, which often exhibits reduced computational complexity compared to state-of-the-art table representation learning models.

Cite

CITATION STYLE

APA

Tchuitcheu, W. C., Lu, T., & Dooms, A. (2024). Table representation learning using heterogeneous graph embedding. Pattern Recognition, 156. https://doi.org/10.1016/j.patcog.2024.110734

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free