GEM: Gestalt Enhanced Markup Language Model for Web Understanding via Render Tree

Citations: 2 | Mendeley readers: 9

Abstract

The web's inexhaustible content carries abundant perceptible information beyond text. Unfortunately, most prior efforts on pre-trained Language Models (LMs) ignore this richness, and the few that do not employ only plain HTML, excluding crucial information in the rendered web page such as visual appearance, layout, and style. Intuitively, this perceptible web information can provide essential signals for content understanding tasks. This study presents the Gestalt Enhanced Markup (GEM) Language Model, inspired by Gestalt psychological theory, which incorporates heterogeneous visual information from the render tree into the language model without requiring additional visual input. Comprehensive experiments on multiple downstream tasks, i.e., web question answering and web information extraction, validate GEM's superiority.
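To make concrete the kind of render-tree information the abstract refers to (text paired with layout and style attributes from the rendered page), below is a minimal sketch assuming a Playwright-rendered page. It is illustrative only, not the authors' pipeline: the URL, the choice of features, and the function name are assumptions.

```python
# Minimal sketch (not the GEM authors' code): collect each visible text span
# together with layout (bounding box) and style (font, color) features from
# the rendered page, i.e., information available in the render tree but not
# in the raw HTML string.
from playwright.sync_api import sync_playwright

EXTRACT_JS = """
() => {
  const records = [];
  const walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
  let node;
  while ((node = walker.nextNode())) {
    const text = node.textContent.trim();
    if (!text) continue;
    const el = node.parentElement;
    const style = window.getComputedStyle(el);
    const box = el.getBoundingClientRect();
    records.push({
      text,                                          // token source
      tag: el.tagName.toLowerCase(),                 // markup context
      bbox: [box.x, box.y, box.width, box.height],   // layout features
      fontSize: style.fontSize,                      // style features
      fontWeight: style.fontWeight,
      color: style.color,
    });
  }
  return records;
}
"""

def extract_render_tree_features(url: str):
    """Return text spans paired with layout/style features from the rendered page."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        records = page.evaluate(EXTRACT_JS)
        browser.close()
    return records

if __name__ == "__main__":
    # Example usage: inspect the first few text/style/layout records of a page.
    for rec in extract_render_tree_features("https://example.com")[:5]:
        print(rec)
```

Features like these could then be serialized alongside the token sequence; how GEM actually encodes them is described in the paper itself.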

Citation (APA)

Shao, Z., Gao, F., Qi, Z., Xing, H., Bu, J., Yu, Z., … Liu, X. (2023). GEM: Gestalt Enhanced Markup Language Model for Web Understanding via Render Tree. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 6132–6145). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.375
