Rows from Many Sources: Enriching row completions from Wikidata with a pre-trained Language Model

Carina Negreanu; Alperen Karaoglu; Jack Williams; Shuang Chen; Daniel Fabian; Andrew Gordon; Chin Yew Lin

Conference Proceedings, Open Access

WWW 2022 - Companion Proceedings of the Web Conference 2022 (2022) 1272-1280

DOI: 10.1145/3487553.3524923

Abstract

Row completion is the task of augmenting a given table of text and numbers with additional, relevant rows. The task divides into two steps: subject suggestion, the task of populating the main column; and gap filling, the task of populating the remaining columns. We present state-of-the-art results for subject suggestion and gap filling measured on a standard benchmark (WikiTables). Our idea is to solve this task by harmoniously combining knowledge base table interpretation and free text generation. We interpret the table using the knowledge base to suggest new rows and generate metadata like headers through property linking. To improve candidate diversity, we synthesize additional rows using free text generation via GPT-3, and crucially, we exploit the metadata we interpret to produce better prompts for text generation. Finally, we verify that the additional synthesized content can be linked to the knowledge base or a trusted web source such as Wikipedia.
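As an illustration of the prompt-construction step described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that headers have already been obtained (for example through property linking), serializes the table into a text prompt that a language model could continue with new rows, and then parses generated text back into candidate rows. The function names `build_row_prompt` and `parse_generated_rows`, the `|` separator, and the sample table are all hypothetical; no model is called, and the "generated" text is a hand-written stand-in.

```python
from typing import List


def build_row_prompt(caption: str, headers: List[str], rows: List[List[str]]) -> str:
    """Serialize a table into a text prompt that ends where a new row would begin.

    Hypothetical format: a caption line, a header line, then one line per row,
    with cells separated by " | ".
    """
    lines = [f"Table: {caption}", " | ".join(headers)]
    lines += [" | ".join(row) for row in rows]
    # The trailing newline leaves the prompt positioned at the start of the next row.
    return "\n".join(lines) + "\n"


def parse_generated_rows(text: str, n_columns: int) -> List[List[str]]:
    """Split generated text into rows, keeping only rows with the expected width."""
    parsed = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Discard lines with the wrong number of cells or with empty cells.
        if len(cells) == n_columns and all(cells):
            parsed.append(cells)
    return parsed


# Hypothetical seed table; the headers stand in for interpreted metadata.
prompt = build_row_prompt(
    "Countries and capitals",
    ["country", "capital"],
    [["France", "Paris"], ["Japan", "Tokyo"]],
)

# Hand-written stand-in for model output, including one malformed line.
candidates = parse_generated_rows("Italy | Rome\nmalformed line\nPeru | Lima", 2)
```

In a pipeline of the kind the abstract outlines, each candidate row would still need to be verified against the knowledge base or a trusted web source before being suggested; that verification step is not shown here.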

Citation (APA)

Negreanu, C., Karaoglu, A., Williams, J., Chen, S., Fabian, D., Gordon, A., & Lin, C. Y. (2022). Rows from Many Sources: Enriching row completions from Wikidata with a pre-trained Language Model. In WWW 2022 - Companion Proceedings of the Web Conference 2022 (pp. 1272–1280). Association for Computing Machinery, Inc. https://doi.org/10.1145/3487553.3524923
