Recent open-domain TableQA models are typically implemented as retriever-reader pipelines. The retriever component is usually a variant of the Dense Passage Retriever, which computes the similarity between a question and a table from a single vector representation of each. These fixed vectors can be insufficient to capture the fine-grained features of potentially very large tables with heterogeneous row and column information. We address this limitation by (1) applying late interaction models, which enforce a finer-grained interaction between question and table embeddings at retrieval time. In addition, we (2) incorporate a joint training scheme for the retriever and reader with explicit table-level signals, and (3) embed a binary relevance token as a prefix to the answer generated by the reader, so that at inference time we can determine whether the table used to answer the question is reliable and filter accordingly. The combined strategies set a new state-of-the-art performance on two public open-domain TableQA datasets.
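To make the contrast between single-vector and late-interaction retrieval concrete, the following is a minimal sketch and not the authors' implementation. It assumes a ColBERT-style MaxSim score as the late-interaction function, uses hand-written toy embeddings instead of a trained encoder, and uses hypothetical relevance prefixes `<yes>` and `<no>` to illustrate answer filtering. All function names are illustrative.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_pool(vectors: List[Vector]) -> List[float]:
    """Collapse token embeddings into one fixed vector (single-vector setting)."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def single_vector_score(q_tokens: List[Vector], t_tokens: List[Vector]) -> float:
    """One vector per question and per table, compared with a dot product."""
    return dot(mean_pool(q_tokens), mean_pool(t_tokens))


def late_interaction_score(q_tokens: List[Vector], t_tokens: List[Vector]) -> float:
    """MaxSim (assumed here): each question token picks its best table token."""
    return sum(max(dot(q, t) for t in t_tokens) for q in q_tokens)


def filter_by_relevance(outputs: List[str], yes_token: str = "<yes>") -> List[str]:
    """Keep answers whose (hypothetical) prefix marks the table as relevant."""
    return [o[len(yes_token):].strip() for o in outputs if o.startswith(yes_token)]


# Toy embeddings: table_a matches each question token exactly,
# table_b only matches the question "on average".
question = [[1.0, 0.0], [0.0, 1.0]]
table_a = [[1.0, 0.0], [0.0, 1.0]]
table_b = [[0.5, 0.5], [0.5, 0.5]]

single_a = single_vector_score(question, table_a)
single_b = single_vector_score(question, table_b)
late_a = late_interaction_score(question, table_a)
late_b = late_interaction_score(question, table_b)

kept = filter_by_relevance(["<yes> 1998", "<no> Paris"])
```

In this toy case the pooled vectors of both tables are identical, so the single-vector scores tie at 0.5, while the token-level score separates them (2.0 for `table_a` against 1.0 for `table_b`). The filter keeps only the answer whose prefix marks its table as relevant.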
CITATION STYLE
Lin, W., Blloshmi, R., Byrne, B., de Gispert, A., & Iglesias, G. (2023). LI-RAGE: Late Interaction Retrieval Augmented Generation with Explicit Signals for Open-Domain Table Question Answering. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2, pp. 1557–1566). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-short.133