Learning Contextualized Document Representations for Healthcare Answer Retrieval

Abstract

We present Contextual Discourse Vectors (CDV), a distributed document representation for efficient answer retrieval from long healthcare documents. Our approach is based on structured query tuples of entities and aspects drawn from free text and medical taxonomies. Our model leverages a dual encoder architecture with hierarchical LSTM layers and multi-task training to encode the position of clinical entities and aspects alongside the document discourse. We use our continuous representations to resolve queries with low latency using approximate nearest neighbor search at the sentence level. We apply the CDV model to retrieve coherent answer passages from nine English public health resources on the Web, addressing both patients and medical professionals. Because no end-to-end training data is available for all application scenarios, we train our model with self-supervised data from Wikipedia. We show that our generalized model significantly outperforms several state-of-the-art baselines for healthcare passage ranking and is able to adapt to heterogeneous domains without additional fine-tuning.
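
The sentence-level lookup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the query tuple and each sentence have already been encoded into vectors of the same dimension, and it uses exact brute-force cosine similarity in place of the approximate nearest neighbor index the paper refers to. The names `cosine`, `rank_sentences`, and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_sentences(query_vec, sentence_vecs, k=3):
    """Return the k (sentence_index, score) pairs most similar to the query.

    Exact search over all sentences; an approximate nearest neighbor
    index would replace this loop to keep latency low on large corpora.
    """
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(sentence_vecs)]
    # Highest score first; break ties by original sentence order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(idx, score) for score, idx in scored[:k]]


# Toy data (assumption): four sentence vectors and one encoded query.
sentence_vecs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [0.2, 1.0, 0.0]

top = rank_sentences(query_vec, sentence_vecs, k=2)
top_indices = [idx for idx, _ in top]
```

In this toy setup, `top_indices` holds the positions of the two sentences closest to the query; a passage could then be formed from the sentences surrounding those positions.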

Cite

Arnold, S., Van Aken, B., Grundmann, P., Gers, F. A., & Löser, A. (2020). Learning Contextualized Document Representations for Healthcare Answer Retrieval. In The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020 (pp. 1332–1343). Association for Computing Machinery, Inc. https://doi.org/10.1145/3366423.3380208
