Identifying and clustering relevant terms in clinical records using unsupervised methods

Borbála Siklósi; Attila Novák

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8791 233-243

DOI: 10.1007/978-3-319-11397-5_18

Abstract

The automatic processing of documents created in clinical settings has become a focus of research in natural language processing. However, standard tools developed for general texts are either not applicable to this type of document or perform poorly on it. Moreover, several crucial tasks require lexical resources and relational thesauri or ontologies to identify relevant concepts and their connections. For less-resourced languages such as Hungarian, no such lexicons are available, and constructing and organizing annotated data requires work by human experts. In this paper we show how applying statistical methods can produce a preprocessed, semi-structured transformation of the raw documents that can be used to aid this human work. The modules detect and resolve abbreviations, identify multiword terms, and derive the similarity of these terms, all based on the corpus itself.

Siklósi, B., & Novák, A. (2014). Identifying and clustering relevant terms in clinical records using unsupervised methods. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8791, 233–243. https://doi.org/10.1007/978-3-319-11397-5_18
