Background: Consumer health vocabularies (CHVs) have been developed to aid consumer health informatics applications. This purpose is best served if the vocabulary evolves with consumers' language. Objective: Our objective was to create a computer assisted update (CAU) system that works with live corpora to identify new candidate terms for inclusion in the open access and collaborative (OAC) CHV. Methods: The CAU system consisted of three main parts: a Web crawler and an HTML parser, a candidate term filter that utilizes natural language processing tools including term recognition methods, and a human review interface. In evaluation, the CAU system was applied to the health-related social network website PatientsLikeMe.com. The system's utility was assessed by comparing the candidate term list it generated to a list of valid terms hand extracted from the text of the crawled webpages. Results: The CAU system identified 88,994 unique terms 1- to 7-grams ("n-grams" are n consecutive words within a sentence) in 300 crawled PatientsLikeMe.com webpages. The manual review of the crawled webpages identified 651 valid terms not yet included in the OAC CHV or the Unified Medical Language System (UMLS) Metathesaurus, a collection of vocabularies amalgamated to form an ontology of medical terms, (ie, 1 valid term per 136.7 candidate n-grams). The term filter selected 774 candidate terms, of which 237 were valid terms, that is, 1 valid term among every 3 or 4 candidates reviewed. Conclusion: The CAU system is effective for generating a list of candidate terms for human review during CHV development.
CITATION STYLE
Doing-Harris, K. M., & Zeng-Treitler, Q. (2011). Computer-assisted update of a consumer health vocabulary through mining of social network data. Journal of Medical Internet Research, 13(2). https://doi.org/10.2196/jmir.1636
Mendeley helps you to discover research relevant for your work.