Obesity is a chronic disease with an increasing impact on the world’s population. In this work, we present a method to identify obesity using text mining techniques and information related to body weight measures and obesity comorbidities. We used a dataset of 2412 de-identified medical records that contain labels for two classification problems. The first problem distinguishes among obesity, overweight, normal weight, and underweight. The second problem concerns the obesity types within the obesity category and distinguishes among super obesity, morbid obesity, severe obesity, and moderate obesity. We represented the records with a Bag of Words approach, using unigram and bigram representations of the features. We used Support Vector Machine and Naïve Bayes classifiers with ten-fold cross-validation to evaluate and compare performance. In general, our results show that Support Vector Machine performs better than Naïve Bayes on both classification problems. We also observed that the bigram representation improves performance compared with the unigram representation.
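As an illustration of the Bag of Words representation with unigram and bigram features, the following minimal sketch counts n-grams in a single record using only the Python standard library. The tokenizer, the function names `tokenize` and `bag_of_ngrams`, and the sample record are hypothetical and are not taken from the paper. The abstract does not state whether bigrams were used alone or combined with unigrams; this sketch assumes they are combined.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs (an assumed, simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_ngrams(text, use_bigrams=True):
    """Return a Counter of unigram counts, optionally extended with bigram counts."""
    tokens = tokenize(text)
    features = Counter(tokens)
    if use_bigrams:
        # Adjacent token pairs, joined with a space, form the bigram features.
        features.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return features


# Hypothetical record text, invented for illustration only.
record = "Patient with morbid obesity and type 2 diabetes; body weight 142 kg."

unigrams = bag_of_ngrams(record, use_bigrams=False)
both = bag_of_ngrams(record)
```

With bigrams enabled, a phrase such as "morbid obesity" becomes a single feature, which is one plausible reason a bigram representation could help separate the obesity types. Feature vectors of this kind would then be passed to a classifier such as a Support Vector Machine or Naïve Bayes and evaluated with ten-fold cross-validation.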
CITATION STYLE
Figueroa, R. L., & Flores, C. A. (2015). Extracting information from electronic medical records to identify obesity status of a patient based on comorbidities and bodyweight measures. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9456, pp. 37–46). Springer Verlag. https://doi.org/10.1007/978-3-319-26508-7_4