Pathology reports record information about a patient's cells and tissues, and they are crucial for monitoring the health of individuals and population groups. In this work we evaluate supervised text classification models for predicting relevant categories in pathology reports. Our aim is to integrate automatic classifiers into the current workflow of medical experts, and to that end we implement and evaluate different machine learning approaches over a large number of categories. Our results show that nominal categories can be predicted with a high average F-score (81.3%), and that Naive Bayes combined with feature selection improves over the majority class baseline. We also find that numeric categories are harder to classify, and that deeper analysis would be required to predict these labels.
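As a rough illustration of the comparison described above, the following sketch contrasts a majority class baseline with a multinomial Naive Bayes classifier restricted to a selected vocabulary. It is a minimal sketch under stated assumptions, not the pipeline used in this work: the toy reports and labels are invented, the feature selection criterion (spread of per-class document frequency) is a simple stand-in because the actual criterion is not stated here, and the macro-averaged F-score is only one possible reading of "average F-score".

```python
from collections import Counter, defaultdict
import math


def majority_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda tokens: majority


def select_features(docs, labels, k):
    """Keep the k tokens whose per-class document frequency differs most.

    Assumption: a toy criterion standing in for the unspecified
    feature selection method.
    """
    class_sizes = Counter(labels)
    doc_freq = defaultdict(Counter)  # token -> class -> number of docs containing it
    for tokens, label in zip(docs, labels):
        for t in set(tokens):
            doc_freq[t][label] += 1

    def spread(token):
        rates = [doc_freq[token][c] / class_sizes[c] for c in class_sizes]
        return max(rates) - min(rates)

    # Highest spread first; ties are broken alphabetically for determinism.
    ranked = sorted(doc_freq, key=lambda t: (-spread(t), t))
    return set(ranked[:k])


def train_naive_bayes(docs, labels, vocab):
    """Multinomial Naive Bayes with add-one smoothing, restricted to vocab."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(t for t in tokens if t in vocab)
    n_docs = len(labels)

    def predict(tokens):
        best_label, best_score = None, -math.inf
        for label in sorted(class_counts):
            total = sum(word_counts[label].values())
            score = math.log(class_counts[label] / n_docs)  # log prior
            for t in tokens:
                if t in vocab:  # tokens outside the selected vocabulary are ignored
                    score += math.log(
                        (word_counts[label][t] + 1) / (total + len(vocab))
                    )
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    return predict


def macro_f1(gold, pred):
    """Unweighted mean of per-class F-scores (assumed averaging scheme)."""
    scores = []
    for label in sorted(set(gold)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(scores) / len(scores)


# Invented toy data: tokenised reports with one nominal category each.
train_docs = [
    ["tumor", "malignant", "invasive"],
    ["malignant", "carcinoma", "invasive"],
    ["carcinoma", "malignant", "margin"],
    ["benign", "tissue", "normal"],
    ["normal", "tissue", "margin"],
]
train_labels = ["malignant", "malignant", "malignant", "benign", "benign"]
test_docs = [
    ["invasive", "carcinoma"],
    ["normal", "tissue"],
    ["malignant", "tumor"],
    ["benign", "margin"],
]
test_labels = ["malignant", "benign", "malignant", "benign"]

baseline = majority_baseline(train_labels)
vocab = select_features(train_docs, train_labels, k=4)
nb = train_naive_bayes(train_docs, train_labels, vocab)

base_f1 = macro_f1(test_labels, [baseline(d) for d in test_docs])
nb_f1 = macro_f1(test_labels, [nb(d) for d in test_docs])
print(f"majority baseline macro F1: {base_f1:.3f}")
print(f"Naive Bayes + selection macro F1: {nb_f1:.3f}")
```

On this toy data the majority baseline never predicts the minority class, so its macro-averaged F-score is penalised, which is one reason a per-class average is a stricter yardstick than accuracy when categories are imbalanced.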