Selective Information Extraction Strategies for Cancer Pathology Reports with Convolutional Neural Networks

  • Yoon H
  • Qiu J
  • Christian J
  • et al.

Abstract

To trust a model's predictions, new data scored by the model must come from the same population as the data used for training. If the model scores data that differ from its training data, neither the predictions nor the model's performance metrics can be trusted. Identifying and excluding such anomalous data points is therefore an important task when using models in the real world. Traditional machine learning algorithms and classifiers cannot abstain in this case. Here we propose a data-novelty detection algorithm for the Convolutional Neural Network classifier that yields a rejection score for each new data point scored. It is a post-modeling procedure that examines the distribution of convolution filters to determine whether the prediction should be trusted. We apply this algorithm to an information extraction model for a natural language text corpus, and we evaluate its performance using a primary cancer site classification model applied to cancer pathology reports. Results demonstrate that the algorithm is an effective way to exclude cancer pathology reports from model scoring when they do not contain the information needed to accurately classify the primary cancer type.
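The abstract does not state which statistic the rejection score is built on, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes each document has already been reduced to a vector of max-pooled convolution-filter activations, fits a per-filter mean and standard deviation on training data, and scores a new document by its mean absolute z-score. The function names, the 95% quantile threshold, and the synthetic activations are all hypothetical.

```python
import numpy as np

def fit_activation_stats(train_acts):
    """Per-filter mean and std of training activations, shape (n_docs, n_filters)."""
    mean = train_acts.mean(axis=0)
    std = train_acts.std(axis=0) + 1e-8  # avoid division by zero
    return mean, std

def rejection_score(acts, mean, std):
    """Assumed score: mean absolute z-score across filters (higher = more novel)."""
    z = np.abs((acts - mean) / std)
    return z.mean(axis=-1)

def fit_threshold(train_acts, mean, std, quantile=0.95):
    """Assumed rule: abstain above the given quantile of training scores."""
    return float(np.quantile(rejection_score(train_acts, mean, std), quantile))

# Synthetic stand-ins for filter activations (not real pathology-report data).
rng = np.random.default_rng(0)
train = rng.normal(1.0, 0.2, size=(500, 64))   # training population
in_dist = rng.normal(1.0, 0.2, size=(5, 64))   # same population
novel = rng.normal(0.2, 0.2, size=(5, 64))     # shifted population

mean, std = fit_activation_stats(train)
threshold = fit_threshold(train, mean, std)
in_scores = rejection_score(in_dist, mean, std)
novel_scores = rejection_score(novel, mean, std)
abstain = novel_scores > threshold  # True where the classifier should not be trusted
```

In this toy setup the shifted documents receive much higher scores than documents drawn from the training population, so a classifier wrapped this way would abstain on them instead of emitting a prediction. A real system would take the activations from the trained network's convolution layer and would need the threshold tuned on held-out data.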

Cite

Yoon, H.-J., Qiu, J. X., Christian, J. B., Hinkle, J., Alamudun, F., & Tourassi, G. (2020). Selective Information Extraction Strategies for Cancer Pathology Reports with Convolutional Neural Networks (pp. 89–98). https://doi.org/10.1007/978-3-030-16841-4_9
