Assessing agreement between multiple raters with missing rating information, applied to breast cancer tumour grading

Citations: 14 · Mendeley readers: 63

Abstract

Background: We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only 'moderate' agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting of 24,177 grades, on a discrete 1-3 scale, provided by 732 pathologists for 52 samples.

Methodology/principal findings: We review existing methods for analysing inter-rater agreement for multiple raters and demonstrate two further methods. Firstly, we examine a simple non-chance-corrected agreement score based on the observed proportion of agreements with the consensus for each sample, which makes no allowance for missing data. Secondly, treating grades as lying on a continuous scale representing tumour severity, we use a Bayesian latent trait method to model cumulative probabilities of assigning grade values as functions of the severity and clarity of the tumour and of rater-specific parameters representing boundaries between grades 1-2 and 2-3. We simulate from the fitted model to estimate, for each rater, the probability of agreement with the majority. Both methods suggest that there are differences between raters in terms of rating behaviour, most often caused by consistent over- or under-estimation of the grade boundaries, and also considerable variability in the distribution of grades assigned to many individual samples. The Bayesian model addresses the tendency of the agreement score to be biased upwards for raters who, by chance, see a relatively 'easy' set of samples.

Conclusions/significance: Latent trait models can be adapted to provide novel information about the nature of inter-rater agreement when the number of raters is large and there are missing data. In this large study there is substantial variability between pathologists and uncertainty in the identity of the 'true' grade of many of the breast cancer tumours, a fact often ignored in clinical studies.
© 2008 Fanshawe et al.
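The first method described in the abstract, a non-chance-corrected agreement score, can be sketched as follows. This is a hypothetical illustration, not the authors' code: it assumes grades are stored in a raters-by-samples array with missing ratings encoded as NaN, takes the consensus to be the modal grade among raters who graded each sample, and scores each rater by the proportion of their own gradings that match that consensus.

```python
import numpy as np

def agreement_scores(grades):
    """Per-rater agreement with the per-sample consensus (modal) grade.

    grades: (n_raters, n_samples) float array of grades on a 1-3 scale,
            with np.nan marking ratings that are missing.
    Returns (consensus, scores): the modal grade for each sample and the
    proportion of agreements with the consensus for each rater, computed
    over only the samples that rater actually graded.
    """
    n_raters, n_samples = grades.shape

    # Consensus grade per sample: the most frequent grade among the
    # raters who provided one (ties resolved by np.argmax, i.e. the
    # lowest tied grade -- one of several possible tie-break rules).
    consensus = np.empty(n_samples)
    for j in range(n_samples):
        col = grades[:, j]
        col = col[~np.isnan(col)]
        vals, counts = np.unique(col, return_counts=True)
        consensus[j] = vals[np.argmax(counts)]

    # Agreement score per rater, over that rater's non-missing ratings.
    scores = np.empty(n_raters)
    for i in range(n_raters):
        rated = ~np.isnan(grades[i])
        scores[i] = np.mean(grades[i, rated] == consensus[rated])

    return consensus, scores
```

As the abstract notes, a score computed this way makes no allowance for missingness: a rater who happens to see only 'easy' samples will score well by chance, which is the bias the Bayesian latent trait model is designed to address.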

Citation (APA)

Fanshawe, T. R., Lynch, A. G., Ellis, I. O., Green, A. R., & Hanka, R. (2008). Assessing agreement between multiple raters with missing rating information, applied to breast cancer tumour grading. PLoS ONE, 3(8). https://doi.org/10.1371/journal.pone.0002925
