One hundred and three women were examined independently for presymptomatic breast disease by two nurses and two surgeons who recorded physical findings and their recommendations for further clinical workup. Agreement between the observers beyond what would have been expected by chance was assessed by a new extension of the statistic κ which allows multi-level scales of measurement, more than two observers (not necessarily the same for each subject), and comparisons between and within subsets of observers. Agreement between nurses and between the nurse-surgeon pairs was not significantly better than would have been expected by chance. Agreement within surgeon pairs was only slightly better than chance (overall κ for physical findings and recommendations being 0.42 and 0.32 respectively). Agreement between surgeons was generally better for physical findings than for recommendations and was best for the finding of fibrocystic disease. Future studies to compare the performance of nurses or other allied health professionals with surgeons should, therefore, be designed to allow assessment of the reliability of the standard group. © 1981.
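To illustrate what "agreement beyond what would have been expected by chance" means, the following is a minimal sketch of the standard two-observer κ, computed as (observed agreement minus chance agreement) divided by (1 minus chance agreement). It is not the extension described in the abstract, which handles more than two observers and subsets of observers. The function name `cohen_kappa` and the example ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Standard two-observer kappa for categorical ratings of the same subjects.

    Illustrative only: this is the basic statistic, not the multi-observer
    extension described in the abstract.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both observers must rate the same non-empty set of subjects")

    n = len(ratings_a)

    # Observed proportion of subjects on which the two observers agree.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each observer's marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings for four subjects (invented for illustration).
observer_1 = ["normal", "normal", "abnormal", "abnormal"]
observer_2 = ["normal", "abnormal", "abnormal", "abnormal"]
kappa = cohen_kappa(observer_1, observer_2)
print(round(kappa, 2))
```

In this invented example the observers agree on 3 of 4 subjects (0.75), chance agreement is 0.5, and κ is 0.5. A value of 0 corresponds to agreement no better than chance and 1 to perfect agreement.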
Thomas, D. C., Spitzer, W. O., & MacFarlane, J. K. (1981). Inter-observer error among surgeons and nurses in presymptomatic detection of breast disease. Journal of Chronic Diseases, 34(12), 617–626. https://doi.org/10.1016/0021-9681(81)90061-8