Development and Validation of a Deep Learning Algorithm for Gleason Grading of Prostate Cancer from Biopsy Specimens

Kunal Nagpal; Davis Foote; Fraser Tan; Yun Liu; Po Hsuan Cameron Chen; David F. Steiner; Naren Manoj; Niels Olson; Jenny L. Smith; Arash Mohtashamian; Brandon Peterson; Mahul B. Amin; Andrew J. Evans; Joan W. Sweet; Carol Cheung; Theodorus Van Der Kwast; Ankur R. Sangoi; Ming Zhou; Robert Allan; Peter A. Humphrey; Jason D. Hipp; Krishna Gadepalli; Greg S. Corrado; Lily H. Peng; Martin C. Stumpe; Craig H. Mermel

Journal Article (Open Access)

JAMA Oncology (2020) 6(9) 1372-1380

DOI: 10.1001/jamaoncol.2020.2485

Abstract

Importance: For prostate cancer, Gleason grading of the biopsy specimen plays a pivotal role in determining case management. However, Gleason grading is associated with substantial interobserver variability, resulting in a need for decision support tools to improve the reproducibility of Gleason grading in routine clinical practice.

Objective: To evaluate the ability of a deep learning system (DLS) to grade diagnostic prostate biopsy specimens.

Design, Setting, and Participants: The DLS was evaluated using 752 deidentified digitized images of formalin-fixed paraffin-embedded prostate needle core biopsy specimens obtained from 3 institutions in the United States, including 1 institution not used for DLS development. To obtain the Gleason grade group (GG), each specimen was first reviewed by 2 expert urologic subspecialists from a multi-institutional panel of 6 individuals (years of experience: mean, 25 years; range, 18-34 years). A third subspecialist reviewed discordant cases to arrive at a majority opinion. To reduce diagnostic uncertainty, all subspecialists had access to an immunohistochemical-stained section and 3 histologic sections for every biopsied specimen. Their review was conducted from December 2018 to June 2019.

Main Outcomes and Measures: The frequency of the exact agreement of the DLS with the majority opinion of the subspecialists in categorizing each tumor-containing specimen as 1 of 5 categories: nontumor, GG1, GG2, GG3, or GG4-5. For comparison, the rate of agreement of 19 general pathologists' opinions with the subspecialists' majority opinions was also evaluated.

Results: For grading tumor-containing biopsy specimens in the validation set (n = 498), the rate of agreement with subspecialists was significantly higher for the DLS (71.7%; 95% CI, 67.9%-75.3%) than for general pathologists (58.0%; 95% CI, 54.5%-61.4%) (P

