In this article, we compare the performance of a state-of-the-art segmentation network (UNet) on two glioblastoma (GB) segmentation datasets. Our experiments show that the same training procedure yields results almost twice as bad, in terms of Dice score, on retrospective clinical data as on the BraTS challenge data. We discuss possible reasons for this outcome, including inter-rater variability and the high variability of magnetic resonance imaging (MRI) scanners and scanner settings. High segmentation performance demonstrated on preselected imaging data does not bring the community closer to using these algorithms in clinical settings. We believe that building a clinically applicable deep learning model requires a shift from unified datasets to heterogeneous data. © 2021 European Federation for Medical Informatics (EFMI) and IOS Press.
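For readers unfamiliar with the metric used in the comparison above, the Dice score between a predicted mask and a reference mask is twice the size of their overlap divided by the sum of their sizes. The following is a minimal sketch, not the authors' evaluation code: the function name `dice_score`, the flat binary-list input, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice score between two flattened binary masks (hypothetical helper).

    Any non-zero entry is treated as a foreground voxel.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels in each mask, summed.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Example: the prediction covers the single reference voxel plus one extra voxel.
example = dice_score([1, 1, 0, 0], [1, 0, 0, 0])  # 2 * 1 / (2 + 1)
```

A score of 1.0 means the masks coincide and 0.0 means they do not overlap at all, so a lower Dice score on clinical data indicates poorer agreement with the reference contours.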
CITATION STYLE
Kurmukov, A., Dalechina, A., Saparov, T., Belyaev, M., Zolotova, S., Golanov, A., & Nikolaeva, A. (2021). Challenges in building of deep learning models for glioblastoma segmentation: Evidence from clinical data. In Public Health and Informatics: Proceedings of MIE 2021 (pp. 298–302). IOS Press. https://doi.org/10.3233/SHTI210168