Abstract
Convolutional neural network (CNN) models achieve state-of-the-art accuracy on image classification, localization, and segmentation tasks. A fully convolutional topology, such as U-Net, may be trained on images of one size and perform inference on images of another size. This feature allows researchers to work with images too large to fit into memory by dividing the image into small tiles, making predictions on these tiles, and stitching the tile predictions back together as the prediction for the whole image. We compare the tiled prediction of a U-Net model with a prediction based on the whole image. Our results show that tiled inference significantly increases both false positive and false negative predictions relative to whole-image inference. We are able to modestly improve the predictions by increasing both the tile size and the amount of tile overlap, but this comes at a greater computational cost and still produces results inferior to using the whole image. Although tiling has been used to produce acceptable segmentation results in the past, we recommend performing inference on the whole image to achieve the best results and advance the state-of-the-art accuracy of CNNs.
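The tile, predict, and stitch procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `tile_starts`, `tiled_predict`, and `predict_fn` are hypothetical, `predict_fn` stands in for a trained fully convolutional model that returns a per-pixel map with the same height and width as its input, and averaging the predictions where tiles overlap is an assumed stitching rule that the abstract does not specify.

```python
import numpy as np


def tile_starts(length, tile, stride):
    """Start offsets whose windows of size `tile` cover [0, length)."""
    if tile >= length:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        starts.append(length - tile)  # last tile sits flush with the border
    return starts


def tiled_predict(image, predict_fn, tile=128, overlap=0):
    """Predict tile by tile and average predictions where tiles overlap.

    `predict_fn` is a hypothetical stand-in for a model: it takes an
    (h, w, ...) patch and returns an (h, w) prediction map.
    Averaging in overlap regions is an assumption of this sketch.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    height, width = image.shape[:2]
    stride = tile - overlap
    total = np.zeros((height, width), dtype=np.float64)
    count = np.zeros((height, width), dtype=np.float64)
    for y in tile_starts(height, tile, stride):
        for x in tile_starts(width, tile, stride):
            patch = image[y:y + tile, x:x + tile]
            total[y:y + tile, x:x + tile] += predict_fn(patch)
            count[y:y + tile, x:x + tile] += 1.0
    return total / count  # every pixel is covered at least once


def false_pos_neg(tiled_mask, whole_mask):
    """Count pixels where the tiled mask disagrees with the whole-image mask."""
    false_pos = int(np.sum(tiled_mask & ~whole_mask))
    false_neg = int(np.sum(~tiled_mask & whole_mask))
    return false_pos, false_neg
```

With a real model, one would threshold both `tiled_predict(image, model_fn, ...)` and `model_fn(image)` into boolean masks and pass them to `false_pos_neg` to see how the disagreement changes as `tile` and `overlap` grow. Note that a larger overlap shrinks the stride and so increases the number of tiles, which is the extra computational cost the abstract mentions.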
Cite
Reina, G. A., & Panchumarthy, R. (2019). Adverse effects of image tiling on convolutional neural networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11383 LNCS, pp. 25–36). Springer Verlag. https://doi.org/10.1007/978-3-030-11723-8_3