Evaluation of Deformable Convolution: An Investigation in Image and Video Classification

Abstract

Convolutional Neural Networks (CNNs) have difficulty modeling geometric transformations because of the locality of the convolution operation. Deformable convolution (DCON) is a mechanism that addresses this drawback and improves robustness. In this study, we clarify the best way to replace standard convolution with its deformable counterpart in a CNN model. To this end, we conducted several experiments with DCONs applied in the layers of a small four-layer CNN model and in the four layers of several ResNets with depths 18, 34, 50, and 101. The models were tested on balanced binary classification tasks with 2D and 3D data. If DCON is used in the first layers of the proposed model, the computational cost tends to increase and the model misclassifies more often than the standard CNN. However, if DCON is used in the final layers, the number of FLOPs decreases and classification accuracy improves by up to 20% relative to the base model. Moreover, the model gains robustness because it can adapt to the object of interest. The best kernel size for the DCON is three. With these results, we propose a guideline and contribute to understanding the impact of DCON on the robustness of CNNs.
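
To make the mechanism concrete, the following is a minimal pure-Python sketch of what a deformable convolution computes at one output position, assuming the commonly used formulation in which each kernel tap is shifted by a fractional offset and the input is read with bilinear interpolation. It is not the authors' implementation. The function names (`bilinear`, `deform_conv_at`) and the hand-supplied offsets are hypothetical; in a trained network the offsets would be predicted by an additional convolutional layer.

```python
import math


def bilinear(img, y, x):
    """Sample img (a list of rows) at fractional (y, x); pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Weight falls off linearly with distance to the neighbouring pixel.
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                total += weight * img[yy][xx]
    return total


def deform_conv_at(img, kernel, offsets, cy, cx):
    """Output of a k x k deformable convolution centred at (cy, cx).

    offsets[i][j] is the (dy, dx) shift added to kernel tap (i, j).
    Assumption: offsets are given by hand here instead of being learned.
    """
    k = len(kernel)
    r = k // 2
    out = 0.0
    for i in range(k):
        for j in range(k):
            dy, dx = offsets[i][j]
            out += kernel[i][j] * bilinear(img, cy + i - r + dy, cx + j - r + dx)
    return out


# Toy 5 x 5 input whose value grows by 1 per column and by 5 per row.
img = [[float(5 * row + col) for col in range(5)] for row in range(5)]
kernel = [[1 / 9] * 3 for _ in range(3)]          # 3 x 3 averaging kernel
zero = [[(0.0, 0.0)] * 3 for _ in range(3)]       # no deformation
shift = [[(0.0, 0.5)] * 3 for _ in range(3)]      # every tap moved half a pixel right

standard = deform_conv_at(img, kernel, zero, 2, 2)
deformed = deform_conv_at(img, kernel, shift, 2, 2)
```

With all offsets at zero the computation reduces to an ordinary 3 x 3 convolution, which is why a DCON layer can stand in for a standard layer of the same kernel size. Non-zero offsets move the sampling points, which is the property the abstract credits for adapting to the object of interest.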

Cite

Burgos Madrigal, A., Romero Bautista, V., Díaz Hernández, R., & Altamirano Robles, L. (2024). Evaluation of Deformable Convolution: An Investigation in Image and Video Classification. Mathematics, 12(16). https://doi.org/10.3390/math12162448