Robot-assisted minimally invasive surgery is increasingly used in clinical practice. Force information offers the potential to provide haptic feedback in surgical systems. Forces can be estimated in a vision-based way by using deep learning models to capture the tissue deformation observed in 2D image sequences. Variations in tissue appearance and mechanical properties are likely to affect how well force estimation methods generalize. In this work, we study the generalization capabilities of spatial and spatio-temporal deep learning methods across different tissue samples. We acquire several datasets using a clinical laparoscope and evaluate both purely spatial and spatio-temporal deep learning models. The results show that generalization across different tissues is challenging. Nevertheless, we demonstrate that using spatio-temporal data instead of individual frames is valuable for force estimation. In particular, processing spatial and temporal data separately with a combined ResNet and GRU architecture shows promising results, with a mean absolute error of 15.450 mN compared to 19.744 mN for a purely spatial CNN.
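The reported metric, mean absolute error (MAE), is the average absolute difference between predicted and measured forces. The following is a minimal sketch of that computation; the function name `mean_absolute_error` and the force values are assumptions invented for illustration and are not taken from the paper's data or code.

```python
def mean_absolute_error(predicted_mn, measured_mn):
    """Average absolute difference between predicted and measured forces (mN)."""
    if len(predicted_mn) != len(measured_mn):
        raise ValueError("sequences must have the same length")
    if not predicted_mn:
        raise ValueError("sequences must not be empty")
    total = sum(abs(p - m) for p, m in zip(predicted_mn, measured_mn))
    return total / len(predicted_mn)


# Hypothetical force values in mN, invented for illustration only.
measured = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 220.0, 230.0]

mae = mean_absolute_error(predicted, measured)
print(f"MAE: {mae:.1f} mN")  # prints "MAE: 15.0 mN"
```

Because the MAE is expressed in the same unit as the forces, the two values in the abstract can be compared directly as average errors in mN.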
Behrendt, F., Gessert, N., & Schlaefer, A. (2020). Generalization of spatio-temporal deep learning for vision-based force estimation. In Current Directions in Biomedical Engineering (Vol. 6). Walter de Gruyter GmbH. https://doi.org/10.1515/cdbme-2020-0024