Surgical tool presence detection is one of the key problems in automatic surgical video content analysis. Solving this problem benefits many applications such as the evaluation of surgical instrument usage and automatic surgical report generation. Given the fact that each video is only sparsely labeled at the frame level, meaning that only a small portion of video frames will be properly labeled, existing approaches only model this problem as an image (frame) classification problem without considering temporal information in surgical videos. In this paper, we propose a deep neural network model utilizing both spatial and temporal information from surgical videos for surgical tool presence detection. The proposed model uses Graph Convolutional Networks (GCNs) along the temporal dimension to learn better features by considering the relationship between continuous video frames. To the best of our knowledge, this is the first work taking videos as input to solve the surgical tool presence detection problem. Our experiments demonstrate the employment of temporal information offers a significant improvement to this problem, and the proposed approach achieves better performance than all state-of-the-art methods.
CITATION STYLE
Wang, S., Xu, Z., Yan, C., & Huang, J. (2019). Graph Convolutional Nets for Tool Presence Detection in Surgical Videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11492 LNCS, pp. 467–478). Springer Verlag. https://doi.org/10.1007/978-3-030-20351-1_36
Mendeley helps you to discover research relevant for your work.