On the Relevance of Temporal Features for Medical Ultrasound Video Recognition

Abstract

Many medical ultrasound video recognition tasks involve identifying key anatomical features regardless of when they appear in the video, suggesting that modeling such tasks may not benefit from temporal features. Correspondingly, model architectures that exclude temporal features may have better sample efficiency. We propose a novel multi-head attention architecture that incorporates these hypotheses as inductive priors to achieve better sample efficiency on common ultrasound tasks. We compare the performance of our architecture to an efficient 3D CNN video recognition model in two settings: one where we expect not to require temporal features and one where we do. In the former setting, our model outperforms the 3D CNN, especially when we artificially limit the training data. In the latter, the outcome reverses. These results suggest that expressive time-independent models may be more effective than state-of-the-art video recognition models for some common ultrasound tasks in the low-data regime. Code is available at https://github.com/MedAI-Clemson/pda_detection.
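
To make the idea of a time-independent attention model concrete, the following is a minimal sketch of multi-head attention pooling over per-frame feature vectors. It is not the authors' architecture (see the linked repository for that); the function names `softmax` and `attention_pool`, the random frame features, and the fixed query vectors standing in for learned parameters are all assumptions made for illustration. Because no positional information enters the computation, shuffling the frames should leave the pooled video-level vector unchanged up to floating-point error.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, head_queries):
    """Pool per-frame feature vectors into one video-level vector.

    frames: list of T feature vectors (each a list of D floats).
    head_queries: list of H query vectors (hypothetical stand-ins for
        learned parameters), each a list of D floats.
    Returns H * D floats: one attention-weighted average per head,
    concatenated. Frame order is never used.
    """
    dim = len(frames[0])
    pooled = []
    for query in head_queries:
        # One relevance score per frame: dot product with this head's query.
        scores = [sum(q * x for q, x in zip(query, frame)) for frame in frames]
        weights = softmax(scores)
        # Attention-weighted average of the frame features.
        head_out = [
            sum(w * frame[d] for w, frame in zip(weights, frames))
            for d in range(dim)
        ]
        pooled.extend(head_out)
    return pooled


# Illustrative data only: random numbers in place of real frame embeddings.
rng = random.Random(0)
T, D, H = 6, 4, 2
frames = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(T)]
head_queries = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(H)]

original = attention_pool(frames, head_queries)

# Reorder the frames and pool again; the result should match.
shuffled_frames = frames[:]
rng.shuffle(shuffled_frames)
shuffled = attention_pool(shuffled_frames, head_queries)

max_diff = max(abs(a - b) for a, b in zip(original, shuffled))
print("pooled length:", len(original))
print("order-invariant:", max_diff < 1e-9)
```

A 3D CNN, by contrast, convolves across the time axis and is therefore sensitive to frame order, which is the kind of temporal feature the abstract argues some ultrasound tasks may not need.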

Cite

Smith, D. H., Lineberger, J. P., & Baker, G. H. (2023). On the Relevance of Temporal Features for Medical Ultrasound Video Recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14221 LNCS, pp. 744–753). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-43895-0_70
