Abstract
This study investigates optimization strategies for real-time sign language recognition (SLR) using the MediaPipe framework. We introduce a multi-modal approach that combines four Long Short-Term Memory (LSTM) models, each processing skeletal coordinates extracted by MediaPipe. Evaluations on established sign language datasets show that the multi-modal approach substantially improves recognition accuracy while preserving real-time performance. In comparisons with existing MediaPipe-based models, our multi-modal approach consistently achieved better results. A notable property of the approach is its adaptability: the LSTM layers can be modified to suit a range of tasks and data types. Integrating the MediaPipe framework with real-time SLR markedly improves recognition precision, representing a meaningful advance in the field.
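To make the pipeline concrete, the sketch below shows how a sequence of skeletal keypoints (of the kind MediaPipe Hands produces, 21 landmarks per frame) could be summarized by a single LSTM. This is an illustrative, NumPy-only toy with random weights, not the authors' four-model architecture or MediaPipe's actual API; the dimensions and the hand-landmark layout are assumptions for the example.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gate pre-activations are stacked as [i, f, o, g]."""
    z = W @ x + U @ h + b
    H = h.size
    i = 1 / (1 + np.exp(-z[:H]))          # input gate
    f = 1 / (1 + np.exp(-z[H:2 * H]))     # forget gate
    o = 1 / (1 + np.exp(-z[2 * H:3 * H])) # output gate
    g = np.tanh(z[3 * H:])                # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

# Toy setup (assumed dimensions): a 30-frame sequence of 2D hand landmarks,
# 21 points per frame -> 42 features, mirroring the skeletal coordinates
# a MediaPipe hand-tracking stage would supply to each LSTM model.
rng = np.random.default_rng(0)
T, D, H = 30, 42, 16
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)

keypoints = rng.normal(size=(T, D))  # stand-in for extracted coordinates
h, c = np.zeros(H), np.zeros(H)
for x in keypoints:
    h, c = lstm_step(x, h, c, W, U, b)

print(h.shape)  # final hidden state summarizing the gesture sequence
```

In a multi-modal variant along the paper's lines, several such recurrent branches (e.g. one per landmark stream: hands, pose, face) would each produce a summary vector, and their outputs would be combined before classification.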
Citation
Thanh, N. P., Hoang, N. T., Nguyen, H. N. X., Binh, P. H. T., Hai, V. H. S., & Nhan, H. H. (2023). Exploring MediaPipe optimization strategies for real-time sign language recognition. CTU Journal of Innovation and Sustainable Development, 15(Special issue), 142–152. https://doi.org/10.22144/ctujoisd.2023.045