Sign language is a complex form of communication, used mostly by deaf people, in which the hands, limbs, head and facial expressions convey meaning. Finger spelling is a system in which each letter of the alphabet is represented by a unique and discrete movement of the hand. In this paper, we study the properties of the spatial pyramid matching descriptor for finger spelling recognition. This method is a simple extension of the orderless bag-of-features image representation: local features are mapped to multi-resolution histograms, and images are compared by a weighted histogram intersection. The performance of the approach is evaluated on a dataset of real images of American Sign Language (ASL) finger spelling. We conduct experiments under three evaluation protocols. The first uses 10% of the data for training and the remainder for testing, and achieves an accuracy of 92.50%. The second uses 50% of the data for training, and achieves an accuracy of about 97.1%. In the third, we perform 5-fold cross-validation and achieve an accuracy of 97.9%. Our method achieves the best results in all three protocols when compared to state-of-the-art approaches. In all experiments, we also evaluate the influence of the weights of the multi-resolution histograms and find that they do not have a significant influence on the results.
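The weighted histogram intersection over multi-resolution histograms described in the abstract can be illustrated with a minimal sketch. The abstract does not state the weights, the vocabulary size, or the local features used, so everything here is an assumption: the function names `pyramid_histogram` and `spm_kernel` are hypothetical, the level weights follow the commonly used spatial pyramid matching scheme (coarse levels weighted less than fine ones), and the features are assumed to be already quantized into visual-word indices with positions normalized to [0, 1].

```python
import numpy as np

def pyramid_histogram(xy, words, vocab_size, levels=2):
    """Concatenate weighted per-cell visual-word histograms for levels 0..levels.

    xy:    (N, 2) feature positions, normalized to [0, 1] (assumption).
    words: (N,) visual-word index of each feature, in [0, vocab_size).
    """
    xy = np.asarray(xy, dtype=float)
    words = np.asarray(words, dtype=int)
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level  # the grid is cells x cells at this level
        # Assumed weighting: 1/2^L for level 0, 1/2^(L-l+1) for level l >= 1.
        if level == 0:
            weight = 1.0 / 2 ** levels
        else:
            weight = 1.0 / 2 ** (levels - level + 1)
        # Grid cell of each feature; clamp so a coordinate of 1.0 stays in range.
        cx = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        # One histogram bin per (cell, visual word) pair.
        flat = (cy * cells + cx) * vocab_size + words
        hist = np.bincount(flat, minlength=cells * cells * vocab_size)
        parts.append(weight * hist)
    return np.concatenate(parts)

def spm_kernel(h1, h2):
    """Weighted histogram intersection of two pyramid histograms."""
    return float(np.minimum(h1, h2).sum())

# Two toy "images" with one feature each, same visual word, nearby positions.
h_a = pyramid_histogram([[0.1, 0.1]], [0], vocab_size=4)
h_b = pyramid_histogram([[0.4, 0.4]], [0], vocab_size=4)
similarity = spm_kernel(h_a, h_b)
```

In this toy case the two features share a cell at levels 0 and 1 but fall into different cells at level 2, so only the two coarser levels contribute to the similarity. Because the weights are folded into the histograms, the resulting kernel can be passed to any classifier that accepts a precomputed kernel; which classifier the paper uses is not stated in the abstract.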
CITATION STYLE
Silva, S., Schwartz, W. R., & Cámara-Chávez, G. (2014). Spatial pyramid matching for finger spelling recognition in intensity images. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8827, pp. 629–636). Springer Verlag. https://doi.org/10.1007/978-3-319-12568-8_77