Abstract
A system that segments and labels tabla strokes from real performances is described. Performance is evaluated on a large database taken from three performers under different recording conditions, containing a total of 16,834 strokes. The current work extends previous work by Gillet and Richard (2003) on categorizing tabla strokes, by using a larger, more diverse database that includes their data as a benchmark, and by testing neural networks and treebased classification methods. First, the time-domain signal was segmented using complex-domain thresholding that looked for sudden changes in amplitude and phase discontinuities. At the optimal point on the ROC curve, false positives were less than 1% and false negatives were less than 2%. Then, classification was performed using a multivariate Gaussian model (mv gauss) as well as non-parametric techniques such as probabilistic neural networks (pnn), feed-forward neural networks (ffnn), and tree-based classifiers. Two evaluation protocols were used. The first used 10-fold cross validation. The recognition rate averaged over several experiments that contained 10-15 classes was 92% for the mv gauss, 94% for the ffnn and pnn, and 84% for the tree based classifier. To test generalization, a more difficult independent evaluation was undertaken in which no test strokes came from the same recording as the training strokes. The average recognition rate over a wide variety of test conditions was 76% for the mv gauss, 83% for the ffnn, 76% for the pnn, and 66% for the tree classifier. © 2005 Queen Mary, University of London.
Author supplied keywords
Cite
CITATION STYLE
Chordia, P. (2005). Segmentation and recognition of tabla strokes. In ISMIR 2005 - 6th International Conference on Music Information Retrieval (pp. 107–114).
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.