Abstract
Recordings for acoustic research should ideally be made in a lossless format. However, in some cases pre-existing data may be available only in a lossy format such as mp3, which raises the question of how far this compromises the accuracy of acoustic measurements. To address this question, we compressed 10 recordings of read speech at different compression rates (16-320 kbps) and reconverted them to wav in order to examine the effect of compression on commonly used suprasegmental measures of fundamental frequency (f0), pitch range and level. Results suggest that at compression rates between 56 and 320 kbps, measures of f0 and most measures of pitch range and level remain reliable, with mean errors below 2% and often lower. The skewness of the distribution of f0 measurements, however, shows much greater measurement errors, with mean errors of 6.9%-7.6% at compression rates between 96 kbps and 320 kbps, and 44.8% at 16 kbps. We conclude that mp3-compressed recordings can be subjected to the acoustic measurements tested here. Nevertheless, the indeterminacy added by mp3 compression needs to be taken into account when interpreting measurements.
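The comparison described in the abstract can be illustrated with a minimal sketch: given f0 values extracted from the original wav and from the reconverted mp3, compute a summary measure on each and express the difference as a percentage of the wav value. The function names, the choice of population skewness, and the percent-error formula are assumptions made for illustration; the abstract does not specify how the authors computed their errors.

```python
import math


def skewness(values):
    """Population skewness (third central moment / variance ** 1.5).

    Assumes the values are not all identical (non-zero variance).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def percent_error(reference, measured):
    """Absolute difference relative to the reference value, in percent."""
    return abs(measured - reference) / abs(reference) * 100.0


def compare_measures(f0_wav, f0_mp3):
    """Percent error of each measure, taking the wav values as reference.

    f0_wav and f0_mp3 are hypothetical lists of f0 values in Hz.
    """
    measures = {
        "mean_f0": lambda v: sum(v) / len(v),
        "f0_range": lambda v: max(v) - min(v),
        "skewness": skewness,
    }
    return {
        name: percent_error(fn(f0_wav), fn(f0_mp3))
        for name, fn in measures.items()
    }
```

Averaging such per-recording errors across recordings would give one mean error per measure and compression rate. Skewness is a ratio of higher-order moments, so a few deviating f0 values can shift it far more than they shift the mean, which is consistent with the larger errors reported for it.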
Cite
Fuchs, R., & Maxwell, O. (2016). The effects of mp3 compression on acoustic measurements of fundamental frequency and pitch range. In Proceedings of the International Conference on Speech Prosody (Vol. 2016-January, pp. 523–527). International Speech Communication Association. https://doi.org/10.21437/speechprosody.2016-107