Speech end point detection is the process of identifying the boundaries of speech in a signal using digital processing techniques. The performance of many speech processing applications depends largely on accurate end point detection. In this paper, we address this issue and propose an algorithm to identify speech boundaries. The algorithm uses frame-wise pitch and energy estimation to detect the onset and the terminus of an utterance. The performance of the proposed algorithm was evaluated on three databases, and the results were compared with three state-of-the-art end point detection techniques. Experimental results support the validity of the proposed method and show a significant improvement in end point detection over the other techniques under observation. The proposed Pitch and Energy based Detection (PED) achieves an accuracy of 71 to 87.6% in start point detection and 59 to 76.6% in end (termination) point detection for a ±60 ms resolution window. In terms of detection error, it attains an average improvement of 26.9 ms at the start point and 200.5 ms at the end point compared to the other methods across different speech corpora. This investigation indicates that the PED technique offers superior results in terms of accuracy and detection error under different data conditions.
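The general idea of combining frame-wise energy and pitch evidence can be illustrated with a minimal sketch. This is not the PED algorithm itself, whose details are not given in the abstract. It assumes a simple autocorrelation-based voicing check in place of a real pitch estimator, and every function name, frame size, and threshold below (`detect_endpoints`, `energy_ratio`, `threshold`, the 60 to 400 Hz pitch range) is a hypothetical choice made for illustration.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (trailing partial frame dropped)."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop: i * hop + frame_len] for i in range(n_frames)])


def is_voiced(frame, sr, fmin=60.0, fmax=400.0, threshold=0.3):
    """Crude voicing check: is there a strong autocorrelation peak at a lag
    inside the assumed pitch range [fmin, fmax]? (Illustrative assumption,
    not the pitch estimator used in the paper.)"""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if ac[0] <= 0:  # silent frame: no energy, so no pitch
        return False
    ac = ac / ac[0]  # normalise so that lag 0 equals 1
    lo = int(sr / fmax)
    hi = min(int(sr / fmin), len(ac) - 1)
    return bool(ac[lo:hi + 1].max() >= threshold)


def detect_endpoints(x, sr, frame_ms=25, hop_ms=10, energy_ratio=0.1):
    """Return (start_s, end_s) of the span covered by frames that are both
    energetic and voiced, or None if no such frame exists."""
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = frame_signal(x, frame_len, hop)
    energy = np.mean(frames ** 2, axis=1)
    energetic = energy > energy_ratio * energy.max()  # relative energy gate
    voiced = np.array([is_voiced(f, sr) for f in frames])
    idx = np.flatnonzero(energetic & voiced)
    if idx.size == 0:
        return None
    start = idx[0] * hop / sr
    end = (idx[-1] * hop + frame_len) / sr
    return start, end


# Usage on a synthetic signal: low-level noise with a 150 Hz tone
# standing in for voiced speech between 0.3 s and 0.7 s.
sr = 16000
t = np.arange(sr) / sr
rng = np.random.default_rng(0)
signal = 0.001 * rng.standard_normal(sr)
signal += np.where((t >= 0.3) & (t < 0.7), 0.5 * np.sin(2 * np.pi * 150 * t), 0.0)
start, end = detect_endpoints(signal, sr)
```

On this synthetic input the detected boundaries should land within the ±60 ms resolution window around 0.3 s and 0.7 s. Real speech would need more care than this sketch takes, for example handling unvoiced consonants at utterance boundaries and adapting the energy threshold to the noise floor.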
CITATION STYLE
Shome, N., Laskar, R. H., Kashyap, R., & Bandyopadhyay, S. (2020). A Robust Technique for End Point Detection Under Practical Environment. In Communications in Computer and Information Science (Vol. 1241 CCIS, pp. 131–144). Springer. https://doi.org/10.1007/978-981-15-6318-8_12