Speech Segmentation Using Dynamic Windows and Thresholds for Arabic and English Languages
- 1 Arab Open University, Kuwait
Abstract
Segmentation of audio data such as human speech (splitting each spoken word into a separate audio file, e.g. a .WAV file) has been a major concern when working with multimedia such as radio or TV recordings. Work on locating the boundaries of spoken words has mainly relied on energy and zero-crossing thresholds for endpoint detection, yet errors in endpoint detection remain a main cause of low accuracy in segmentation systems. The goal of this research is to develop an efficient algorithm that segments human speech in both English and Arabic, at different speaking speeds, with high accuracy. Simulation results show that the developed algorithm achieves high accuracy when segmenting human speech, reaching up to 91.6% on average for English and 89.0% for Arabic.
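For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of endpoint detection with short-time energy and zero-crossing rate. It is not the algorithm developed in the paper: it uses fixed frames and fixed thresholds rather than dynamic windows and thresholds, and the function names, frame length, and threshold values (`frame_features`, `detect_segments`, `frame_len`, `energy_thr`, `zcr_thr`, `min_frames`) are assumptions chosen only for illustration.

```python
def frame_features(samples, frame_len):
    """Return (energy, zero_crossing_rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of squared amplitudes in the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def detect_segments(samples, frame_len=80, energy_thr=0.01, zcr_thr=0.3, min_frames=2):
    """Return (start_sample, end_sample) pairs for runs of speech-like frames.

    All thresholds are illustrative, fixed values (an assumption of this sketch).
    """
    segments = []
    run_start = None
    feats = frame_features(samples, frame_len)
    # A trailing silent sentinel frame closes any run still open at the end.
    for i, (energy, zcr) in enumerate(feats + [(0.0, 0.0)]):
        # A frame counts as speech if it is loud, or moderately loud with many
        # zero crossings (a rough proxy for unvoiced sounds).
        active = energy > energy_thr or (energy > energy_thr / 4 and zcr > zcr_thr)
        if active and run_start is None:
            run_start = i
        elif not active and run_start is not None:
            # Discard runs shorter than min_frames as likely noise bursts.
            if i - run_start >= min_frames:
                segments.append((run_start * frame_len, i * frame_len))
            run_start = None
    return segments
```

Given samples normalized to the range [-1, 1], `detect_segments(samples)` returns sample-index pairs that could then be written out as one audio file per segment. Fixed thresholds of this kind tend to fail when loudness or speaking speed varies, which is the kind of endpoint error the abstract describes.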
DOI: https://doi.org/10.3844/jcssp.2018.485.490
Copyright: © 2018 Yahia Hasan Jazyah. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Audio
- Voice
- Speech
- Segmentation