Co-occurrence Based Approach for Differentiation of Speech and Song

Authors

Arijit Ghosal
Department of Information Technology, St. Thomas’ College of Engineering and Technology, Kolkata, West Bengal
Ranjit Ghoshal
Department of Information Technology, St. Thomas’ College of Engineering and Technology, Kolkata, West Bengal

Synopsis

Discriminating speech from song in an audio signal is an interesting research topic. Earlier efforts focused mainly on discriminating speech from non-speech; comparatively few have addressed the discrimination of speech and song. This discrimination is a noteworthy part of automatic audio classification, as it is considered a fundamental step in hierarchical approaches to genre identification and audio archive generation. Previous efforts to discriminate speech and song have relied on frequency-domain and perceptual-domain aural features. This work aims to propose an acoustic feature that is both low-dimensional and easy to compute. It is observed that the energy levels of speech and song signals differ substantially, owing to the absence of an instrumental background in speech. Short Time Energy (STE) is the acoustic feature best suited to reflect this behaviour. For a precise study of energy variation, a co-occurrence matrix of STE is generated and statistical features are extracted from it. For classification, several well-known supervised classifiers have been employed. The performance of the proposed feature set has been compared with that of other efforts to demonstrate its superiority.
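The pipeline described above (frame-wise STE, quantization into a co-occurrence matrix of consecutive-frame transitions, then statistical descriptors) can be sketched as follows. This is a minimal illustration, not the authors' exact implementation: the frame length, hop size, number of quantization levels, and the particular statistics (energy, entropy, contrast, homogeneity) are assumptions chosen for demonstration.

```python
import numpy as np

def short_time_energy(signal, frame_len=256, hop=128):
    """Frame-wise Short Time Energy: sum of squared samples per frame.
    Frame length and hop are illustrative choices."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.array([np.sum(np.asarray(f, dtype=float) ** 2) for f in frames])

def ste_cooccurrence(ste, levels=8):
    """Quantize STE values into `levels` bins and count transitions
    between consecutive frames, giving a levels x levels co-occurrence
    matrix normalized to joint probabilities."""
    lo, hi = ste.min(), ste.max()
    q = np.floor((ste - lo) / (hi - lo + 1e-12) * levels).astype(int)
    q = np.clip(q, 0, levels - 1)
    C = np.zeros((levels, levels))
    for a, b in zip(q[:-1], q[1:]):
        C[a, b] += 1
    return C / C.sum()

def cooccurrence_stats(C):
    """Common statistical descriptors of a co-occurrence matrix
    (the paper's exact feature list may differ)."""
    i, j = np.indices(C.shape)
    nz = C[C > 0]
    return {
        "energy":      float(np.sum(C ** 2)),
        "entropy":     float(-np.sum(nz * np.log2(nz))),
        "contrast":    float(np.sum((i - j) ** 2 * C)),
        "homogeneity": float(np.sum(C / (1.0 + np.abs(i - j)))),
    }

# Toy usage on a synthetic signal; a real system would feed these
# features to a supervised classifier (e.g. SVM, k-NN).
rng = np.random.default_rng(0)
x = rng.standard_normal(8000)
feats = cooccurrence_stats(ste_cooccurrence(short_time_energy(x)))
```

The feature vector stays small (here four values per signal), which is the low-dimensionality advantage the synopsis highlights.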

ICTCon2021
Published
July 12, 2021
Online ISSN
2582-3922