
Loughborough University Institutional Repository

Please use this identifier to cite or link to this item: https://dspace.lboro.ac.uk/2134/12238

Title: Speech separation with dereverberation-based pre-processing incorporating visual cues
Authors: Khan, Muhammad Salman
Naqvi, Syed M.
Chambers, Jonathon
Issue Date: 2013
Publisher: CHiME
Citation: KHAN, M.S., NAQVI, S.M. and CHAMBERS, J., 2013. Speech separation with dereverberation-based pre-processing incorporating visual cues. IN: Proceedings of the 2nd International workshop on machine listening in multisource environments (CHIME), Vancouver, Canada, 1 June 2013, 2pp.
Abstract: Humans are skilled at selectively extracting a single sound source in the presence of multiple simultaneous sounds. Individuals with normal hearing can also adapt robustly to changing acoustic environments with great ease. A need has arisen to incorporate such abilities in machines, which would enable multiple application areas such as human-computer interaction, automatic speech recognition, hearing aids and hands-free telephony. This work addresses the problem of separating multiple speech sources in realistic reverberant rooms using two microphones. Different monaural and binaural cues have previously been modeled in order to enable separation. Binaural spatial cues, i.e. the interaural level difference (ILD) and the interaural phase difference (IPD), have been modeled [1] in the time-frequency (TF) domain, exploiting the differences in the intensity and the phase of the mixture signals (due to the sources' different spatial locations) observed by two microphones (or ears). The method performs well with little or no reverberation, but as the amount of reverberation increases and the sources approach each other, the binaural cues are distorted and the interaural cues become indistinct, degrading the separation performance. Thus, there is a demand for exploiting additional cues, and further signal processing is required at higher levels of reverberation.
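The ILD and IPD cues mentioned in the abstract are computed per time-frequency bin from the complex STFTs of the two microphone channels. The following is a minimal sketch of that computation, not code from the paper; the function name `interaural_cues` and the toy two-channel input are illustrative assumptions.

```python
import numpy as np

def interaural_cues(XL, XR, eps=1e-12):
    """Per-TF-bin interaural cues from complex STFTs XL, XR (left/right).

    Illustrative sketch, not the authors' implementation.
    Returns ILD in dB and IPD in radians (wrapped to (-pi, pi]).
    """
    # ILD: level ratio between channels; eps guards against log(0).
    ild = 20.0 * np.log10((np.abs(XL) + eps) / (np.abs(XR) + eps))
    # IPD: phase of the cross-term XL * conj(XR).
    ipd = np.angle(XL * np.conj(XR))
    return ild, ipd

# Toy check: right channel attenuated by 6 dB with a constant phase lag,
# so every TF bin should show the same ILD and IPD.
rng = np.random.default_rng(0)
XL = rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8))
XR = 0.5 * XL * np.exp(-1j * 0.3)
ild, ipd = interaural_cues(XL, XR)
```

In an anechoic setting these cues cluster tightly around source-dependent values, which is what makes TF masking possible; reverberation smears the clusters, which is the degradation the abstract describes.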
Description: This is a conference paper for the 2nd International Workshop on Machine Listening in Multisource Environments, 1st June 2013, Vancouver, Canada (in conjunction with ICASSP 2013). The conference website is at: http://spandh.dcs.shef.ac.uk/chime_workshop/index.html
Version: Accepted for publication
URI: https://dspace.lboro.ac.uk/2134/12238
Appears in Collections: Conference Papers and Contributions (Mechanical, Electrical and Manufacturing Engineering)

Files associated with this item:

File: Speech separation-salman_CHIME2013.pdf
Description: Accepted version
Size: 76.87 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.