Loughborough University
Leicestershire, UK
LE11 3TU
+44 (0)1509 263171

Loughborough University Institutional Repository

Please use this identifier to cite or link to this item: https://dspace.lboro.ac.uk/2134/5664

Title: Video assisted speech source separation
Authors: Wang, Wenwu
Cosker, Darren
Hicks, Yulia
Sanei, Saeid
Chambers, Jonathon
Issue Date: 2005
Publisher: © IEEE
Citation: WANG, W. ... et al., 2005. Video assisted speech source separation. IN: Proceedings of 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, Pennsylvania, USA, 18-23 March, Vol.5, pp. 425-428.
Abstract: In this paper we investigate the problem of integrating the complementary audio and visual modalities for speech separation. Rather than using the independence criteria suggested in most blind source separation (BSS) systems, we use the visual feature from a video signal as additional information to optimize the unmixing matrix. We achieve this by using a statistical model characterizing the nonlinear coherence between audio and visual features as a separation criterion for both instantaneous and convolutive mixtures. We acquire the model by applying the Bayesian framework to the fused feature observations based on a training corpus. We point out several key existing challenges to the success of the system. Experimental results verify the proposed approach, which outperforms the audio-only separation system in a noisy environment, and also provides a solution to the permutation problem.
Description: This is a conference paper [© IEEE]. It is also available at: http://ieeexplore.ieee.org/. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Version: Published
DOI: 10.1109/ICASSP.2005.1416331
URI: https://dspace.lboro.ac.uk/2134/5664
ISBN: 0780388747
Appears in Collections: Conference Papers and Presentations (Mechanical, Electrical and Manufacturing Engineering)

Files associated with this item:

File: Video assisted speech source separation.pdf
Size: 255.7 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.