A multimodal approach to blind source separation of moving sources
journal contribution
posted on 2010-11-19, 09:12 authored by Mohsen Naqvi, Miao Yu, Jonathon Chambers
A novel multimodal approach is proposed to solve the
problem of blind source separation (BSS) of moving sources. The
challenge of BSS for moving sources is that the mixing filters are
time-varying; the unmixing filters must therefore also be time-varying,
and they are difficult to compute in real time. In the proposed approach,
the visual modality is utilized to facilitate the separation for
both stationary and moving sources. The movement of the sources
is detected by a 3-D tracker based on video cameras. Positions
and velocities of the sources are obtained from the 3-D tracker
based on a Markov Chain Monte Carlo particle filter (MCMC-PF),
which results in high sampling efficiency. The full BSS solution
is formed by integrating a frequency domain blind source separation
algorithm and beamforming: if the sources are identified
as stationary for a certain minimum period, a frequency domain
BSS algorithm is implemented with an initialization derived from
the positions of the source signals. When the sources are moving, a
beamforming algorithm that requires no prior statistical knowledge
is used to perform real-time speech enhancement and provide
separation of the sources. Experimental results confirm that
by utilizing the visual modality, the proposed algorithm not only
improves the performance of the BSS algorithm and mitigates the
permutation problem for stationary sources, but also provides
good BSS performance for moving sources in a low-reverberation
environment.
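The switching rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `choose_mode`, the speed threshold, the minimum stationary period, and the choice to fall back to beamforming until that period has elapsed are all assumptions made for illustration. The speeds would come from a tracker such as the MCMC-PF mentioned above, which is not modelled here.

```python
import numpy as np


def choose_mode(speeds, frame_rate, speed_threshold=0.05, min_stationary_s=2.0):
    """Pick a separation mode for each video frame (hypothetical helper).

    speeds: array-like of shape (n_frames, n_sources), speed of each
            tracked source in m/s (assumed to come from a 3-D tracker).
    frame_rate: tracker frame rate in frames per second.
    speed_threshold: assumed speed below which a source counts as still.
    min_stationary_s: assumed minimum stationary period in seconds.

    Returns a list with 'bss' or 'beamforming' for every frame.
    """
    speeds = np.asarray(speeds, dtype=float)
    min_frames = int(round(min_stationary_s * frame_rate))
    modes = []
    still_count = 0  # consecutive frames in which all sources were still
    for frame_speeds in speeds:
        if np.all(frame_speeds < speed_threshold):
            still_count += 1
        else:
            still_count = 0  # any movement resets the stationary timer
        # Assumption: beamforming is used until the stationary period is met.
        modes.append("bss" if still_count >= min_frames else "beamforming")
    return modes


if __name__ == "__main__":
    # Two sources, 1 frame per second; source 0 moves in the fourth frame.
    demo = [[0, 0], [0, 0], [0, 0], [1, 0], [0, 0], [0, 0]]
    print(choose_mode(demo, frame_rate=1.0, min_stationary_s=2.0))
```

In this sketch a single moving source is enough to select beamforming for the whole frame, which is one possible reading of "once the sources are moving"; a per-source decision would be an equally plausible alternative.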
History
School
- Mechanical, Electrical and Manufacturing Engineering
Citation
NAQVI, S.M., YU, M. and CHAMBERS, J.A., 2010. A multimodal approach to blind source separation of moving sources. IEEE Journal of Selected Topics in Signal Processing, 4(5), pp. 895-910.
Publisher
© IEEE
Version
- VoR (Version of Record)
Publication date
2010
Notes
This is a journal article [© IEEE]. It is also available at: http://ieeexplore.ieee.org/ Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
ISSN
1932-4553
Publisher version
Language
- en