Real-time speaker identification for video conferencing
conference contribution
posted on 2010-07-14, 15:15 authored by Sara Saravi, Iffat Zafar, Eran Edirisinghe, Roy Kalawsky
Automatic speaker identification in a videoconferencing environment will allow conference attendees to focus their
attention on the conference rather than having to be engaged manually in identifying which channel is active and who
may be the speaker within that channel. In this work we present a real-time, audio-coupled video based approach to
address this problem, but focus more on the video analysis side. The system is driven by the need for detecting a talking
human via the use of computer vision algorithms. The initial stage consists of a face detector which is subsequently
followed by a lip-localization algorithm that segments the lip region. A novel approach for lip movement detection based
on image registration and using the Coherent Point Drift (CPD) algorithm is proposed. Coherent Point Drift (CPD) is a
technique for rigid and non-rigid registration of point sets. We provide experimental results to analyse the performance
of the algorithm when used in monitoring real life videoconferencing data.
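
As an illustration of the rigid case of point-set registration mentioned in the abstract, the following is a minimal NumPy sketch of rigid Coherent Point Drift in its commonly published EM form. It is not the authors' implementation. The function names `rigid_cpd` and `registration_residual`, the parameter defaults, and the idea of reading the post-registration residual as a movement score are assumptions made for this example only.

```python
import numpy as np


def rigid_cpd(X, Y, w=0.0, max_iter=200, tol=1e-10):
    """Rigidly align source points Y (M x D) to target points X (N x D).

    Returns (s, R, t, sigma2) such that s * Y @ R.T + t approximates X.
    w in [0, 1) is the assumed weight of the uniform outlier component.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    N, D = X.shape
    M = Y.shape[0]
    s, R, t = 1.0, np.eye(D), np.zeros(D)
    # Initial isotropic variance: mean squared distance over all point pairs.
    sigma2 = ((X[None, :, :] - Y[:, None, :]) ** 2).sum() / (D * M * N)

    for _ in range(max_iter):
        if sigma2 <= tol:
            break
        # E-step: P[m, n] is the posterior that centroid m generated point n.
        TY = s * Y @ R.T + t
        d2 = ((X[None, :, :] - TY[:, None, :]) ** 2).sum(axis=2)
        num = np.exp(-d2 / (2.0 * sigma2))
        c = (2.0 * np.pi * sigma2) ** (D / 2.0) * w / (1.0 - w) * M / N
        den = num.sum(axis=0, keepdims=True) + c
        P = num / np.maximum(den, np.finfo(float).eps)

        # M-step: closed-form update of rotation, scale, translation, variance.
        P1 = P.sum(axis=1)    # shape (M,)
        Pt1 = P.sum(axis=0)   # shape (N,)
        Np = P1.sum()
        mu_x = X.T @ Pt1 / Np
        mu_y = Y.T @ P1 / Np
        Xc, Yc = X - mu_x, Y - mu_y
        A = Xc.T @ P.T @ Yc
        U, _, Vt = np.linalg.svd(A)
        C = np.eye(D)
        C[-1, -1] = np.linalg.det(U @ Vt)  # keep R a proper rotation
        R = U @ C @ Vt
        tr_AR = np.trace(A.T @ R)
        s = tr_AR / (P1 * (Yc ** 2).sum(axis=1)).sum()
        t = mu_x - s * R @ mu_y
        sigma2 = ((Pt1 * (Xc ** 2).sum(axis=1)).sum() - s * tr_AR) / (Np * D)
        sigma2 = max(sigma2, 0.0)  # guard against round-off below zero

    return s, R, t, sigma2


def registration_residual(X, Y, s, R, t):
    """RMS distance between index-matched points after alignment.

    Hypothetical score: assumes X[i] corresponds to Y[i].
    """
    aligned = s * Y @ R.T + t
    return float(np.sqrt(((aligned - X) ** 2).sum(axis=1).mean()))
```

In this sketch, `Y` and `X` would be point sets sampled from the lip region in two frames. A residual that stays large after rigid alignment would suggest a change in lip shape rather than head motion alone; how the paper actually turns registration output into a talking/not-talking decision is not stated in the abstract.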
History
School
- Science
Department
- Computer Science
Citation
SARAVI, S....et al., 2010. Real-time speaker identification for video conferencing. IN: Kehtarnavaz, N. (ed.), Real-Time Image and Video Processing 2010, Proceedings of SPIE, 7724, 77240D, 10pp.
Publisher
© 2010 SPIE
Version
- VoR (Version of Record)
Publication date
2010
Notes
Copyright 2010 Society of Photo-Optical Instrumentation Engineers. One print or electronic copy may be made for personal use only. Systematic electronic or print reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited. This paper can also be found at: http://dx.doi.org/10.1117/12.854846
Language
- en