Loughborough University
Leicestershire, UK
LE11 3TU
+44 (0)1509 263171

Loughborough University Institutional Repository

Please use this identifier to cite or link to this item: https://dspace.lboro.ac.uk/2134/6493

Title: Real-time speaker identification for video conferencing
Authors: Saravi, Sara; Zafar, Iffat; Edirisinghe, Eran A.; Kalawsky, Roy S.
Keywords: Speaker identification; Coherent point drift; Lip movement detection
Issue Date: 2010
Publisher: © 2010 SPIE
Citation: SARAVI, S. ... et al., 2010. Real-time speaker identification for video conferencing. IN: Kehtarnavaz, N. (ed.), Real-Time Image and Video Processing 2010, Proceedings of SPIE, 7724, 77240D, 10pp.
Abstract: Automatic speaker identification in a videoconferencing environment will allow conference attendees to focus their attention on the conference rather than having to manually identify which channel is active and who may be the speaker within that channel. In this work we present a real-time, audio-coupled, video-based approach to address this problem, with the focus on the video analysis side. The system is driven by the need to detect a talking human using computer vision algorithms. The initial stage consists of a face detector, which is followed by a lip-localization algorithm that segments the lip region. A novel approach for lip movement detection is proposed, based on image registration using the Coherent Point Drift (CPD) algorithm, a technique for rigid and non-rigid registration of point sets. We provide experimental results to analyse the performance of the algorithm when used in monitoring real-life videoconferencing data.
Description: Copyright 2010 Society of Photo-Optical Instrumentation Engineers. One print or electronic copy may be made for personal use only. Systematic electronic or print reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited. This paper can also be found at: http://dx.doi.org/10.1117/12.854846
Version: Published
DOI: 10.1117/12.854846
URI: https://dspace.lboro.ac.uk/2134/6493
Appears in Collections: Conference Papers (Computer Science)

Files associated with this item:

File        Description   Size      Format
Eran1.pdf                 1.08 MB   Adobe PDF

 

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.