Loughborough University
Leicestershire, UK
LE11 3TU
+44 (0)1509 263171
Loughborough University

Loughborough University Institutional Repository

Please use this identifier to cite or link to this item: https://dspace.lboro.ac.uk/2134/15425

Title: Speaker independent isolated word recognition
Authors: Mwangi, Elijah
Issue Date: 1987
Publisher: © E. Mwangi
Abstract: The work presented in this thesis concerns the recognition of isolated words using a pattern matching approach. In such a system, an unknown speech utterance, which is to be identified, is transformed into a pattern of characteristic features. These features are then compared with a set of pre-stored reference patterns that were generated from the vocabulary words. The unknown word is identified as that vocabulary word for which the reference pattern gives the best match. One of the major difficul ties in the pattern comparison process is that speech patterns, obtained from the same word, exhibit non-linear temporal fluctuations and thus a high degree of redundancy. The initial part of this thesis considers various dynamic time warping techniques used for normalizing the temporal differences between speech patterns. Redundancy removal methods are also considered, and their effect on the recognition accuracy is assessed. Although the use of dynamic time warping algorithms provide considerable improvement in the accuracy of isolated word recognition schemes, the performance is ultimately limited by their poor ability to discriminate between acoustically similar words. Methods for enhancing the identification rate among acoustically similar words, by using common pattern features for similar sounding regions, are investigated. Pattern matching based, speaker independent systems, can only operate with a high recognition rate, by using multiple reference patterns for each of the words included in the vocabulary. These patterns are obtained from the utterances of a group of speakers. The use of multiple reference patterns, not only leads to a large increase in the memory requirements of the recognizer, but also an increase in the computational load. A recognition system is proposed in this thesis, which overcomes these difficulties by (i) employing vector quantization techniques to reduce the storage of reference patterns, and (ii) eliminating the need for dynamic time warping which reduces the computational complexity of the system. Finally, a method of identifying the acoustic structure of an utterance in terms of voiced, unvoiced, and silence segments by using fuzzy set theory is proposed. The acoustic structure is then employed to enhance the recognition accuracy of a conventional isolated word recognizer.
Description: A Doctoral Thesis. Submitted in partial fulfilment of the requirements for the award of Doctor of Philosophy of Loughborough University.
URI: https://dspace.lboro.ac.uk/2134/15425
Appears in Collections:PhD Theses (Mechanical, Electrical and Manufacturing Engineering)

Files associated with this item:

File Description SizeFormat
Thesis-1987-Mwangi.pdf4.8 MBAdobe PDFView/Open
Form-1987-Mwangi.pdf42.94 kBAdobe PDFView/Open


SFX Query

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.