The need for automatic speech recognition systems to be robust with respect to changes in their acoustical environment has become more widely appreciated in recent years, as more systems are finding their way into practical applications. Although the issue of environmental robustness has received only a small fraction of the attention devoted to speaker independence, even speech recognition systems that are designed to be speaker independent frequently perform very poorly when they are tested using a different type of microphone or acoustical environment from the one with which they were trained. There are several different ways of building acoustical robustness into speech recognition systems.
Acoustical and Environmental Robustness in Automatic Speech Recognition employs the approach of transforming speech recorded from a single microphone in the application environment so that it more closely matches the important acoustical characteristics of the speech that was used to train the recognition system. The book builds on the older techniques of spectral subtraction and spectral normalization, which were originally developed to enhance the quality of degraded speech for human listeners. Spectral subtraction and spectral normalization were designed to ameliorate the effects of two complementary types of environmental degradation: additive noise and unknown linear filtering. The most important contribution in this book is the development of a family of algorithms that jointly compensate for the effects of these two types of degradation. This unified approach to signal normalization provides significantly better recognition accuracy than the independent compensation strategies developed in prior research.
The algorithms described in this monograph, such as codeword-dependent cepstral normalization (CDCN) and blind SNR-dependent cepstral normalization (BSDCN), have been shown to provide major improvements in recognition accuracy for speech systems in offices using desktop microphones, in automobiles, and over telephone lines. Although originally developed for speech recognition systems using discrete hidden Markov models, these algorithms are also effective when applied to systems that use semi-continuous hidden Markov models. Real-time implementations of the compensation algorithms have been developed using workstations with onboard digital signal processors.
Acoustical and Environmental Robustness in Automatic Speech Recognition provides a comprehensive review and comparison of the major single-channel compensation strategies currently in the literature. It develops a unified cepstral representation that facilitates joint compensation for the effects of noise, filtering, and frequency warping. Finally, it describes and explains the algorithms that have been developed to compensate for these types of environmental degradation, and it provides the details needed to implement them. As such, the book serves as an excellent reference and may be used as the text for an advanced course on the subject.
Table of Contents:
List of Figures
List of Tables
Frequency Domain Processing
The SDCN Algorithm
The CDCN Algorithm
Summary of Results
Signal Processing in Sphinx
The Bilinear Transform
Spectral Estimation Issues
MMSE Estimation in the CDCN Algorithm
Maximum Likelihood via the EM Algorithm
Estimation of Noise and Spectral Tilt
Vocabulary and Pronunciation Dictionary
Series: The Springer International Series in Engineering and Computer Science
Number Of Pages: 186
Published: 30th November 1992
Country of Publication: NL
Dimensions (cm): 23.5 x 15.5 x 1.27
Weight (kg): 1.05