
Advanced Algorithms and Architectures for Speech Understanding

Research Reports Esprit / Project 26. Sip

By: Giancarlo Pirani (Editor)

Paperback ISBN: 9783540534020
Number Of Pages: 274


This volume gives an overview of the major results of a project in natural speech understanding. It describes the complete speech understanding system, from the spoken input of a question to the answer formulated in spoken natural language. All components of the system are covered in detail: speech recognition algorithms, understanding techniques, hardware based on digital signal processors (DSPs), and parallel architectures and languages. The main features of the system are continuous speech input, a vocabulary of 1000 words, training independent of the application vocabulary, and natural language with limited syntactic coverage and a constrained semantic domain suited to database enquiry. The book should be useful to researchers designing speech understanding systems and to advanced students interested in the principles and fundamentals.

1 Introduction to the Book.- 1.1 Historical Notes.- 1.2 Overview of the Book.

2 The Recognition Algorithms.- 2.1 Introduction.- 2.2 System Description.- 2.2.1 System Overview.- 2.2.2 Feature Extraction.- 2.2.3 Mel-based Spectral Analysis.- 2.2.4 Vector Quantization.- 2.2.5 The Phonetic Representation.- Phonetic transcription.- Underlying phonetic structure.- Contextual rules.- 2.3 Lexicon Structure.- 2.3.1 Phonetic Segmentation.- Phonetic classification.- Phonetic segmentation.- 2.4 Word Representation.- 2.4.1 Three-Dimensional DP Matching.- Matching costs.- Duration of micro-segments.- Reliability of micro-segments.- 2.4.2 Lexical Access.- Experimental results.- Use of heuristics.- 2.5 Verification Module.- 2.5.1 The Recognition Units.- 2.5.2 Model Estimation.- 2.5.3 Experimental Results.- 2.5.4 Conclusions.- 2.6 Continuous Speech.- 2.6.1 Control Strategies.- Cascade integration.- Full integration.- 2.6.2 Word Hypothesis Normalization.- 2.6.3 Lattice Filters.- 2.6.4 Efficiency Measures.- 2.6.5 Experimental Results.- 2.7 Conclusions.

3 The Real Time Implementation of the Recognition Stage.- 3.1 Introduction.- 3.2 System Overview.- 3.2.1 Functions Overview.- 3.2.2 Architecture Overview.- 3.2.3 System Control and Synchronization Methods.- 3.2.4 System Run-Time Evolution.- 3.2.5 Details on the Asynchronous Stage Activity.- 3.3 Hardware Details.- 3.3.1 DSP Board Description.- DSP board architecture requirements.- DSP board architecture details.- DSP kernel.- 3.3.2 Acquisition Board Description.- Acquisition board requirements.- Acquisition boards architecture details.- Acquisition functions.- 3.3.3 System Configuration.- 3.4 Firmware Blocks Details.- 3.4.1 Feature Extraction.- Generalities.- DSP1 control details.- DSP1 algorithm details.- 3.4.2 Segmentation and Lexical Access.- 3.4.3 Markov Verifier Firmware.- Generalities.- Verification stage details.- 3.5 Some Details on Other System Functions.- 3.5.1 Program Loading and System Testing.- 3.5.2 Acquisition Firmware Details.- 3.5.3 Parameters Training Environment.- 3.6 System Evaluations.- 3.6.1 General Considerations.- 3.6.2 Single-Step Isolated Words Recognition.- 3.6.3 Two-Step Isolated Words Recognition.- 3.6.4 Single-Step Continuous Speech Recognition.- 3.7 Conclusions.

4 The Understanding Algorithms.- 4.1 Overview.- 4.1.1 Introduction.- 4.1.2 Some Basic Requirements of a Parser for Speech.- 4.1.3 Knowledge Sources from Dependency Rules and Conceptual Graphs.- 4.1.4 The Importance of Control Strategies.- Two reasons for an effective control strategy.- The role of expectations: Integrating top-down and bottom-up parsing strategies.- Deduction instances and search.- Joining deduction instances.- 4.1.5 Control Strategy and Operators.- 4.1.6 Representing Deduction Instances with Memory Structures.- 4.1.7 Implementation, Development System and Results.- 4.2 Representation of Syntax.- 4.2.1 Introduction.- 4.2.2 Interaction Between Syntactic and Semantic Knowledge.- 4.2.3 Dependency Grammar.- Definitions.- An example.- Relations between dependency grammar and context-free grammar.- Remarks on dependency grammars.- 4.2.4 Morphological Agreement Rules.- Structure of agreement rules.- Definition of agreement rules.- Morphological agreement checks.- Morphological features statically associated to words.- Agreement check modalities.- 4.3 Representation of Semantics.- 4.3.1 Introduction.- 4.3.2 Word Information in the Dictionary.- 4.3.3 Caseframes and Conceptual Graphs.- 4.3.4 The use of Conceptual Graphs.- 4.3.5 Representation of the Utterance Meaning.- 4.4 The Compiler of Conceptual Graphs and Dependency Rules.- 4.4.1 Introduction.- 4.4.2 The Use of Dependency Rules.- 4.4.3 Integrating Conceptual Graphs and Dependency Rules - the Mapping Knowledge.- 4.4.4 Combining Different Conceptual Graphs.- 4.4.5 A More Complete Example.- 4.5 Parsing - Conceptual Level.- 4.5.1 Introduction.- 4.5.2 Lexical Component and Model Component.- 4.5.3 Importance of a Score Guided Search.- 4.5.4 Search from the Point of View of the Lexical Component.- Control strategy of the lexical component.- 4.5.5 Relations with the Model Component.- 4.5.6 Relations with some Former Systems.- 4.5.7 The Model Component.- A simplified view: the problem solving paradigm.- The knowledge source partition.- Knowledge sources, facts and goals.- 4.5.8 Deduction Instances.- 4.5.9 Activation: Scores and Quality Factors.- The ACTIVATION operator.- 4.5.10 Control Strategy.- 4.5.11 Optimality and Efficiency.- 4.5.12 The search space and the specialization relation.- 4.5.13 Description of the Operators.- The VERIFY operator.- The SUBGOALING operator.- The PREDICTION operator.- The MERGE operator.- 4.6 Parsing - Memory Structures.- 4.6.1 Introduction.- 4.6.2 Representing DIs with Memory Structures: Some Problems.- 4.6.3 Canonical Deduction Instances.- 4.6.4 Phrase Hypotheses as Representatives of CDIs.- Phrase hypotheses and AND-OR trees.- Phrase hypotheses and contexts.- 4.6.5 Search Space of CDIs and Links Between PHs.- 4.6.6 The VERIFY Operator.- 4.6.7 The SUBGOALING Operator.- 4.6.8 The PREDICTION Operator.- 4.6.9 The MERGE Operator.- How links are exploited.- 4.7 Parsing - Dealing with Missing Words.- 4.7.1 Introduction.- 4.7.2 The Problem.- Types of frequently missing short words.- The basic idea.- The approach: the JVERIFY operator.- 4.7.3 How JVERIFY Works.- Search solving.- Default solving.- Integrating search and default solving.- 4.7.4 When to Apply the JVERIFY Operator.- 4.8 Experimental Results.- 4.8.1 General Performance Results.- The coverage of the language model.- Performance results.- 4.8.2 Performance of the Short Word Treatment.- 4.8.3 Optimality and Efficiency.- 4.8.4 Some Specific Problems.- Excessive gaps and overlaps.- Non-optimality.- Jolly words.

5 Implementation of a Parallel Logic + Functional Language.- 5.1 Overview.- 5.2 Applications.- 5.3 Languages.- 5.3.1 The Language K-LEAF.- 5.3.2 The Language IDEAL.- 5.3.3 Parallel IDEAL and K-LEAF.- 5.4 Models of Computation.- 5.4.1 Compiling IDEAL into K-LEAF.- 5.4.2 Execution of K-LEAF: Flattening and Outermost SLD-Resolution.- 5.4.3 Parallel Outermost Strategy.- 5.5 Language Implementation and Execution.- 5.5.1 The Parallel Virtual Machine for K-LEAF.- 5.5.2 Basic Compilation Scheme for Outermost Strategy.- 5.5.3 The Actual Compilation Scheme.- 5.5.4 C-Emulation of Sequential K-WAM and Benchmarks.- 5.5.5 Execution of OR-parallel K-LEAF.- 5.5.6 Mapping AND-parallelism into OR-parallelism.- 5.5.7 The Actual Parallel Implementation.- 5.6 Hardware Architecture.- 5.6.1 Architectural Overview.- 5.6.2 The Non-Local Communication Network.- 5.6.3 Performance Evaluation.- 5.6.4 The Switching Element.- 5.6.5 The Physical Prototypes.- 5.7 Conclusions.- 5.7.1 Experience with Programming Style.- 5.7.2 Speed-up.

6 Conclusions and Future Developments.- 6.1 Recognition Algorithms.- 6.2 Real-time Hardware Implementation.- 6.3 Understanding Algorithms.- 6.4 The Role of a Dialogue Manager.

ISBN: 9783540534020
ISBN-10: 3540534024
Series: Research Reports Esprit / Project 26. Sip
Audience: General
Format: Paperback
Language: English
Number Of Pages: 274
Publisher: Springer-Verlag Berlin and Heidelberg GmbH & Co. KG
Country of Publication: DE
Dimensions (cm): 24.41 x 16.99 x 1.55
Weight (kg): 0.47