Memory-Based Language Processing: Studies in Natural Language Processing - Walter Daelemans

Memory-Based Language Processing

Studies in Natural Language Processing

Hardcover Published: 31st October 2005
ISBN: 9780521808903
Number Of Pages: 198


Memory-based language processing, a machine learning and problem-solving method for language technology, is based on the idea that directly reusing examples through analogical reasoning is better suited to language processing problems than applying rules extracted from those examples. This book discusses the theory and practice of memory-based language processing and shows its comparative strengths over alternative methods of language modelling. Language is complex, with few generalizations and many sub-regularities and exceptions; the advantage of memory-based language processing is that it does not abstract away from this valuable low-frequency information. By applying the model to a range of benchmark problems, the authors show that it produces excellent results in linguistic areas ranging from phonology to semantics. They also describe TiMBL, a software package for memory-based language processing. The first comprehensive overview of the approach, this book will be invaluable for computational linguists, psycholinguists and language engineers.
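
As a rough illustration of the idea in the description, the sketch below keeps every training example in memory and labels a new instance by its nearest stored neighbours, rather than by rules abstracted from the examples. It is a minimal sketch and not TiMBL's implementation: the function names `overlap_distance` and `classify`, the feature encoding, and the toy data with its class labels `A`, `B` and `C` are all invented for this example.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(memory, instance, k=1):
    """Label `instance` by majority vote among its k nearest stored examples."""
    # Sort the stored examples by distance to the new instance (stable sort).
    ranked = sorted(memory, key=lambda ex: overlap_distance(ex[0], instance))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented toy memory: (symbolic feature tuple, class label).
# The features and labels are placeholders, not real linguistic data.
memory = [
    (("e", "fem"), "A"),
    (("e", "fem"), "A"),
    (("r", "masc"), "B"),
    (("r", "neut"), "C"),
    (("l", "masc"), "B"),
]

print(classify(memory, ("e", "fem"), k=3))
print(classify(memory, ("r", "neut"), k=1))
```

Because no example is discarded, the single stored `("r", "neut")` instance still determines the label of an identical query when `k=1`, which is the kind of low-frequency information a rule-based abstraction might lose.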

Preface
Memory-Based Learning in Natural Language Processing
Natural language processing as classification
A linguistic example
Roadmap and software
Further reading
Inspirations from linguistics and artificial intelligence
Inspirations from linguistics
Inspirations from artificial intelligence
Memory-based language processing literature
Conclusion
Memory and Similarity
German plural formation
Similarity metric
Information-theoretic feature weighting
Alternative feature weighting methods
Getting started with TiMBL
Feature weighting in TiMBL
Modified value difference metric
Value clustering in TiMBL
Distance-weighted class voting
Distance-weighted class voting in TiMBL
Analyzing the output of MBLP
Displaying nearest neighbors in TiMBL
Implementation issues
TiMBL trees
Methodology
Experimental methodology in TiMBL
Additional performance measures in TiMBL
Conclusion
Application to morpho-phonology
Phonemization
Memory-based word phonemization
TreeTalk
IGTree in TiMBL
Experiments: applying IGTree to word phonemization
TRIBL: trading memory for speed
TRIBL in TiMBL
Morphological analysis
Dutch morphology
Feature and class encoding
Experiments: MBMA on Dutch wordforms
Conclusion
Further reading
Application to shallow parsing
Part-of-speech tagging
Memory-based tagger architecture
Results
Memory-based tagging with Mbt and Mbtg
Constituent chunking
Results
Using Mbt and Mbtg for chunking
Relation finding
Relation finder architecture
Results
Conclusion
Further reading
Abstraction and generalization
Lazy versus eager learning
Benchmark language learning tasks
Forgetting by rule induction is harmful in language learning
Editing examples
Why forgetting examples can be harmful
Generalizing examples
Careful abstraction in memory-based learning
Getting started with FAMBL
Experiments with FAMBL
Conclusion
Further reading
Extensions
Wrapped progressive sampling
The wrapped progressive sampling algorithm
Getting started with wrapped progressive sampling
Wrapped progressive sampling results
Optimizing output sequences
Stacking
Predicting class n-grams
Combining stacking and class n-grams
Summary
Conclusion
Further reading
Bibliography
Index
Table of Contents provided by Ingram. All Rights Reserved.

ISBN-10: 0521808901
Series: Studies in Natural Language Processing
Audience: Professional
Format: Hardcover
Language: English
Publisher: Cambridge University Press
Country of Publication: GB
Dimensions (cm): 22.9 x 15.2 x 1.6
Weight (kg): 0.46