Treebanks

Building and Using Parsed Corpora

By: A. Abeillé (Editor)

Hardcover | 30 September 2003

At a Glance

Hardcover
440 Pages

Dimensions(cm)
24.13 x 15.88 x 1.91

$249.75

Linguists and engineers in Natural Language Processing increasingly use electronic corpora. Most research has long been limited to raw (unannotated) texts or to tagged texts (annotated with parts of speech only), but these approaches suffer from a word-by-word perspective. A new line of research involves corpora with richer annotations such as clauses and major constituents, grammatical functions, and dependency links. The first parsed corpora were the English Lancaster treebank and Penn Treebank. New ones have recently been developed for other languages.
This book:

provides a state-of-the-art survey of work being done with parsed corpora;

gathers 21 papers on building and using parsed corpora, raising many relevant questions;

deals with a variety of languages and a variety of corpora;

is for those working in linguistics, computational linguistics, natural language, syntax, and grammar.

Industry Reviews

From the reviews:

"Anne Abeillé draws together a collection of fifteen short pieces focused primarily on the issues that come up in creating treebanks, demonstrated across an impressive variety of languages, along with six chapters on how treebanks are used. ... For computational linguists working on automatic parsing, a pass through this book should be required ... . The reader ... will be rewarded with a clear sense of the challenge and the promise of systematically applying theoretically motivated linguistic representations to 'language in the large'." (Philip Resnik, Language, Vol. 83 (4), 2007)

Preface	p. xi
Introduction	p. xiii
Building Treebanks	p. xv
Using treebanks	p. xix
Building treebanks
English Treebanks
The Penn Treebank: An Overview	p. 5
The annotation schemes	p. 6
Methodology	p. 16
Conclusions	p. 20
Thoughts on Two Decades of Drawing Trees	p. 23
Historical background	p. 23
Building treebanks	p. 26
Exploiting the SUSANNE Treebank	p. 29
Small is beautiful	p. 33
Annotating a spoken corpus	p. 35
Using the CHRISTINE Corpus	p. 38
Conclusion	p. 40
Bank of English and Beyond	p. 43
Introduction	p. 43
Annotating 200 million words	p. 44
ENGCG Syntax	p. 52
FDG parser	p. 54
Conclusion	p. 56
Completing Parsed Corpora from Correction to Evolution	p. 61
Introduction	p. 61
Conventional post-correction	p. 63
A paradigm shift: transverse correction	p. 65
Critique	p. 68
German Treebanks
Syntactic Annotation of A German Newspaper Corpus	p. 73
Introduction	p. 73
Treebank development	p. 74
Corpus annotation	p. 77
Applications	p. 83
Conclusions	p. 83
Tagsets	p. 87
Annotation of Error Types for A German Newsgroup Corpus	p. 89
Introduction	p. 89
Corpus Description	p. 90
Annotation Strategy	p. 91
Annotation Tools	p. 93
Evaluation	p. 96
First Results	p. 98
Conclusion	p. 99
Slavic Treebanks
The PDT: A 3-Level Annotation Scenario	p. 103
The Prague Dependency Treebank	p. 103
Morphological Level	p. 104
Analytical Level	p. 106
Merging the Morphological and the Analytical Syntactic Level	p. 114
Tectogrammatical Level	p. 114
PDT versions 1.0 and 2.0	p. 121
Conclusion	p. 122
Appendix	p. 126
An HPSG-Annotated Test Suite for Polish	p. 129
Aims and design constraints	p. 129
Correctness and complexity markers	p. 130
Linguistic phenomena	p. 131
Annotation schema	p. 136
Implementation issues	p. 137
Conclusion	p. 143
Treebanks for Romance Languages
Developing A Spanish Treebank	p. 149
Introduction	p. 149
Data selection	p. 150
Annotation scheme	p. 151
Tools	p. 157
Debugging and error statistics	p. 158
Current state and future development	p. 159
Sample of trees	p. 163
Building A Treebank for French	p. 165
The tagging phase	p. 166
The parsing phase	p. 173
Current state and future work	p. 180
Conclusion	p. 181
Appendix	p. 185
Building the Italian Syntactic-Semantic Treebank	p. 189
Introduction	p. 190
ISST architecture	p. 190
ISST corpus	p. 191
ISST morpho-syntactic annotation	p. 191
ISST syntactic annotation	p. 192
ISST lexico-semantic annotation	p. 196
The multi-level linguistic annotation tool	p. 200
ISST evaluation	p. 204
Conclusion	p. 206
Appendix	p. 209
Automated Creation of A Medieval Portuguese Treebank	p. 211
Introduction	p. 211
The parsed corpus of Medieval Portuguese texts	p. 212
Tools and computational resources	p. 215
Evaluation	p. 222
Conclusion	p. 224
Treebanks for Other Languages
Sinica Treebank	p. 231
Introduction	p. 231
Design criteria	p. 232
Representation of lexico-grammatical information: ICG	p. 233
Annotation guideline	p. 235
Implementation	p. 239
Representational issues: problematic cases and how they are solved	p. 241
Current status of the Sinica Treebank and future work	p. 243
Syntactic Categories	p. 248
Building A Japanese Parsed Corpus	p. 249
Introduction	p. 249
Overview of the project	p. 250
Morphological analyzer JUMAN	p. 253
Dependency structure analyzer KNP	p. 255
Conclusion	p. 259
Building A Turkish Treebank	p. 261
Turkish: Morphology and syntax	p. 262
What information needs to be represented?	p. 263
The annotation tool	p. 270
Some difficult issues	p. 272
Conclusions and future work	p. 273
Turkish Morphological Features	p. 276
Using treebanks
Encoding Syntactic Annotation	p. 281
Introduction	p. 281
XCES	p. 283
Syntactic annotation: current practice	p. 284
A model for syntactic annotation	p. 286
Using the XCES scheme	p. 291
Conclusion	p. 293
Evaluation with Treebanks
Parser Evaluation	p. 299
Introduction	p. 299
Grammatical relation annotation	p. 302
Corpus annotation	p. 308
Parser evaluation	p. 309
Discussion	p. 312
Summary	p. 313
Dependency-Based Evaluation of Minipar	p. 317
Introduction	p. 317
Dependency-based parser evaluation	p. 318
Evaluation of Minipar with the SUSANNE corpus	p. 320
Selective evaluation	p. 323
Related work	p. 326
Conclusions	p. 328
Grammar Induction with Treebanks
Extracting Stochastic Grammars from Treebanks	p. 333
Introduction	p. 333
Summary of data-oriented parsing	p. 335
Simulating stochastic grammars by constraining the subtree set	p. 337
Discussion and conclusion	p. 344
Stochastic Lexicalized Tree Grammars	p. 351
Introduction	p. 351
Related work	p. 352
Grammar extraction	p. 353
SLTG from treebanks	p. 355
SLTG from HPSG	p. 359
Future steps: towards merging SLTGs	p. 362
From Treebank Resources to LFG F-Structures	p. 367
Introduction	p. 368
Methods for automatic f-structure annotation	p. 370
Two Experiments	p. 380
Discussion and Current Research	p. 383
Summary	p. 385
Example of an Automatically Generated F-Structure (Susanne Corpus)	p. 389
Contributing Authors	p. 391
Index	p. 398
Table of Contents provided by Ingram. All Rights Reserved.
