Treebanks: Building and Using Parsed Corpora - Anne Abeille


Building and Using Parsed Corpora

By: Anne Abeille (Editor)

Paperback Published: 30th September 2003
ISBN: 9781402013355
Number Of Pages: 407


Other Available Editions

  • Hardcover, Published: 30th September 2003

Linguists and engineers in Natural Language Processing increasingly rely on electronic corpora. Most research has long been limited to raw (unannotated) texts or to tagged texts (annotated with parts of speech only), but these approaches suffer from a word-by-word perspective. A newer line of research involves corpora with richer annotations, such as clauses and major constituents, grammatical functions, and dependency links. The first parsed corpora were the Lancaster Treebank and the Penn Treebank, both for English. New ones have recently been developed for other languages.
This book:

provides a state-of-the-art overview of work being done with parsed corpora;

gathers 21 papers on building and using parsed corpora, raising many relevant questions;

deals with a variety of languages and a variety of corpora;

is for those working in linguistics, computational linguistics, natural language, syntax, and grammar.

Industry Reviews

From the reviews:

"Anne Abeille draws together a collection of fifteen short pieces focused primarily on the issues that come up in creating treebanks, demonstrated across an impressive variety of languages, along with six chapters on how treebanks are used. ... For computational linguists working on automatic parsing, a pass through this book should be required ... . The reader ... will be rewarded with a clear sense of the challenge and the promise of systematically applying theoretically motivated linguistic representations to 'language in the large'." (Philip Resnik, Language, Vol. 83 (4), 2007)

Building Treebanks. English Treebanks
The Penn treebank: an overview
Thoughts on two decades of drawing trees
Bank of English and beyond
Completing parsed corpora from correction to evolution
Syntactic annotation of a German newspaper corpus
Annotation of error types for a German newsgroup corpus
The PDT: a 3-level annotation scenario
An HPSG-annotated test suite for Polish
Developing a Spanish treebank
Building a treebank for French
Building the Italian syntactic-semantic treebank
Automated creation of a medieval Portuguese treebank
Sinica treebank
Building a Japanese parsed corpus
Building a Turkish treebank
Using Treebanks
Encoding syntactic annotation
Parser evaluation
Dependency-based evaluation of MINIPAR
Extracting stochastic grammars from treebanks
Stochastic lexicalized tree grammars
From treebank resources to LFG f-structures
Contributing Authors
Table of Contents provided by Publisher. All Rights Reserved.

ISBN-10: 1402013353
Series: Text, Speech and Language Technology
Audience: General
Format: Paperback
Language: English
Publisher: Springer-Verlag New York Inc.
Country of Publication: US
Dimensions (cm): 23.32 x 15.9 x 2.41
Weight (kg): 0.62