| Foreword | p. xi |
| Preface | p. xiii |
| Acknowledgments | p. xv |
| Notation | p. xvii |
| Introduction | p. 1 |
| Challenges | p. 2 |
| Goals | p. 3 |
| Overview and Structure of the Argument | p. 4 |
| Theory | p. 4 |
| Methods | p. 5 |
| Algorithms | p. 6 |
| Summary | p. 6 |
| Text Classification | p. 7 |
| Learning Task | p. 7 |
| Binary Setting | p. 8 |
| Multi-Class Setting | p. 9 |
| Multi-Label Setting | p. 10 |
| Representing Text | p. 12 |
| Word Level | p. 13 |
| Sub-Word Level | p. 15 |
| Multi-Word Level | p. 15 |
| Semantic Level | p. 16 |
| Feature Selection | p. 16 |
| Feature Subset Selection | p. 17 |
| Feature Construction | p. 19 |
| Term Weighting | p. 20 |
| Conventional Learning Methods | p. 22 |
| Naive Bayes Classifier | p. 22 |
| Rocchio Algorithm | p. 24 |
| k-Nearest Neighbors | p. 25 |
| Decision Tree Classifier | p. 25 |
| Other Methods | p. 26 |
| Performance Measures | p. 27 |
| Error Rate and Asymmetric Cost | p. 28 |
| Precision and Recall | p. 29 |
| Precision/Recall Breakeven Point and Fβ-Measure | p. 30 |
| Micro- and Macro-Averaging | p. 30 |
| Experimental Setup | p. 31 |
| Test Collections | p. 31 |
| Design Choices | p. 32 |
| Support Vector Machines | p. 35 |
| Linear Hard-Margin SVMs | p. 36 |
| Soft-Margin SVMs | p. 39 |
| Non-Linear SVMs | p. 41 |
| Asymmetric Misclassification Cost | p. 43 |
| Other Maximum-Margin Methods | p. 43 |
| Further Work and Further Information | p. 44 |
| Part I: Theory | |
| A Statistical Learning Model of Text Classification for SVMs | p. 45 |
| Properties of Text-Classification Tasks | p. 46 |
| High-Dimensional Feature Space | p. 46 |
| Sparse Document Vectors | p. 47 |
| Heterogeneous Use of Terms | p. 47 |
| High Level of Redundancy | p. 48 |
| Frequency Distribution of Words and Zipf's Law | p. 49 |
| A Discriminative Model of Text Classification | p. 51 |
| Step 1: Bounding the Expected Error Based on the Margin | p. 51 |
| Step 2: Homogeneous TCat-Concepts as a Model of Text-Classification Tasks | p. 53 |
| Step 3: Learnability of TCat-Concepts | p. 59 |
| Comparing the Theoretical Model with Experimental Results | p. 64 |
| Sensitivity Analysis: Difficult and Easy Learning Tasks | p. 66 |
| Influence of Occurrence Frequency | p. 66 |
| Discriminative Power of Term Sets | p. 68 |
| Level of Redundancy | p. 68 |
| Noisy TCat-Concepts | p. 69 |
| Limitations of the Model and Open Questions | p. 72 |
| Related Work | p. 72 |
| Summary and Conclusions | p. 74 |
| Efficient Performance Estimators for SVMs | p. 75 |
| Generic Performance Estimators | p. 76 |
| Training Error | p. 76 |
| Hold-Out Testing | p. 77 |
| Bootstrap and Jackknife | p. 78 |
| Cross-Validation and Leave-One-Out | p. 79 |
| ξα-Estimators | p. 81 |
| Error Rate | p. 82 |
| Recall, Precision, and F₁ | p. 89 |
| Fast Leave-One-Out Estimation | p. 93 |
| Experiments | p. 94 |
| How Large Are Bias and Variance of the ξα-Estimators? | p. 95 |
| What Is the Influence of the Training Set Size? | p. 99 |
| How Large Is the Efficiency Improvement for Exact Leave-One-Out? | p. 101 |
| Summary and Conclusions | p. 102 |
| Part II: Methods | |
| Inductive Text Classification | p. 103 |
| Learning Task | p. 104 |
| Automatic Model and Parameter Selection | p. 105 |
| Leave-One-Out Estimator of the PRBEP | p. 106 |
| ξα-Estimator of the PRBEP | p. 106 |
| Model-Selection Algorithm | p. 108 |
| Experiments | p. 108 |
| Word Weighting, Stemming and Stopword Removal | p. 108 |
| Trading Off Training Error vs. Complexity | p. 111 |
| Non-Linear Classification Rules | p. 113 |
| Comparison with Conventional Methods | p. 113 |
| Related Work | p. 116 |
| Summary and Conclusions | p. 117 |
| Transductive Text Classification | p. 119 |
| Learning Task | p. 120 |
| Transductive Support Vector Machines | p. 121 |
| What Makes TSVMs Well Suited for Text Classification? | p. 123 |
| An Intuitive Example | p. 123 |
| Transductive Learning of TCat-Concepts | p. 125 |
| Experiments | p. 127 |
| Constraints on the Transductive Hyperplane | p. 130 |
| Relation to Other Approaches Using Unlabeled Data | p. 133 |
| Probabilistic Approaches Using EM | p. 133 |
| Co-Training | p. 134 |
| Other Work on Transduction | p. 139 |
| Summary and Conclusions | p. 139 |
| Part III: Algorithms | |
| Training Inductive Support Vector Machines | p. 141 |
| Problem and Approach | p. 142 |
| General Decomposition Algorithm | p. 143 |
| Selecting a Good Working Set | p. 145 |
| Convergence | p. 145 |
| How to Compute the Working Set | p. 146 |
| Shrinking: Reducing the Number of Variables | p. 146 |
| Efficient Implementation | p. 148 |
| Termination Criteria | p. 148 |
| Computing the Gradient and the Termination Criteria Efficiently | p. 149 |
| What Are the Computational Resources Needed in Each Iteration? | p. 150 |
| Caching Kernel Evaluations | p. 151 |
| How to Solve the QP on the Working Set | p. 152 |
| Related Work | p. 152 |
| Experiments | p. 154 |
| Training Times for Reuters, WebKB, and Ohsumed | p. 154 |
| How Does Training Time Scale with the Number of Training Examples? | p. 154 |
| What Is the Influence of the Working-Set-Selection Strategy? | p. 160 |
| What Is the Influence of Caching? | p. 161 |
| What Is the Influence of Shrinking? | p. 161 |
| Summary and Conclusions | p. 162 |
| Training Transductive Support Vector Machines | p. 163 |
| Problem and Approach | p. 163 |
| The TSVM Algorithm | p. 165 |
| Analysis of the Algorithm | p. 166 |
| How Does the Algorithm Work? | p. 166 |
| Convergence | p. 168 |
| Experiments | p. 169 |
| Does the Algorithm Effectively Maximize Margin? | p. 169 |
| Training Times for Reuters, WebKB, and Ohsumed | p. 170 |
| How Does Training Time Scale with the Number of Training Examples? | p. 170 |
| How Does Training Time Scale with the Number of Test Examples? | p. 172 |
| Related Work | p. 172 |
| Summary and Conclusions | p. 174 |
| Conclusions | p. 175 |
| Open Questions | p. 177 |
| Bibliography | p. 180 |
| Appendices | p. 197 |
| SVM-Light Commands and Options | p. 197 |
| Index | p. 203 |