| Preface | p. V |
| Point Estimation Algorithms | p. 1 |
| Introduction | p. 1 |
| Motivation | p. 2 |
| Methods of Point Estimation | p. 2 |
| The Method of Moments | p. 2 |
| Maximum Likelihood Estimation | p. 4 |
| The Expectation-Maximization Algorithm | p. 6 |
| Measures of Performance | p. 8 |
| Bias | p. 9 |
| Mean Squared Error | p. 9 |
| Standard Error | p. 10 |
| Efficiency | p. 10 |
| Consistency | p. 11 |
| The Jackknife Method | p. 11 |
| Summary | p. 13 |
| Applications of Bayes Theorem | p. 15 |
| Introduction | p. 15 |
| Motivation | p. 16 |
| The Bayes Approach for Classification | p. 17 |
| Statistical Framework for Classification | p. 17 |
| Bayesian Methodology | p. 20 |
| Examples | p. 22 |
| Example 1: Numerical Methods | p. 22 |
| Example 2: Bayesian Networks | p. 24 |
| Summary | p. 25 |
| Similarity Measures | p. 27 |
| Introduction | p. 27 |
| Motivation | p. 28 |
| Classic Similarity Measures | p. 28 |
| Dice | p. 30 |
| Overlap | p. 30 |
| Jaccard | p. 31 |
| Asymmetric | p. 31 |
| Cosine | p. 31 |
| Other Measures | p. 32 |
| Dissimilarity | p. 32 |
| Example | p. 33 |
| Current Applications | p. 35 |
| Multi-Dimensional Modeling | p. 35 |
| Hierarchical Clustering | p. 36 |
| Bioinformatics | p. 37 |
| Summary | p. 38 |
| Decision Trees | p. 39 |
| Introduction | p. 39 |
| Motivation | p. 41 |
| Decision Tree Algorithms | p. 42 |
| ID3 Algorithm | p. 43 |
| Evaluating Tests | p. 43 |
| Selection of Splitting Variable | p. 46 |
| Stopping Criteria | p. 46 |
| Tree Pruning | p. 47 |
| Stability of Decision Trees | p. 47 |
| Example: Classification of University Students | p. 48 |
| Applications of Decision Tree Algorithms | p. 49 |
| Summary | p. 50 |
| Genetic Algorithms | p. 53 |
| Introduction | p. 53 |
| Motivation | p. 54 |
| Fundamentals | p. 55 |
| Encoding Schema and Initialization | p. 56 |
| Fitness Evaluation | p. 57 |
| Selection | p. 58 |
| Crossover | p. 59 |
| Mutation | p. 61 |
| Iterative Evolution | p. 62 |
| Example: The Traveling-Salesman | p. 63 |
| Current and Future Applications | p. 65 |
| Summary | p. 66 |
| Classification: Distance-based Algorithms | p. 67 |
| Introduction | p. 67 |
| Motivation | p. 68 |
| Distance Functions | p. 68 |
| City Block Distance | p. 69 |
| Euclidean Distance | p. 70 |
| Tangent Distance | p. 70 |
| Other Distances | p. 71 |
| Classification Algorithms | p. 72 |
| A Simple Approach Using Mean Vector | p. 72 |
| K-Nearest Neighbors | p. 74 |
| Current Applications | p. 76 |
| Summary | p. 77 |
| Decision Tree-based Algorithms | p. 79 |
| Introduction | p. 79 |
| Motivation | p. 80 |
| ID3 | p. 80 |
| C4.5 | p. 82 |
| C5.0 | p. 83 |
| CART | p. 84 |
| Summary | p. 85 |
| Covering (Rule-based) Algorithms | p. 87 |
| Introduction | p. 87 |
| Motivation | p. 88 |
| Classification Rules | p. 88 |
| Covering (Rule-based) Algorithms | p. 90 |
| 1R Algorithm | p. 91 |
| PRISM Algorithm | p. 94 |
| Other Algorithms | p. 96 |
| Applications of Covering Algorithms | p. 97 |
| Summary | p. 97 |
| Clustering: An Overview | p. 99 |
| Introduction | p. 99 |
| Motivation | p. 100 |
| The Clustering Process | p. 100 |
| Pattern Representation | p. 101 |
| Pattern Proximity Measures | p. 102 |
| Clustering Algorithms | p. 103 |
| Hierarchical Algorithms | p. 103 |
| Partitional Algorithms | p. 105 |
| Data Abstraction | p. 105 |
| Cluster Assessment | p. 105 |
| Current Applications | p. 107 |
| Summary | p. 107 |
| Clustering: Hierarchical Algorithms | p. 109 |
| Introduction | p. 109 |
| Motivation | p. 110 |
| Agglomerative Hierarchical Algorithms | p. 111 |
| The Single Linkage Method | p. 112 |
| The Complete Linkage Method | p. 114 |
| The Average Linkage Method | p. 116 |
| The Centroid Method | p. 116 |
| The Ward Method | p. 117 |
| Divisive Hierarchical Algorithms | p. 118 |
| Summary | p. 120 |
| Clustering: Partitional Algorithms | p. 121 |
| Introduction | p. 121 |
| Motivation | p. 122 |
| Partitional Clustering Algorithms | p. 122 |
| Squared Error Clustering | p. 122 |
| Nearest Neighbor Clustering | p. 126 |
| Partitioning Around Medoids | p. 127 |
| Self-Organizing Maps | p. 131 |
| Current Applications | p. 132 |
| Summary | p. 132 |
| Clustering: Large Databases | p. 133 |
| Introduction | p. 133 |
| Motivation | p. 134 |
| Requirements for Scalable Clustering | p. 134 |
| Major Approaches to Scalable Clustering | p. 135 |
| The Divide-and-Conquer Approach | p. 135 |
| Incremental Clustering Approach | p. 135 |
| Parallel Approach to Clustering | p. 136 |
| BIRCH | p. 137 |
| DBSCAN | p. 139 |
| CURE | p. 140 |
| Summary | p. 141 |
| Clustering: Categorical Attributes | p. 143 |
| Introduction | p. 143 |
| Motivation | p. 144 |
| ROCK Clustering Algorithm | p. 145 |
| Computation of Links | p. 146 |
| Goodness Measure | p. 147 |
| Miscellaneous Issues | p. 148 |
| Example | p. 148 |
| COOLCAT Clustering Algorithm | p. 149 |
| CACTUS Clustering Algorithm | p. 151 |
| Summary | p. 152 |
| Association Rules: An Overview | p. 153 |
| Introduction | p. 153 |
| Motivation | p. 154 |
| Association Rule Process | p. 154 |
| Terminology and Notation | p. 154 |
| From Data to Association Rules | p. 157 |
| Large Itemset Discovery Algorithms | p. 158 |
| Apriori | p. 158 |
| Sampling | p. 160 |
| Partitioning | p. 162 |
| Summary | p. 163 |
| Association Rules: Parallel and Distributed Algorithms | p. 169 |
| Introduction | p. 169 |
| Motivation | p. 170 |
| Parallel and Distributed Algorithms | p. 171 |
| Data Parallel Algorithms on Distributed Memory Systems | p. 172 |
| Count Distribution (CD) | p. 172 |
| Task Parallel Algorithms on Distributed Memory Systems | p. 174 |
| Data Distribution (DD) | p. 174 |
| Candidate Distribution (CaD) | p. 174 |
| Intelligent Data Distribution (IDD) | p. 175 |
| Data Parallel Algorithms on Shared Memory Systems | p. 176 |
| Common Candidate Partitioned Database (CCPD) | p. 176 |
| Task Parallel Algorithms on Shared Memory Systems | p. 177 |
| Asynchronous Parallel Mining (APM) | p. 177 |
| Discussion of Parallel Algorithms | p. 177 |
| Summary | p. 179 |
| Association Rules: Advanced Techniques and Measures | p. 183 |
| Introduction | p. 183 |
| Motivation | p. 184 |
| Incremental Rules | p. 184 |
| Generalized Association Rules | p. 185 |
| Quantitative Association Rules | p. 187 |
| Correlation Rules | p. 188 |
| Measuring the Quality of Association Rules | p. 189 |
| Lift | p. 189 |
| Conviction | p. 189 |
| Chi-Squared Test | p. 190 |
| Summary | p. 191 |
| Spatial Mining: Techniques and Algorithms | p. 193 |
| Introduction and Motivation | p. 193 |
| Concept Hierarchies and Generalization | p. 194 |
| Spatial Rules | p. 196 |
| STING | p. 197 |
| Spatial Classification | p. 199 |
| ID3 Extension | p. 200 |
| Two-Step Method | p. 201 |
| Spatial Clustering | p. 202 |
| CLARANS | p. 202 |
| GDBSCAN | p. 203 |
| DBCLASD | p. 204 |
| Summary | p. 204 |
| References | p. 207 |
| Index | p. 219 |