| Introduction | p. 1 |
| What Is Data Mining? | p. 3 |
| Some More Real-World Applications | p. 3 |
| Data Mining Methods - An Overview | p. 6 |
| Basic Problem Types | p. 6 |
| Prediction | p. 6 |
| Classification | p. 6 |
| Regression | p. 7 |
| Knowledge Discovery | p. 7 |
| Deviation Detection | p. 7 |
| Cluster Analysis | p. 7 |
| Visualization | p. 8 |
| Association Rules | p. 8 |
| Segmentation | p. 8 |
| Data Mining Viewed from the Data Side | p. 9 |
| Types of Data | p. 10 |
| Conclusion | p. 11 |
| Data Preparation | p. 13 |
| Data Cleaning | p. 13 |
| Handling Outliers | p. 14 |
| Handling Noisy Data | p. 14 |
| Missing Values Handling | p. 16 |
| Coding | p. 16 |
| Recognition of Correlated or Redundant Attributes | p. 16 |
| Abstraction | p. 17 |
| Attribute Construction | p. 17 |
| Images | p. 17 |
| Time Series | p. 18 |
| Web Data | p. 19 |
| Conclusions | p. 22 |
| Methods for Data Mining | p. 23 |
| Decision Tree Induction | p. 23 |
| Basic Principle | p. 23 |
| Terminology of Decision Tree | p. 24 |
| Subtasks and Design Criteria for Decision Tree Induction | p. 25 |
| Attribute Selection Criteria | p. 28 |
| Information Gain Criteria and Gain Ratio | p. 29 |
| Gini Function | p. 30 |
| Discretization of Attribute Values | p. 31 |
| Binary Discretization | p. 32 |
| Multi-interval Discretization | p. 34 |
| Discretization of Categorical or Symbolical Attributes | p. 41 |
| Pruning | p. 42 |
| Overview | p. 43 |
| Cost-Complexity Pruning | p. 43 |
| Some General Remarks | p. 44 |
| Summary | p. 46 |
| Case-Based Reasoning | p. 46 |
| Background | p. 47 |
| The Case-Based Reasoning Process | p. 47 |
| CBR Maintenance | p. 48 |
| Knowledge Containers in a CBR System | p. 49 |
| Design Consideration | p. 50 |
| Similarity | p. 50 |
| Formalization of Similarity | p. 50 |
| Similarity Measures | p. 51 |
| Similarity Measures for Images | p. 51 |
| Case Description | p. 53 |
| Organization of Case Base | p. 53 |
| Learning in a CBR System | p. 55 |
| Learning of New Cases and Forgetting of Old Cases | p. 56 |
| Learning of Prototypes | p. 56 |
| Learning of Higher Order Constructs | p. 56 |
| Learning of Similarity | p. 56 |
| Conclusions | p. 57 |
| Clustering | p. 57 |
| Introduction | p. 57 |
| General Comments | p. 58 |
| Distance Measures for Metrical Data | p. 59 |
| Using Numerical Distance Measures for Categorical Data | p. 60 |
| Distance Measure for Nominal Data | p. 61 |
| Contrast Rule | p. 62 |
| Agglomerate Clustering Methods | p. 62 |
| Partitioning Clustering | p. 64 |
| Graphs Clustering | p. 64 |
| Similarity Measure for Graphs | p. 65 |
| Hierarchical Clustering of Graphs | p. 69 |
| Conclusion | p. 71 |
| Conceptual Clustering | p. 71 |
| Introduction | p. 71 |
| Concept Hierarchy and Concept Description | p. 71 |
| Category Utility Function | p. 72 |
| Algorithmic Properties | p. 73 |
| Algorithm | p. 73 |
| Conceptual Clustering of Graphs | p. 75 |
| Notion of a Case and Similarity Measure | p. 75 |
| Evaluation Function | p. 75 |
| Prototype Learning | p. 76 |
| An Example of a Learned Concept Hierarchy | p. 76 |
| Conclusion | p. 79 |
| Evaluation of the Model | p. 79 |
| Error Rate, Correctness, and Quality | p. 79 |
| Sensitivity and Specificity | p. 81 |
| Test-and-Train | p. 82 |
| Random Sampling | p. 82 |
| Cross Validation | p. 82 |
| Conclusion | p. 83 |
| Feature Subset Selection | p. 83 |
| Introduction | p. 83 |
| Feature Subset Selection Algorithms | p. 83 |
| The Wrapper and the Filter Model for Feature Subset Selection | p. 84 |
| Feature Selection Done by Decision Tree Induction | p. 85 |
| Feature Subset Selection Done by Clustering | p. 86 |
| Contextual Merit Algorithm | p. 87 |
| Floating Search Method | p. 88 |
| Conclusion | p. 88 |
| Applications | p. 91 |
| Controlling the Parameters of an Algorithm/Model by Case-Based Reasoning | p. 91 |
| Modelling Concerns | p. 91 |
| Case-Based Reasoning Unit | p. 92 |
| Management of the Case Base | p. 93 |
| Case Structure and Case Base | p. 94 |
| Non-image Information | p. 95 |
| Image Information | p. 96 |
| Image Similarity Determination | p. 97 |
| Image Similarity Measure 1 (ISim_1) | p. 97 |
| Image Similarity Measure 2 (ISim_2) | p. 98 |
| Comparison of ISim_1 and ISim_2 | p. 98 |
| Segmentation Algorithm and Segmentation Parameters | p. 99 |
| Similarity Determination | p. 100 |
| Overall Similarity | p. 100 |
| Similarity Measure for Non-image Information | p. 101 |
| Similarity Measure for Image Information | p. 101 |
| Knowledge Acquisition Aspect | p. 101 |
| Conclusion | p. 102 |
| Mining Images | p. 102 |
| Introduction | p. 102 |
| Preparing the Experiment | p. 103 |
| Image Mining Tool | p. 105 |
| The Application | p. 106 |
| Brainstorming and Image Catalogue | p. 107 |
| Interviewing Process | p. 107 |
| Setting Up the Automatic Image Analysis and Feature Extraction Procedure | p. 107 |
| Image Analysis | p. 108 |
| Feature Extraction | p. 109 |
| Collection of Image Descriptions into the Data Base | p. 111 |
| The Image Mining Experiment | p. 112 |
| Review | p. 113 |
| Using the Discovered Knowledge | p. 114 |
| Lessons Learned | p. 115 |
| Conclusions | p. 116 |
| Conclusion | p. 117 |
| Appendix | p. 119 |
| The IRIS Data Set | p. 119 |
| References | p. 121 |
| Index | p. 129 |