| Introduction | |
| Inductive Databases and Constraint-based Data Mining: Introduction and Overview | p. 3 |
| Inductive Databases | p. 3 |
| Constraint-based Data Mining | p. 7 |
| Types of Constraints | p. 9 |
| Functions Used in Constraints | p. 12 |
| KDD Scenarios | p. 14 |
| A Brief Review of Literature Resources | p. 15 |
| The IQ (Inductive Queries for Mining Patterns and Models) Project | p. 17 |
| What's in this Book | p. 22 |
| Representing Entities in the OntoDM Data Mining Ontology | p. 27 |
| Introduction | p. 27 |
| Design Principles for the OntoDM ontology | p. 29 |
| OntoDM Structure and Implementation | p. 33 |
| Identification of Data Mining Entities | p. 38 |
| Representing Data Mining Entities in OntoDM | |
| Related Work | p. 52 |
| Conclusion | p. 54 |
| A Practical Comparative Study of Data Mining Query Languages | p. 59 |
| Introduction | p. 60 |
| Data Mining Tasks | p. 61 |
| Comparison of Data Mining Query Languages | p. 62 |
| Summary of the Results | p. 74 |
| Conclusions | p. 76 |
| A Theory of Inductive Query Answering | p. 79 |
| Introduction | p. 80 |
| Boolean Inductive Queries | p. 81 |
| Generalized Version Spaces | p. 88 |
| Query Decomposition | p. 90 |
| Normal Forms | p. 98 |
| Conclusions | p. 100 |
| Constraint-based Mining: Selected Techniques | |
| Generalizing Itemset Mining in a Constraint Programming Setting | p. 107 |
| Introduction | p. 107 |
| General Concepts | p. 109 |
| Specialized Approaches | p. 111 |
| A Generalized Algorithm | p. 114 |
| A Dedicated Solver | p. 116 |
| Using Constraint Programming Systems | p. 120 |
| Conclusions | p. 124 |
| From Local Patterns to Classification Models | p. 127 |
| Introduction | p. 127 |
| Preliminaries | p. 131 |
| Correlated Patterns | p. 132 |
| Finding Pattern Sets | p. 137 |
| Direct Predictions from Patterns | p. 142 |
| Integrated Pattern Mining | p. 146 |
| Conclusions | p. 152 |
| Constrained Predictive Clustering | p. 155 |
| Introduction | p. 155 |
| Predictive Clustering Trees | p. 156 |
| Constrained Predictive Clustering Trees and Constraint Types | p. 161 |
| A Search Space of (Predictive) Clustering Trees | p. 165 |
| Algorithms for Enforcing Constraints | p. 167 |
| Conclusion | p. 173 |
| Finding Segmentations of Sequences | p. 177 |
| Introduction | p. 177 |
| Efficient Algorithms for Segmentation | p. 182 |
| Dimensionality Reduction | p. 183 |
| Recurrent Models | p. 185 |
| Unimodal Segmentation | p. 188 |
| Rearranging the Input Data Points | p. 189 |
| Aggregate Segmentation | p. 190 |
| Evaluating the Quality of a Segmentation: Randomization | p. 191 |
| Model Selection by BIC and Cross-validation | p. 193 |
| Bursty Sequences | p. 193 |
| Conclusion | p. 194 |
| Mining Constrained Cross-Graph Cliques in Dynamic Networks | p. 199 |
| Introduction | p. 199 |
| Problem Setting | p. 201 |
| DATA-PEELER | p. 205 |
| Extracting δ-Contiguous Closed 3-Sets | p. 208 |
| Constraining the Enumeration to Extract 3-Cliques | p. 212 |
| Experimental Results | p. 217 |
| Related Work | p. 224 |
| Conclusion | p. 226 |
| Probabilistic Inductive Querying Using ProbLog | p. 229 |
| Introduction | p. 229 |
| ProbLog: Probabilistic Prolog | p. 233 |
| Probabilistic Inference | p. 234 |
| Implementation | p. 238 |
| Probabilistic Explanation Based Learning | p. 243 |
| Local Pattern Mining | p. 245 |
| Theory Compression | p. 249 |
| Parameter Estimation | p. 252 |
| Application | p. 255 |
| Related Work in Statistical Relational Learning | p. 258 |
| Conclusions | p. 259 |
| Inductive Databases: Integration Approaches | |
| Inductive Querying with Virtual Mining Views | p. 265 |
| Introduction | p. 266 |
| The Mining Views Framework | p. 267 |
| An Illustrative Scenario | p. 277 |
| Conclusions and Future Work | p. 285 |
| SINDBAD and SiQL: Overview, Applications and Future Developments | p. 289 |
| Introduction | p. 289 |
| SiQL | p. 291 |
| Example Applications | p. 296 |
| A Web Service Interface for SINDBAD | p. 303 |
| Future Developments | p. 305 |
| Conclusion | p. 307 |
| Patterns on Queries | p. 311 |
| Introduction | p. 311 |
| Preliminaries | p. 313 |
| Frequent Item Set Mining | p. 319 |
| Transforming KRIMP | p. 323 |
| Comparing the two Approaches | p. 331 |
| Conclusions and Prospects for Further Research | p. 333 |
| Experiment Databases | p. 335 |
| Introduction | p. 336 |
| Motivation | p. 337 |
| Related Work | p. 341 |
| A Pilot Experiment Database | p. 343 |
| Learning from the Past | p. 350 |
| Conclusions | p. 358 |
| Applications | |
| Predicting Gene Function using Predictive Clustering Trees | p. 365 |
| Introduction | p. 366 |
| Related Work | p. 367 |
| Predictive Clustering Tree Approaches for HMC | p. 369 |
| Evaluation Measure | p. 374 |
| Datasets | p. 375 |
| Comparison of Clus-HMC/SC/HSC | p. 378 |
| Comparison of (Ensembles of) CLUS-HMC to State-of-the-art Methods | p. 380 |
| Conclusions | p. 384 |
| Analyzing Gene Expression Data with Predictive Clustering Trees | p. 389 |
| Introduction | p. 389 |
| Datasets | p. 391 |
| Predicting Multiple Clinical Parameters | p. 392 |
| Evaluating Gene Importance with Ensembles of PCTs | p. 394 |
| Constrained Clustering of Gene Expression Data | p. 397 |
| Clustering Gene Expression Time Series Data | p. 400 |
| Conclusions | p. 403 |
| Using a Solver Over the String Pattern Domain to Analyze Gene Promoter Sequences | p. 407 |
| Introduction | p. 407 |
| A Promoter Sequence Analysis Scenario | p. 409 |
| The Marguerite Solver | p. 412 |
| Tuning the Extraction Parameters | p. 413 |
| An Objective Interestingness Measure | p. 415 |
| Execution of the Scenario | p. 418 |
| Conclusion | p. 422 |
| Inductive Queries for a Drug Designing Robot Scientist | p. 425 |
| Introduction | p. 425 |
| The Robot Scientist Eve | p. 427 |
| Representations of Molecular Data | p. 430 |
| Selecting Compounds for a Drug Screening Library | p. 444 |
| Active learning | p. 446 |
| Conclusions | p. 448 |
| Appendix | p. 452 |
| Author index | p. 455 |