| Introduction to Data Quality | p. 1 |
| Why Data Quality is Relevant | p. 1 |
| Introduction to the Concept of Data Quality | p. 4 |
| Data Quality and Types of Data | p. 6 |
| Data Quality and Types of Information Systems | p. 9 |
| Main Research Issues and Application Domains in Data Quality | p. 11 |
| Research Issues in Data Quality | p. 12 |
| Application Domains in Data Quality | p. 12 |
| Research Areas Related to Data Quality | p. 16 |
| Summary | p. 17 |
| Data Quality Dimensions | p. 19 |
| Accuracy | p. 20 |
| Completeness | p. 23 |
| Completeness of Relational Data | p. 24 |
| Completeness of Web Data | p. 27 |
| Time-Related Dimensions: Currency, Timeliness, and Volatility | p. 28 |
| Consistency | p. 30 |
| Integrity Constraints | p. 30 |
| Data Edits | p. 31 |
| Other Data Quality Dimensions | p. 32 |
| Accessibility | p. 34 |
| Quality of Information Sources | p. 35 |
| Approaches to the Definition of Data Quality Dimensions | p. 36 |
| Theoretical Approach | p. 36 |
| Empirical Approach | p. 38 |
| Intuitive Approach | p. 39 |
| A Comparative Analysis of the Dimension Definitions | p. 39 |
| Trade-offs Between Dimensions | p. 40 |
| Schema Quality Dimensions | p. 42 |
| Readability | p. 45 |
| Normalization | p. 45 |
| Summary | p. 48 |
| Models for Data Quality | p. 51 |
| Introduction | p. 51 |
| Extensions of Structured Data Models | p. 52 |
| Conceptual Models | p. 52 |
| Logical Models for Data Description | p. 54 |
| The Polygen Model for Data Manipulation | p. 55 |
| Data Provenance | p. 56 |
| Extensions of Semistructured Data Models | p. 59 |
| Management Information System Models | p. 61 |
| Models for Process Description: The IP-MAP Model | p. 61 |
| Extensions of IP-MAP | p. 62 |
| Data Models | p. 64 |
| Summary | p. 68 |
| Activities and Techniques for Data Quality: Generalities | p. 69 |
| Data Quality Activities | p. 70 |
| Quality Composition | p. 71 |
| Models and Assumptions | p. 74 |
| Dimensions | p. 76 |
| Accuracy | p. 78 |
| Completeness | p. 79 |
| Error Localization and Correction | p. 82 |
| Localize and Correct Inconsistencies | p. 82 |
| Incomplete Data | p. 85 |
| Discovering Outliers | p. 86 |
| Cost and Benefit Classifications | p. 88 |
| Cost Classifications | p. 89 |
| Benefits Classification | p. 94 |
| Summary | p. 95 |
| Object Identification | p. 97 |
| Historical Perspective | p. 98 |
| Object Identification for Different Data Types | p. 99 |
| The High-Level Process for Object Identification | p. 101 |
| Details on the Steps for Object Identification | p. 103 |
| Preprocessing | p. 103 |
| Search Space Reduction | p. 104 |
| Comparison Functions | p. 104 |
| Object Identification Techniques | p. 106 |
| Probabilistic Techniques | p. 106 |
| The Fellegi and Sunter Theory and Extensions | p. 107 |
| A Cost-Based Probabilistic Technique | p. 112 |
| Empirical Techniques | p. 113 |
| Sorted Neighborhood Method and Extensions | p. 113 |
| The Priority Queue Algorithm | p. 116 |
| A Technique for Complex Structured Data: Delphi | p. 117 |
| XML Duplicate Detection: DogmatiX | p. 119 |
| Other Empirical Methods | p. 120 |
| Knowledge-Based Techniques | p. 121 |
| A Rule-Based Approach: Intelliclean | p. 122 |
| Learning Methods for Decision Rules: Atlas | p. 123 |
| Comparison of Techniques | p. 125 |
| Metrics | p. 125 |
| Search Space Reduction Methods | p. 127 |
| Comparison Functions | p. 127 |
| Decision Methods | p. 128 |
| Results | p. 130 |
| Summary | p. 131 |
| Data Quality Issues in Data Integration Systems | p. 133 |
| Introduction | p. 133 |
| Generalities on Data Integration Systems | p. 134 |
| Query Processing | p. 135 |
| Techniques for Quality-Driven Query Processing | p. 137 |
| The QP-alg: Quality-Driven Query Planning | p. 138 |
| DaQuinCIS Query Processing | p. 140 |
| Fusionplex Query Processing | p. 141 |
| Comparison of Quality-Driven Query Processing Techniques | p. 143 |
| Instance-Level Conflict Resolution | p. 143 |
| Classification of Instance-Level Conflicts | p. 144 |
| Overview of Techniques | p. 146 |
| Comparison of Instance-Level Conflict Resolution Techniques | p. 156 |
| Inconsistencies in Data Integration: A Theoretical Perspective | p. 157 |
| A Formal Framework for Data Integration | p. 157 |
| The Problem of Inconsistency | p. 158 |
| Summary | p. 160 |
| Methodologies for Data Quality Measurement and Improvement | p. 161 |
| Basics on Data Quality Methodologies | p. 161 |
| Inputs and Outputs | p. 161 |
| Classification of Methodologies | p. 164 |
| Comparison Among Data-Driven and Process-Driven Strategies | p. 164 |
| Assessment Methodologies | p. 167 |
| Comparative Analysis of General-Purpose Methodologies | p. 170 |
| Basic Common Phases Among Methodologies | p. 171 |
| The TDQM Methodology | p. 172 |
| The TQdM Methodology | p. 174 |
| The Istat Methodology | p. 177 |
| Comparisons of Methodologies | p. 180 |
| The CDQM Methodology | p. 181 |
| Reconstruct the State of Data | p. 182 |
| Reconstruct Business Processes | p. 183 |
| Reconstruct Macroprocesses and Rules | p. 183 |
| Check Problems with Users | p. 184 |
| Measure Data Quality | p. 184 |
| Set New Target DQ Levels | p. 185 |
| Choose Improvement Activities | p. 186 |
| Choose Techniques for Data Activities | p. 187 |
| Find Improvement Processes | p. 187 |
| Choose the Optimal Improvement Process | p. 188 |
| A Case Study in the e-Government Area | p. 188 |
| Summary | p. 199 |
| Tools for Data Quality | p. 201 |
| Introduction | p. 201 |
| Tools | p. 202 |
| Potter's Wheel | p. 203 |
| Telcordia's Tool | p. 205 |
| Ajax | p. 206 |
| Arktos | p. 208 |
| Choice Maker | p. 210 |
| Frameworks for Cooperative Information Systems | p. 212 |
| DaQuinCIS Framework | p. 212 |
| Fusionplex Framework | p. 215 |
| Toolboxes to Compare Tools | p. 216 |
| Theoretical Approach | p. 216 |
| Tailor | p. 217 |
| Summary | p. 218 |
| Open Problems | p. 221 |
| Dimensions and Metrics | p. 221 |
| Object Identification | p. 222 |
| XML Object Identification | p. 223 |
| Object Identification of Personal Information | p. 224 |
| Record Linkage and Privacy | p. 225 |
| Data Integration | p. 227 |
| Trust-Aware Query Processing in P2P Contexts | p. 227 |
| Cost-Driven Query Processing | p. 228 |
| Methodologies | p. 230 |
| Conclusions | p. 235 |
| References | p. 237 |
| Index | p. 249 |