This unique compendium explains the basic concepts of data warehouses and data mining. The first part of the book introduces data warehouses, the need for them, and the key terminology used in this field. The second part covers data mining concepts and the data mining techniques used in various applications, and explains machine learning techniques in detail, with examples where they are essential.
The book is written in simple English and is user-friendly. Each chapter is built around several sample scenarios, with illustrations where necessary. Each chapter comprises the technical content, a summary, key points to remember, and a few case studies for class group discussion and problem-solving.
This volume benefits professionals, academicians, data analysts, the machine-learning community, and undergraduate and postgraduate students.
Contents:
- About the Authors
- Introduction:
  - Thrust Components
  - Operational Databases
  - OnLine Analytical Processing (OLAP)
  - OLTP (OnLine Transaction Processing)
  - Summary
- Basics of Data Operations:
  - Data Aggregation
  - Data Completeness
  - Data Compression and Data Conversion
  - Data Fragmentation
  - Data Flow Diagram
  - Data Dictionary & Data Dimension
  - Summary
- Data Warehousing Basic Concepts:
  - Bridging the Knowledge Gap between Operational Databases and Data Warehouses
  - Need for Data Warehouses
  - Subject-Oriented
  - Integrated
  - Non-Volatile
  - Time-Variant
  - Data Marts
  - Decision Support Systems
  - Executive Information Systems
  - Data Warehouse Tools
  - Summary
  - Important Project Related References
- Data Warehouse Architecture:
  - Design and Construction of Data Warehouses
  - Three-Tier Data Warehouse Architecture
  - Back-end Tools and Utilities
  - Metadata Repository
  - Evaluation of Data Warehouse Toolkits
  - Summary
- Designing Data Warehouses:
  - Logical Design Goals
  - Physical Design
  - Summary
- Partitioning and Parallelism in Data Warehouses:
  - Metrics for Partitioning
  - Partitioning Methods
  - Parallelism
  - Strategies to Improve Data Warehouse Performance
  - Summary
- Overview of Data Mining:
  - Introduction
  - Motivational aspect to Data Mining
  - Data Mining Functionalities
  - Summary
- Knowledge Representation and Knowledge Discovery:
  - Understanding Data
  - Knowledge Representation
  - Knowledge Discovery Process
  - Summary
- Data Mining Techniques:
  - Introduction
  - Key Data Mining Techniques
  - Classical Techniques
  - Next-Generation Techniques
  - Summary
- Machine Learning Using Classification:
  - Introduction to Machine Learning
  - Types of Classification
  - Outlier Analysis
  - Statistical-Based Algorithms
  - Bayesian Classifiers
  - Distance-Based Algorithms
  - Decision Tree-Based Algorithms
  - Neural-Network-Based Algorithms
  - Rule-Based Algorithms
  - Support Vector Machines
- Machine Learning Using Clustering:
  - Basic Definitions of Clustering
  - Overview of Clustering Methods
  - Partitioning Methods
  - Hierarchical Methods
  - Spectral Clustering
  - Density-Based Methods
  - The Single Link Method (SLINK)
  - The Complete Link Method (CLINK)
  - The Group Average Method
  - Clustering Validation
  - Grid-Based Methods
  - Comparison of Different Algorithms
  - Cluster Validation
  - Summary
- Association Rules:
  - Introduction
  - Market Basket Analysis
  - Apriori Algorithm
  - PCY (Park-Chen-Yu) Algorithm
  - Association Rule Mining
  - Summary
- Index
Readership: Researchers, professionals, academics and graduate students in machine learning, databases and data mining.