Data Mining & Statistical Analysis Using SQL : APRESSPOD - Robert P. Trueblood

Data Mining & Statistical Analysis Using SQL


Paperback Published: 18th September 2001
ISBN: 9781893115545
Number Of Pages: 250


This book is not just another theoretical text on statistics or data mining. Instead, it is designed for database administrators (DBAs) who want to buttress their understanding of statistics to support data mining and customer relationship management analytics, and who want to do so using Structured Query Language (SQL). Each chapter is independent and self-contained, with examples tailored to business applications. Each analysis technique is expressed in a mathematical format that lends itself to coding either as a database query or as a Visual Basic procedure using SQL.

Each chapter includes: formulas (how to perform the required analysis), a numerical example using data from a database, data visualization and presentation options (graphs, charts, tables), SQL procedures for extracting the desired results, and data mining techniques.

About the authors:

Robert P. Trueblood is an analyst at QuantiTech, Inc. He has a Ph.D. in computer science and applications and taught at the university level for 18 years before going into private industry. He enjoys designing and implementing systems in Visual Basic.

John N. Lovett, Jr. is a senior engineering consultant at QuantiTech, Inc. and co-owner, with his anthropologist/archaeologist wife Jane, of Falls Mill and Museum in Belvedere, TN. He has a Ph.D. in industrial engineering, with a BS in mathematics and an MS in operations research.

Dedication, p. iii
About the Authors, p. iv
Introduction, p. xi
Acknowledgments, p. xvi
Basic Statistical Principles and Diagnostic Tree, p. 1
Categories of Data, p. 2
Sampling Methods, p. 2
Diagnostic Tree, p. 5
SQL Data Extraction Examples, p. 7
Measures of Central Tendency and Dispersion, p. 9
Measures of Central Tendency, p. 10
Mean, p. 10
Median, p. 12
Mode, p. 16
Geometric Mean, p. 17
Weighted Mean, p. 20
Measures of Dispersion, p. 22
Histogram Construction, p. 22
Range, p. 32
Standard Deviation, p. 33
Conclusion, p. 40
Goodness of Fit, p. 41
Tests of Hypothesis, p. 43
Goodness of Fit Test, p. 46
Fitting a Normal Distribution to Observed Data, p. 47
Fitting a Poisson Distribution to Observed Data, p. 62
Fitting an Exponential Distribution to Observed Data, p. 68
Conclusion, p. 71
T-SQL Source Code, p. 72
Make_Intervals, p. 73
Combine_Intervals, p. 74
Compare_Observed_And_Expected, p. 80
Procedure Calls, p. 82
Additional Tests of Hypothesis, p. 85
Comparing a Single Mean to a Specified Value, p. 88
Comparing Means and Variances of Two Samples, p. 94
Comparisons of More Than Two Samples, p. 101
Conclusion, p. 104
T-SQL Source Code, p. 105
Calculate_T_Statistic, p. 105
Calculate_Z_Statistic, p. 107
Compare_Means_2_Samples, p. 108
Contingency_Test, p. 113
Procedure Calls, p. 117
Curve Fitting, p. 119
Linear Regression in Two Variables, p. 121
Linear Correlation in Two Variables, p. 127
Polynomial Regression in Two Variables, p. 130
Other Nonlinear Regression Models, p. 136
Linear Regression in More Than Two Variables, p. 141
Conclusion, p. 147
T-SQL Source Code, p. 147
Linear_Regression_2_Variables, p. 148
Gaussian_Elimination, p. 150
Array_2D, p. 158
Polynomial_Regression, p. 160
Exponential_Model, p. 169
Multiple_Linear_Regression, p. 172
Procedure Calls, p. 179
Control Charting, p. 181
Common and Special Causes of Variation, p. 183
Dissecting the Control Chart, p. 193
Control Charts for Sample Range and Mean Values, p. 195
Control Chart for Fraction Nonconforming, p. 206
Control Chart for Number of Nonconformities, p. 213
Conclusion, p. 215
T-SQL Source Code, p. 216
Sample_Range_and_Mean_Charts, p. 216
Standard_P_Chart, p. 219
Stabilized_P_Chart, p. 222
C_Chart, p. 224
Procedure Calls, p. 227
Analysis of Experimental Designs, p. 229
One-Way ANOVA, p. 231
Two-Way ANOVA, p. 238
ANOVA Involving Three Factors, p. 245
Conclusion, p. 260
T-SQL Source Code, p. 261
ANOVA, p. 261
Procedure Calls, p. 275
Time Series Analysis, p. 277
Simple Moving Average, p. 278
Single Exponential Smoothing, p. 286
Double Exponential Smoothing, p. 292
Incorporating Seasonal Influences, p. 300
Criteria for Selecting the Most Appropriate Forecasting Technique, p. 308
Conclusion, p. 311
T-SQL Source Code, p. 311
Simple Moving Average, p. 312
Weighted Moving Average, p. 315
Single Exponential Smoothing, p. 318
Double Exponential Smoothing, p. 322
Seasonal Adjustment, p. 327
Procedure Calls, p. 332
Overview of Relational Database Structure and SQL, p. 337
Statistical Tables, p. 359
Tables of Statistical Distributions and Their Characteristics, p. 373
Visual Basic Routines, p. 381
Bibliography, p. 397
Index, p. 399
Table of Contents provided by Syndetics. All Rights Reserved.

ISBN-10: 1893115542
Audience: General
Format: Paperback
Language: English
Published: 18th September 2001
Publisher: Apress
Country of Publication: US
Dimensions (cm): 23.65 x 19.0 x 2.41
Weight (kg): 0.77