| List of Figures | p. xi |
| List of Tables | p. xv |
| Preface | p. xvii |
| An Introduction to Data Streams | p. 1 |
| Introduction | p. 1 |
| Stream Mining Algorithms | p. 2 |
| Conclusions and Summary | p. 6 |
| References | p. 7 |
| On Clustering Massive Data Streams: A Summarization Paradigm | p. 9 |
| Introduction | p. 10 |
| The Micro-clustering Based Stream Mining Framework | p. 12 |
| Clustering Evolving Data Streams: A Micro-clustering Approach | p. 17 |
| Micro-clustering Challenges | p. 18 |
| Online Micro-cluster Maintenance: The CluStream Algorithm | p. 19 |
| High Dimensional Projected Stream Clustering | p. 22 |
| Classification of Data Streams: A Micro-clustering Approach | p. 23 |
| On-Demand Stream Classification | p. 24 |
| Other Applications of Micro-clustering and Research Directions | p. 26 |
| Performance Study and Experimental Results | p. 27 |
| Discussion | p. 36 |
| References | p. 36 |
| A Survey of Classification Methods in Data Streams | p. 39 |
| Introduction | p. 39 |
| Research Issues | p. 41 |
| Solution Approaches | p. 43 |
| Classification Techniques | p. 44 |
| Ensemble Based Classification | p. 45 |
| Very Fast Decision Trees (VFDT) | p. 46 |
| On Demand Classification | p. 48 |
| Online Information Network (OLIN) | p. 48 |
| LWClass Algorithm | p. 49 |
| ANNCAD Algorithm | p. 51 |
| SCALLOP Algorithm | p. 51 |
| Summary | p. 52 |
| References | p. 53 |
| Frequent Pattern Mining in Data Streams | p. 61 |
| Introduction | p. 61 |
| Overview | p. 62 |
| New Algorithm | p. 67 |
| Work on Other Related Problems | p. 79 |
| Conclusions and Future Directions | p. 80 |
| References | p. 81 |
| A Survey of Change Diagnosis Algorithms in Evolving Data Streams | p. 85 |
| Introduction | p. 86 |
| The Velocity Density Method | p. 88 |
| Spatial Velocity Profiles | p. 93 |
| Evolution Computations in High Dimensional Case | p. 95 |
| On the Use of Clustering for Characterizing Stream Evolution | p. 96 |
| On the Effect of Evolution in Data Mining Algorithms | p. 97 |
| Conclusions | p. 100 |
| References | p. 101 |
| Multi-Dimensional Analysis of Data Streams Using Stream Cubes | p. 103 |
| Introduction | p. 104 |
| Problem Definition | p. 106 |
| Architecture for On-line Analysis of Data Streams | p. 108 |
| Tilted time frame | p. 108 |
| Critical layers | p. 110 |
| Partial materialization of stream cube | p. 111 |
| Stream Data Cube Computation | p. 112 |
| Algorithms for cube computation | p. 115 |
| Performance Study | p. 117 |
| Related Work | p. 120 |
| Possible Extensions | p. 121 |
| Conclusions | p. 122 |
| References | p. 123 |
| Load Shedding in Data Stream Systems | p. 127 |
| Load Shedding for Aggregation Queries | p. 128 |
| Problem Formulation | p. 129 |
| Load Shedding Algorithm | p. 133 |
| Extensions | p. 141 |
| Load Shedding in Aurora | p. 142 |
| Load Shedding for Sliding Window Joins | p. 144 |
| Load Shedding for Classification Queries | p. 145 |
| Summary | p. 146 |
| References | p. 146 |
| The Sliding-Window Computation Model and Results | p. 149 |
| Motivation and Road Map | p. 150 |
| A Solution to the BasicCounting Problem | p. 152 |
| The Approximation Scheme | p. 154 |
| Space Lower Bound for BasicCounting Problem | p. 157 |
| Beyond 0's and 1's | p. 158 |
| References and Related Work | p. 163 |
| Conclusion | p. 164 |
| References | p. 166 |
| A Survey of Synopsis Construction in Data Streams | p. 169 |
| Introduction | p. 169 |
| Sampling Methods | p. 172 |
| Random Sampling with a Reservoir | p. 174 |
| Concise Sampling | p. 176 |
| Wavelets | p. 177 |
| Recent Research on Wavelet Decomposition in Data Streams | p. 182 |
| Sketches | p. 184 |
| Fixed Window Sketches for Massive Time Series | p. 185 |
| Variable Window Sketches of Massive Time Series | p. 185 |
| Sketches and their applications in Data Streams | p. 186 |
| Sketches with p-stable distributions | p. 190 |
| The Count-Min Sketch | p. 191 |
| Related Counting Methods: Hash Functions for Determining Distinct Elements | p. 193 |
| Advantages and Limitations of Sketch Based Methods | p. 194 |
| Histograms | p. 196 |
| One Pass Construction of Equi-depth Histograms | p. 198 |
| Constructing V-Optimal Histograms | p. 198 |
| Wavelet Based Histograms for Query Answering | p. 199 |
| Sketch Based Methods for Multi-dimensional Histograms | p. 200 |
| Discussion and Challenges | p. 200 |
| References | p. 202 |
| A Survey of Join Processing in Data Streams | p. 209 |
| Introduction | p. 209 |
| Model and Semantics | p. 210 |
| State Management for Stream Joins | p. 213 |
| Exploiting Constraints | p. 214 |
| Exploiting Statistical Properties | p. 216 |
| Fundamental Algorithms for Stream Join Processing | p. 225 |
| Optimizing Stream Joins | p. 227 |
| Conclusion | p. 230 |
| Acknowledgments | p. 232 |
| References | p. 232 |
| Indexing and Querying Data Streams | p. 237 |
| Introduction | p. 238 |
| Indexing Streams | p. 239 |
| Preliminaries and definitions | p. 239 |
| Feature extraction | p. 240 |
| Index maintenance | p. 244 |
| Discrete Wavelet Transform | p. 246 |
| Querying Streams | p. 248 |
| Monitoring an aggregate query | p. 248 |
| Monitoring a pattern query | p. 251 |
| Monitoring a correlation query | p. 252 |
| Related Work | p. 254 |
| Future Directions | p. 255 |
| Distributed monitoring systems | p. 255 |
| Probabilistic modeling of sensor networks | p. 256 |
| Content distribution networks | p. 256 |
| Chapter Summary | p. 257 |
| References | p. 257 |
| Dimensionality Reduction and Forecasting on Streams | p. 261 |
| Related work | p. 264 |
| Principal component analysis (PCA) | p. 265 |
| Auto-regressive models and recursive least squares | p. 267 |
| MUSCLES | p. 269 |
| Tracking correlations and hidden variables: SPIRIT | p. 271 |
| Putting SPIRIT to work | p. 276 |
| Experimental case studies | p. 278 |
| Performance and accuracy | p. 283 |
| Conclusion | p. 286 |
| Acknowledgments | p. 286 |
| References | p. 287 |
| A Survey of Distributed Mining of Data Streams | p. 289 |
| Introduction | p. 289 |
| Outlier and Anomaly Detection | p. 291 |
| Clustering | p. 295 |
| Frequent itemset mining | p. 296 |
| Classification | p. 297 |
| Summarization | p. 298 |
| Mining Distributed Data Streams in Resource Constrained Environments | p. 299 |
| Systems Support | p. 300 |
| References | p. 304 |
| Algorithms for Distributed Data Stream Mining | p. 309 |
| Introduction | p. 310 |
| Motivation: Why Distributed Data Stream Mining? | p. 311 |
| Existing Distributed Data Stream Mining Algorithms | p. 312 |
| A local algorithm for distributed data stream mining | p. 315 |
| Local Algorithms: definition | p. 315 |
| Algorithm details | p. 316 |
| Experimental results | p. 318 |
| Modifications and extensions | p. 320 |
| Bayesian Network Learning from Distributed Data Streams | p. 321 |
| Distributed Bayesian Network Learning Algorithm | p. 322 |
| Selection of samples for transmission to global site | p. 323 |
| Online Distributed Bayesian Network Learning | p. 324 |
| Experimental Results | p. 326 |
| Conclusion | p. 326 |
| References | p. 329 |
| A Survey of Stream Processing Problems and Techniques in Sensor Networks | p. 333 |
| Challenges | p. 334 |
| The Data Collection Model | p. 335 |
| Data Communication | p. 335 |
| Query Processing | p. 337 |
| Aggregate Queries | p. 338 |
| Join Queries | p. 340 |
| Top-k Monitoring | p. 341 |
| Continuous Queries | p. 341 |
| Compression and Modeling | p. 342 |
| Data Distribution Modeling | p. 343 |
| Outlier Detection | p. 344 |
| Application: Tracking of Objects using Sensor Networks | p. 345 |
| Summary | p. 347 |
| References | p. 348 |
| Index | p. 353 |