| Information Theory, Machine Learning, and Reproducing Kernel Hilbert Spaces | p. 1 |
| Introduction | p. 1 |
| Information Theory | p. 7 |
| Entropy | p. 10 |
| Mutual Information | p. 14 |
| Relative Entropy and Kullback-Leibler Divergence | p. 16 |
| Information Theory beyond Communications | p. 19 |
| Adaptive Model Building | p. 24 |
| Information-Theoretic Learning | p. 29 |
| ITL as a Unifying Learning Paradigm | p. 30 |
| Reproducing Kernel Hilbert Spaces | p. 34 |
| RKHS and ITL | p. 38 |
| Conclusions | p. 44 |
| Rényi's Entropy, Divergence and Their Nonparametric Estimators | p. 47 |
| Introduction | p. 47 |
| Definition and Interpretation of Rényi's Entropy | p. 48 |
| Quadratic Rényi's Entropy Estimator | p. 56 |
| Properties of Rényi's Nonparametric Entropy Estimators | p. 60 |
| Bias and Variance of the Information Potential Estimator | p. 70 |
| Physical Interpretation of Rényi's Entropy Kernel Estimators | p. 73 |
| Extension to α-Information Potential with Arbitrary Kernels | p. 79 |
| Rényi's Divergence and Mutual Information | p. 80 |
| Quadratic Divergences and Mutual Information | p. 84 |
| Information Potentials and Forces in the Joint Space | p. 88 |
| Fast Computation of IP and CIP | p. 96 |
| Conclusion | p. 101 |
| Adaptive Information Filtering with Error Entropy and Error Correntropy Criteria | p. 103 |
| Introduction | p. 103 |
| The Error Entropy Criterion (EEC) for Adaptation | p. 104 |
| Understanding the Error Entropy Criterion | p. 105 |
| Minimum Error Entropy Algorithm | p. 111 |
| Analysis of MEE Performance Surface | p. 113 |
| Error Entropy, Correntropy, and M-Estimation | p. 123 |
| Correntropy Induced Metric and M-Estimation | p. 128 |
| Normalized Information Potential as a Pseudometric | p. 131 |
| Adaptation of the Kernel Size in Adaptive Filtering | p. 135 |
| Conclusions | p. 139 |
| Algorithms for Entropy and Correntropy Adaptation with Applications to Linear Systems | p. 141 |
| Introduction | p. 141 |
| Recursive Information Potential for MEE (MEE-RIP) | p. 142 |
| Stochastic Information Gradient for MEE (MEE-SIG) | p. 146 |
| Self-Adjusting Stepsize for MEE (MEE-SAS) | p. 152 |
| Normalized MEE (NMEE) | p. 157 |
| Fixed-Point MEE (MEE-FP) | p. 162 |
| Fast Gauss Transform in MEE Adaptation | p. 167 |
| Incomplete Cholesky Decomposition for MEE | p. 170 |
| Linear Filter Adaptation with MSE, MEE, and MCC | p. 171 |
| Conclusion | p. 177 |
| Nonlinear Adaptive Filtering with MEE, MCC, and Applications | p. 181 |
| Introduction | p. 181 |
| Backpropagation of Information Forces in MLP Training | p. 183 |
| Advanced Search Methods for Nonlinear Systems | p. 186 |
| ITL Advanced Search Algorithms | p. 190 |
| Application: Prediction of the Mackey-Glass Chaotic Time Series | p. 198 |
| Application: Nonlinear Channel Equalization | p. 204 |
| Error Correntropy Criterion (ECC) in Regression | p. 208 |
| Adaptive Kernel Size in System Identification and Tracking | p. 213 |
| Conclusions | p. 217 |
| Classification with EEC, Divergence Measures, and Error Bounds | p. 219 |
| Introduction | p. 219 |
| Brief Review of Classification | p. 220 |
| Error Entropy Criterion in Classification | p. 222 |
| Nonparametric Classifiers | p. 228 |
| Classification with Information Divergences | p. 231 |
| ITL Algorithms for Divergence and Mutual Information | p. 233 |
| Case Study: Automatic Target Recognition (ATR) with ITL | p. 237 |
| The Role of ITL Feature Extraction in Classification | p. 246 |
| Error Bounds for Classification | p. 253 |
| Conclusions | p. 260 |
| Clustering with ITL Principles | p. 263 |
| Introduction | p. 263 |
| Information-Theoretic Clustering | p. 264 |
| Differential Clustering Using Rényi's Entropy | p. 266 |
| The Clustering Evaluation Function | p. 267 |
| A Gradient Algorithm for Clustering with D_CS | p. 273 |
| Mean Shift Algorithms and Rényi's Entropy | p. 282 |
| Graph-Theoretic Clustering with ITL | p. 286 |
| Information Cut for Clustering | p. 292 |
| Conclusion | p. 297 |
| Self-Organizing ITL Principles for Unsupervised Learning | p. 299 |
| Introduction | p. 299 |
| Entropy and Cross-Entropy Optimization | p. 301 |
| The Information Maximization Principle | p. 304 |
| Exploiting Spatial Structure for Self-Organization | p. 307 |
| Principle of Redundancy Reduction | p. 308 |
| Independent Component Analysis (ICA) | p. 310 |
| The Information Bottleneck (IB) Method | p. 313 |
| The Principle of Relevant Information (PRI) | p. 315 |
| Self-Organizing Principles with ITL Estimators | p. 329 |
| Conclusions | p. 349 |
| A Reproducing Kernel Hilbert Space Framework for ITL | p. 351 |
| Introduction | p. 351 |
| An RKHS Framework for ITL | p. 353 |
| ITL Cost Functions in the RKHS Framework | p. 358 |
| ITL Estimators in RKHS | p. 360 |
| Connection Between ITL and Kernel Methods via RKHS H_v | p. 364 |
| An ITL Perspective of MAP and SVM Classifiers | p. 368 |
| Case Study: RKHS for Computation with Spike Trains | p. 376 |
| Conclusion | p. 383 |
| Correntropy for Random Variables: Properties and Applications in Statistical Inference | p. 385 |
| Introduction | p. 385 |
| Cross-Correntropy: Definitions and Properties | p. 386 |
| Centered Cross-Correntropy and Correntropy Coefficient | p. 396 |
| Parametric Cross-Correntropy and Measures of Dependence | p. 399 |
| Application: Matched Filtering | p. 402 |
| Application: Nonlinear Coupling Tests | p. 409 |
| Application: Statistical Dependence Tests | p. 410 |
| Conclusions | p. 412 |
| Correntropy for Random Processes: Properties and Applications in Signal Processing | p. 415 |
| Introduction | p. 415 |
| Autocorrentropy Function: Definition and Properties | p. 417 |
| Cross-Correntropy Function: Definition and Properties | p. 424 |
| Optimal Linear Filters in H_v | p. 425 |
| Correntropy MACE (CMACE) Filter in H_v | p. 427 |
| Application: Autocorrentropy Function as a Similarity Measure over Lags | p. 431 |
| Application: Karhunen-Loève Transform in H_v | p. 438 |
| Application: Blind Source Separation | p. 442 |
| Application: CMACE for Automatic Target Recognition | p. 448 |
| Conclusion | p. 454 |
| Appendix A: PDF Estimation Methods and Experimental Evaluation of ITL Descriptors | p. 457 |
| Introduction | p. 457 |
| Probability Density Function Estimation | p. 457 |
| Nonparametric Entropy Estimation | p. 472 |
| Estimation of Information-Theoretic Descriptors | p. 474 |
| Convolution Smoothing | p. 494 |
| Conclusions | p. 497 |
| Bibliography | p. 499 |
| Index | p. 517 |