Neural Networks for Conditional Probability Estimation
Forecasting Beyond Point Predictions
By: Dirk Husmeier
Paperback | 22 February 1999
At a Glance
302 Pages
24.13 x 15.88 x 1.91 cm
Paperback
$84.99
or 4 interest-free payments of $21.25
Ships in 5 to 7 business days
| Section | Page |
|---|---|
| List of Figures | p. xxi |
| Introduction | p. 1 |
| Conventional forecasting and Takens' embedding theorem | p. 1 |
| Implications of observational noise | p. 5 |
| Implications of dynamic noise | p. 9 |
| Example | p. 10 |
| Conclusion | p. 16 |
| Objective of this book | p. 16 |
| A Universal Approximator Network for Predicting Conditional Probability Densities | p. 21 |
| Introduction | p. 21 |
| A single-hidden-layer network | p. 22 |
| An additional hidden layer | p. 23 |
| Regaining the conditional probability density | p. 25 |
| Moments of the conditional probability density | p. 26 |
| Interpretation of the network parameters | p. 28 |
| Gaussian mixture model | p. 29 |
| Derivative-of-sigmoid versus Gaussian mixture model | p. 30 |
| Comparison with other approaches | p. 31 |
| Predicting local error bars | p. 31 |
| Indirect method | p. 31 |
| Complete kernel expansion: Conditional Density Estimation Network (CDEN) and Mixture Density Network (MDN) | p. 32 |
| Distorted Probability Mixture Network (DPMN) | p. 32 |
| Mixture of Experts (ME) and Hierarchical Mixture of Experts (HME) | p. 33 |
| Soft histogram | p. 33 |
| Summary | p. 34 |
| Appendix: The moment generating function for the DSM network | p. 35 |
| A Maximum Likelihood Training Scheme | p. 39 |
| The cost function | p. 39 |
| A gradient-descent training scheme | p. 43 |
| Output weights | p. 45 |
| Kernel widths | p. 47 |
| Remaining weights | p. 48 |
| Interpretation of the parameter adaptation rules | p. 49 |
| Deficiencies of gradient descent and their remedy | p. 51 |
| Summary | p. 54 |
| Appendix | p. 55 |
| Benchmark Problems | p. 57 |
| Logistic map with intrinsic noise | p. 57 |
| Stochastic combination of two stochastic dynamical systems | p. 60 |
| Brownian motion in a double-well potential | p. 63 |
| Summary | p. 67 |
| Demonstration of the Model Performance on the Benchmark Problems | p. 69 |
| Introduction | p. 69 |
| Logistic map with intrinsic noise | p. 71 |
| Method | p. 71 |
| Results | p. 73 |
| Stochastic coupling between two stochastic dynamical systems | p. 75 |
| Method | p. 75 |
| Results | p. 77 |
| Auto-pruning | p. 78 |
| Brownian motion in a double-well potential | p. 80 |
| Method | p. 80 |
| Results | p. 82 |
| Comparison with other approaches | p. 82 |
| Conclusions | p. 83 |
| Discussion | p. 84 |
| Random Vector Functional Link (RVFL) Networks | p. 87 |
| The RVFL theorem | p. 87 |
| Proof of the RVFL theorem | p. 89 |
| Comparison with the multilayer perceptron | p. 93 |
| A simple illustration | p. 95 |
| Summary | p. 96 |
| Improved Training Scheme Combining the Expectation Maximisation (EM) Algorithm with the RVFL Approach | p. 99 |
| Review of the Expectation Maximisation (EM) algorithm | p. 99 |
| Simulation: Application of the GM network trained with the EM algorithm | p. 104 |
| Method | p. 104 |
| Results | p. 105 |
| Discussion | p. 108 |
| Combining EM and RVFL | p. 109 |
| Preventing numerical instability | p. 112 |
| Regularisation | p. 117 |
| Summary | p. 118 |
| Appendix | p. 118 |
| Empirical Demonstration: Combining EM and RVFL | p. 121 |
| Method | p. 121 |
| Application of the GM-RVFL network to predicting the stochastic logistic-kappa map | p. 122 |
| Training a single model | p. 122 |
| Training an ensemble of models | p. 126 |
| Application of the GM-RVFL network to the double-well problem | p. 129 |
| Committee selection | p. 130 |
| Prediction | p. 131 |
| Comparison with other approaches | p. 132 |
| Discussion | p. 134 |
| A simple Bayesian regularisation scheme | p. 137 |
| A Bayesian approach to regularisation | p. 137 |
| A simple example: repeated coin flips | p. 139 |
| A conjugate prior | p. 140 |
| EM algorithm with regularisation | p. 142 |
| The posterior mode | p. 143 |
| Discussion | p. 145 |
| The Bayesian Evidence Scheme for Regularisation | p. 147 |
| Introduction | p. 147 |
| A simple illustration of the evidence idea | p. 150 |
| Overview of the evidence scheme | p. 152 |
| First step: Gaussian approximation to the probability in parameter space | p. 152 |
| Second step: Optimising the hyperparameters | p. 153 |
| A self-consistent iteration scheme | p. 154 |
| Implementation of the evidence scheme | p. 155 |
| First step: Gaussian approximation to the probability in parameter space | p. 156 |
| Second step: Optimising the hyperparameters | p. 157 |
| Algorithm | p. 159 |
| Discussion | p. 160 |
| Improvement over the maximum likelihood estimate | p. 160 |
| Justification of the approximations | p. 161 |
| Final remark | p. 162 |
| The Bayesian Evidence Scheme for Model Selection | p. 165 |
| The evidence for the model | p. 165 |
| An uninformative prior | p. 168 |
| Comparison with MacKay's work | p. 171 |
| Interpretation of the model evidence | p. 172 |
| Ockham factors for the weight groups | p. 173 |
| Ockham factors for the kernel widths | p. 174 |
| Ockham factor for the priors | p. 175 |
| Discussion | p. 176 |
| Demonstration of the Bayesian Evidence Scheme for Regularisation | p. 179 |
| Method and objective | p. 179 |
| Initialisation | p. 179 |
| Different training and regularisation schemes | p. 180 |
| Pruning | p. 181 |
| Large Data Set | p. 181 |
| Small Data Set | p. 183 |
| Number of well-determined parameters and pruning | p. 185 |
| Automatic self-pruning | p. 185 |
| Mathematical elucidation of the pruning scheme | p. 189 |
| Summary and Conclusion | p. 191 |
| Network Committees and Weighting Schemes | p. 193 |
| Network committees for interpolation | p. 193 |
| Network committees for modelling conditional probability densities | p. 196 |
| Weighting Schemes for Predictors | p. 198 |
| Introduction | p. 198 |
| A Bayesian approach | p. 199 |
| Numerical problems with the model evidence | p. 199 |
| A weighting scheme based on the cross-validation performance | p. 201 |
| Demonstration: Committees of Networks Trained with Different Regularisation Schemes | p. 203 |
| Method and objective | p. 203 |
| Single-model prediction | p. 204 |
| Committee prediction | p. 207 |
| Best and average single-model performance | p. 207 |
| Improvement over the average single-model performance | p. 209 |
| Improvement over the best single-model performance | p. 210 |
| Robustness of the committee performance | p. 210 |
| Dependence on the temperature | p. 211 |
| Dependence on the temperature when including biased models | p. 212 |
| Optimal temperature | p. 213 |
| Model selection and evidence | p. 213 |
| Advantage of under-regularisation and over-fitting | p. 215 |
| Conclusions | p. 215 |
| Automatic Relevance Determination (ARD) | p. 221 |
| Introduction | p. 221 |
| Two alternative ARD schemes | p. 223 |
| Mathematical implementation | p. 224 |
| Empirical demonstration | p. 227 |
| A Real-World Application: The Boston Housing Data | p. 229 |
| A real-world regression problem: The Boston house-price data | p. 230 |
| Prediction with a single model | p. 231 |
| Methodology | p. 231 |
| Results | p. 232 |
| Test of the ARD scheme | p. 234 |
| Methodology | p. 234 |
| Results | p. 234 |
| Prediction with network committees | p. 236 |
| Objective | p. 236 |
| Methodology | p. 237 |
| Weighting scheme and temperature | p. 238 |
| ARD parameters | p. 239 |
| Comparison between the two ARD schemes | p. 240 |
| Number of kernels | p. 240 |
| Bayesian regularisation | p. 241 |
| Network complexity | p. 241 |
| Cross-validation | p. 242 |
| Discussion: How overfitting can be useful | p. 242 |
| Increasing diversity | p. 244 |
| Bagging | p. 245 |
| Nonlinear Preprocessing | p. 246 |
| Comparison with Neal's results | p. 248 |
| Conclusions | p. 249 |
| Summary | p. 251 |
| Appendix: Derivation of the Hessian for the Bayesian Evidence Scheme | p. 255 |
| Introduction and notation | p. 255 |
| A decomposition of the Hessian using EM | p. 256 |
| Explicit calculation of the Hessian | p. 258 |
| Discussion | p. 265 |
| References | p. 267 |
| Index | p. 273 |
Table of Contents provided by Syndetics. All Rights Reserved.
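For readers scanning the contents above, the book's central object is a network whose outputs parameterise the full conditional density p(y|x) rather than a point forecast, in the family of the Conditional Density Estimation Network (CDEN), Mixture Density Network (MDN), and Gaussian mixture (GM) models listed in Chapter 2, trained by maximising the likelihood (Chapter 3) or by EM-based schemes (Chapter 7). The following is a minimal, independent Python sketch of that general idea only; it is not the author's DSM or GM-RVFL implementation, and the architecture, parameter shapes, and toy logistic-map data are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the book's implementation) of a
# Gaussian-mixture conditional density network: a small feed-forward net maps x
# to the mixture weights, centres and widths of p(y|x); the maximum-likelihood
# cost is the negative log-likelihood of the observed targets.
import numpy as np

rng = np.random.default_rng(0)

def softmax(a):
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def mdn_forward(x, params, n_kernels):
    """Map inputs x of shape (N, 1) to mixture parameters of p(y|x)."""
    W1, b1, W2, b2 = params
    h = np.tanh(x @ W1 + b1)                 # hidden layer
    out = h @ W2 + b2                        # (N, 3 * n_kernels)
    pi = softmax(out[:, :n_kernels])         # mixture weights, sum to 1
    mu = out[:, n_kernels:2 * n_kernels]     # kernel centres
    sigma = np.exp(out[:, 2 * n_kernels:])   # kernel widths, kept positive
    return pi, mu, sigma

def neg_log_likelihood(y, pi, mu, sigma):
    """Cost: -mean_n log sum_k pi_k N(y_n | mu_k, sigma_k)."""
    y = y.reshape(-1, 1)
    norm = np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    return -np.log((pi * norm).sum(axis=1) + 1e-12).mean()

# Toy usage: noisy logistic-map-style data, random (untrained) parameters.
n_hidden, n_kernels = 10, 3
params = (rng.normal(0, 1, (1, n_hidden)), np.zeros(n_hidden),
          rng.normal(0, 0.1, (n_hidden, 3 * n_kernels)), np.zeros(3 * n_kernels))
x = rng.uniform(0, 1, (200, 1))
y = 4.0 * x[:, 0] * (1 - x[:, 0]) + rng.normal(0, 0.05, 200)
pi, mu, sigma = mdn_forward(x, params, n_kernels)
print("NLL of untrained network:", neg_log_likelihood(y, pi, mu, sigma))
```

In the book this cost is minimised by gradient descent (Chapter 3) or, more robustly, by the EM/RVFL scheme and Bayesian regularisation developed in later chapters; the sketch above only fixes the forward model and the cost.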
ISBN: 9781852330958
ISBN-10: 1852330953
Series: Perspectives in Neural Computing
Published: 22nd February 1999
Format: Paperback
Language: English
Number of Pages: 302
Audience: General Adult
Publisher: Springer Nature B.V.
Country of Publication: GB
Dimensions (cm): 24.13 x 15.88 x 1.91
Weight (kg): 0.47
Shipping
| Postcode type | Standard Shipping | Express Shipping |
|---|---|---|
| Metro postcodes: | $9.99 | $14.95 |
| Regional postcodes: | $9.99 | $14.95 |
| Rural postcodes: | $9.99 | $14.95 |
Orders over $79.00 qualify for free shipping.