Abstract
One of the major concerns when implementing a supervised artificial neural network solution to a classification or prediction problem is the network’s performance on unseen data. The phenomenon of the network overfitting the training data is well understood and widely reported in the literature. Most researchers recommend either a time-consuming ‘trial and error’ approach to selecting the optimal number of weights for the network, or starting with a large network and pruning it to an optimal size. Current pruning techniques based on approximations of the Hessian matrix of the error surface are computationally intensive and prone to severe approximation errors if a suitably minimal training error has not been achieved. We propose a novel and simple design heuristic for a three-layer multi-layer perceptron (MLP) based on an eigenvalue decomposition of the covariance matrix of the middle-layer outputs. This technique identifies the neurons that contribute to redundancy in the data passing through the network; such neurons act as additional effective network parameters and have a deleterious effect on the smoothness of the classifier surface. Because the technique identifies redundancy in the network data directly, it does not depend on training having reached a minimal error value at which the Levenberg-Marquardt approximation becomes valid. We report on simulations using the double-convex benchmark which show the utility of the proposed method.
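To make the heuristic concrete, the following is a minimal sketch (not the authors’ code) of how an eigenvalue decomposition of the hidden-layer output covariance matrix can indicate an effective hidden-layer size. The tanh activation, the 0.99 variance threshold, and the synthetic activations below are illustrative assumptions rather than values taken from the paper.

```python
import numpy as np

def effective_hidden_size(hidden_outputs, variance_fraction=0.99):
    """Count the eigenvalues of the hidden-output covariance matrix needed to
    capture `variance_fraction` of the total variance; the remaining hidden
    units are taken to be redundant."""
    # hidden_outputs: (n_samples, n_hidden) matrix of middle-layer activations
    cov = np.cov(hidden_outputs, rowvar=False)       # (n_hidden, n_hidden)
    eigvals = np.linalg.eigvalsh(cov)[::-1]          # eigenvalues, descending
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    return int(np.searchsorted(cumulative, variance_fraction) + 1)

# Illustrative usage: an oversized hidden layer whose units are largely
# linear mixtures of a few underlying sources, and hence partly redundant.
rng = np.random.default_rng(0)
sources = rng.standard_normal((500, 4))              # 4 "true" sources
mixing = rng.standard_normal((4, 12))                # 12 hidden units
activations = np.tanh(sources @ mixing)
print(effective_hidden_size(activations))            # close to 4
```

The count returned by this sketch would serve as a guide for choosing (or pruning to) the hidden-layer size, without requiring training to have converged to a minimal error.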
Copyright information
© 1998 Springer-Verlag Wien
Cite this paper
Girolami, M. (1998). Principal Components Identify MLP Hidden Layer Size for Optimal Generalisation Performance. In: Artificial Neural Nets and Genetic Algorithms. Springer, Vienna. https://doi.org/10.1007/978-3-7091-6492-1_9
DOI: https://doi.org/10.1007/978-3-7091-6492-1_9
Publisher Name: Springer, Vienna
Print ISBN: 978-3-211-83087-1
Online ISBN: 978-3-7091-6492-1