Abstract
In this paper we explore prediction intervals and how they can be used for model evaluation and discrimination in supervised regression on medium-sized datasets. We review three methods for constructing prediction intervals and the statistics used to evaluate them. Using simple datasets, we illustrate what the prediction intervals look like, how the different methods behave, and how the intervals can be used for the graphical evaluation of models. We then propose a combined method for constructing prediction intervals and explore its performance with two voting schemes for combining the predictions of a diverse ensemble of models. All methods are tested on a large set of datasets, on which we evaluate the individual methods and their aggregated variants for their ability to select the best predictions. The analysis of correlations between the root mean squared error and our evaluation statistic shows that both the stability and the reliability of the results increase as the techniques become more elaborate. We confirm that the methodology is suitable for the graphical comparison of individual models and is a viable way of discriminating among model candidates.
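To make the notion of a prediction interval and its evaluation concrete, the following is a minimal sketch and not the method of this paper. It assumes a generic residual-quantile construction (a fixed-width interval taken from the empirical quantiles of held-out residuals) and two commonly used statistics, empirical coverage and interval width. The function names `residual_interval` and `coverage_and_width`, the toy linear model, and the 90% nominal coverage are all illustrative assumptions.

```python
import random


def residual_interval(residuals, coverage=0.9):
    """Return (lower, upper) offsets taken from the empirical quantiles
    of held-out residuals (y_true - y_pred). Illustrative helper only."""
    ordered = sorted(residuals)
    n = len(ordered)
    alpha = 1.0 - coverage
    # Nearest-rank positions of the alpha/2 and 1 - alpha/2 quantiles.
    lo_idx = round((alpha / 2) * (n - 1))
    hi_idx = round((1 - alpha / 2) * (n - 1))
    return ordered[lo_idx], ordered[hi_idx]


def coverage_and_width(y_true, y_pred, lo_off, hi_off):
    """Return the fraction of targets inside [pred + lo_off, pred + hi_off]
    and the (constant) width of that interval."""
    covered = sum(
        1 for y, p in zip(y_true, y_pred) if p + lo_off <= y <= p + hi_off
    )
    return covered / len(y_true), hi_off - lo_off


def toy_model(x):
    """Hypothetical fixed regression model used only for the demo."""
    return 2.0 * x


def make_data(n, rng):
    """Synthetic data: y = 2x + Gaussian noise (assumed for illustration)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)
x_cal, y_cal = make_data(200, rng)    # held-out set used to size the interval
x_test, y_test = make_data(200, rng)  # separate set used to evaluate it

cal_residuals = [y - toy_model(x) for x, y in zip(x_cal, y_cal)]
lo_off, hi_off = residual_interval(cal_residuals, coverage=0.9)

test_pred = [toy_model(x) for x in x_test]
picp, width = coverage_and_width(y_test, test_pred, lo_off, hi_off)
print(f"empirical coverage: {picp:.3f}, interval width: {width:.3f}")
```

An interval is usually judged on both quantities together: coverage close to the nominal level indicates validity, while a narrower width at the same coverage indicates a more informative interval.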










Pevec, D., Kononenko, I. Prediction intervals in supervised learning for model evaluation and discrimination. Appl Intell 42, 790–804 (2015). https://doi.org/10.1007/s10489-014-0632-z
DOI: https://doi.org/10.1007/s10489-014-0632-z