Abstract
By extending conventional factor analysis from uncorrelated Gaussian factors to mutually independent factors beyond Gaussian, one obtains what has recently been called independent factor analysis. It is typically called binary factor analysis (BFA) when the factors are binary, and non-Gaussian factor analysis (NFA) when the factors follow real-valued non-Gaussian distributions. A crucial issue in both BFA and NFA is determining the number of factors. The statistics literature offers a number of model selection criteria for this purpose, and Bayesian Ying-Yang (BYY) harmony learning provides a new principle for it as well. This paper further investigates BYY harmony learning in comparison with typical existing criteria, including Akaike's information criterion (AIC), the consistent Akaike information criterion (CAIC), the Bayesian inference criterion (BIC), and the cross-validation (CV) criterion, for selecting the number of factors. The comparative study is carried out via experiments on data sets with different sample sizes, data space dimensions, noise variances, and numbers of hidden factors. The experiments show that, for both BFA and NFA, BIC outperforms AIC, CAIC, and CV in most cases, while the BYY criterion is comparable with or better than BIC. Moreover, selection by these criteria must be implemented at a second stage on a set of candidate models obtained at a first stage of parameter learning, whereas BYY harmony learning provides not only a new class of criteria implemented in the same two-stage way but also a new family of algorithms that perform parameter learning at the first stage with automated model selection. BYY harmony learning is therefore preferable, since it can save computing costs significantly.
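As a minimal illustration of the two-stage procedure described above, the following Python sketch (not from the paper) scores each candidate model with the standard AIC, CAIC, and BIC formulas and picks the number of factors minimizing the chosen criterion. The function `fit_factor_model` is a hypothetical placeholder for the first-stage parameter learning (e.g., an EM algorithm for BFA or NFA) that returns the maximized log-likelihood and the number of free parameters.

```python
import numpy as np

def selection_scores(log_likelihood, n_params, n_samples):
    """Standard information criteria for one fitted candidate model.

    AIC  = -2 ln L + 2k
    BIC  = -2 ln L + k ln n
    CAIC = -2 ln L + k (ln n + 1)
    """
    aic = -2.0 * log_likelihood + 2.0 * n_params
    bic = -2.0 * log_likelihood + n_params * np.log(n_samples)
    caic = -2.0 * log_likelihood + n_params * (np.log(n_samples) + 1.0)
    return {"AIC": aic, "BIC": bic, "CAIC": caic}

def select_n_factors(data, k_range, fit_factor_model, criterion="BIC"):
    """Two-stage selection: fit one candidate per k, then compare scores.

    `fit_factor_model(data, k)` is a hypothetical first-stage learner
    returning (log_likelihood, n_params) for a k-factor model.
    """
    n_samples = data.shape[0]
    best_k, best_score = None, np.inf
    for k in k_range:
        log_lik, n_params = fit_factor_model(data, k)   # first stage
        score = selection_scores(log_lik, n_params, n_samples)[criterion]
        if score < best_score:                          # second stage
            best_k, best_score = k, score
    return best_k
```

Note that every candidate k requires a full parameter-learning run before any score can be computed, which is exactly the computing cost that BYY harmony learning with automated model selection avoids.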
Cite this article
An, Y., Hu, X. & Xu, L. A Comparative Investigation on Model Selection in Independent Factor Analysis. J Math Model Algor 5, 447–473 (2006). https://doi.org/10.1007/s10852-005-9021-2
Key words
- BYY harmony learning
- hidden factor
- binary factor analysis
- non-Gaussian factor analysis
- model selection
- automatic model selection