Abstract
We propose a new algorithm for computing the maximum likelihood estimate of a nonparametric survival function for interval-censored data, by extending the recently proposed constrained Newton method in a hierarchical fashion. The new algorithm exploits the fact that a mixture distribution can be written recursively as a mixture of mixtures, and takes a divide-and-conquer approach that breaks a large-scale constrained optimization problem down into many small-scale ones, each of which can be solved rapidly. During the course of optimization, the new algorithm, which we call the hierarchical constrained Newton method, can efficiently reallocate probability mass, both locally and globally, among potential support intervals. Its convergence is established theoretically by an equilibrium analysis. Numerical studies suggest that the new algorithm is the best choice for data sets of any size and for solutions with any number of support intervals.


References
Bogaerts, K., Lesaffre, E.: A new, fast algorithm to find the regions of possible support for bivariate interval-censored data. J. Comput. Graph. Stat. 13, 330–340 (2004)
Böhning, D.: A vertex-exchange-method in D-optimal design theory. Metrika 33, 337–347 (1986)
Böhning, D., Schlattmann, P., Dietz, E.: Interval censored data: A note on the nonparametric maximum likelihood estimator of the distribution function. Biometrika 83, 462–466 (1996)
Chen, L., Jha, P., Stirling, B., Sgaier, S.K., Daid, T., Kaul, R., Nagelkerke, N.: Sexual risk factors for HIV infection in early and advanced HIV epidemics in Sub-Saharan Africa: systematic overview of 68 epidemiological studies. PLoS ONE 2, e1001 (2007)
Dax, A.: The smallest point of a polytope. J. Optim. Theory Appl. 64, 429–432 (1990)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B 39, 1–22 (1977)
Dümbgen, L., Freitag-Wolf, S., Jongbloed, G.: Estimating a unimodal distribution from interval-censored data. J. Am. Stat. Assoc. 101, 1094–1106 (2006)
Gentleman, R., Vandal, A.C.: Computational algorithms for censored-data problems using intersection graphs. J. Comput. Graph. Stat. 10, 403–421 (2001)
Gentleman, R., Vandal, A.C.: Icens: NPMLE for censored and truncated data. R package version 1.18.0 (2009)
Groeneboom, P.: Nonparametric maximum likelihood estimators for interval censoring and deconvolution. Technical report 378, Department of Statistics, Stanford University (1991)
Groeneboom, P., Jongbloed, G., Wellner, J.A.: The support reduction algorithm for computing nonparametric function estimates in mixture models. Scand. J. Stat. 35, 385–399 (2008)
Groeneboom, P., Wellner, J.A.: Information Bounds and Nonparametric Maximum Likelihood Estimation. Birkhäuser, Basel (1992)
Jongbloed, G.: The iterative convex minorant algorithm for nonparametric estimation. J. Comput. Graph. Stat. 7, 301–321 (1998)
Kumwenda, N.I., Hoover, D.R., Mofenson, L.M., Thigpen, M.C., Kafulafula, G., Li, Q., Mipando, L., Nkanaunena, K., Mebrahtu, T., Bulterys, M., Fowler, M.G., Taha, T.E.: Extended antiretroviral prophylaxis to reduce breast-milk HIV-1 transmission. N. Engl. J. Med. 359, 119–129 (2008)
Lawson, C.L., Hanson, R.J.: Solving Least Squares Problems. Prentice-Hall, New York (1974)
Lesperance, M.L., Kalbfleisch, J.D.: An algorithm for computing the nonparametric MLE of a mixing distribution. J. Am. Stat. Assoc. 87, 120–126 (1992)
Lindsay, B.G.: Mixture Models: Theory, Geometry and Applications. NSF-CBMS Regional Conference Series in Probability and Statistics, vol. 5. Institute of Mathematical Statistics, Hayward (1995)
Luenberger, D.G., Ye, Y.: Linear and Nonlinear Programming, 3rd edn. Springer, New York (2008)
Maathuis, M.H.: Reduction algorithm for the NPMLE for the distribution of bivariate interval-censored data. J. Comput. Graph. Stat. 14, 352–362 (2005)
Maathuis, M.H.: MLEcens: Computation of the MLE for bivariate (interval) censored data. R package version 0.1-2 (2007)
Peto, R.: Experimental survival curves for interval-censored data. Appl. Stat. 22, 86–91 (1973)
Pilla, R.S., Lindsay, B.G.: Alternative EM methods for nonparametric finite mixture models. Biometrika 88, 535–550 (2001)
Siegfried, N., Clarke, M., Volmink, J.: Randomised controlled trials in Africa of HIV and AIDS: descriptive study and spatial distribution. BMJ 331, 742 (2005)
Sun, J.: The Statistical Analysis of Interval-censored Failure Time Data. Springer, Berlin (2006)
Turnbull, B.W.: Nonparametric estimation of a survivorship function with doubly censored data. J. Am. Stat. Assoc. 69, 169–173 (1974)
Turnbull, B.W.: The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Stat. Soc. B 38, 290–295 (1976)
Wang, Y.: On fast computation of the non-parametric maximum likelihood estimate of a mixing distribution. J. R. Stat. Soc. B 69, 185–198 (2007)
Wang, Y.: Dimension-reduced nonparametric maximum likelihood computation for interval-censored data. Comput. Stat. Data Anal. 52, 2388–2402 (2008)
Wellner, J.A., Zhan, Y.: A hybrid algorithm for computation of the nonparametric maximum likelihood estimator from censored data. J. Am. Stat. Assoc. 92, 945–959 (1997)
Wong, G.Y., Yu, Q.: Generalized MLE of a joint distribution function with multivariate interval-censored data. J. Multivar. Anal. 69, 155–166 (1999)
Wu, C.F.: Some algorithmic aspects of the theory of optimal designs. Ann. Stat. 6, 1286–1301 (1978)
Acknowledgements
The authors thank the Editor, the Associate Editor and two reviewers for many constructive comments, and are grateful to Bruce Lindsay for helpful suggestions. This research was supported by a Marsden grant of the Royal Society of New Zealand (9145/3608546).
Appendix: Linear regression over a simplex
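Consider the constrained least squares problem:

\[
\text{minimize}\quad \Vert \mathbf{A}\mathbf{x}-\mathbf{b}\Vert^{2} \quad \text{subject to}\quad \mathbf{x}^{\top}\mathbf{1}=\delta,\ \mathbf{x}\ge \mathbf{0},
\]

for δ>0. It can be solved by the NNLS algorithm of Lawson and Hanson (1974), after a transformation suggested by Dax (1990). Letting y=x/δ and c=b/δ, it is apparent that the problem is equivalent to

\[
\text{minimize}\quad \Vert \mathbf{A}\mathbf{y}-\mathbf{c}\Vert^{2} \quad \text{subject to}\quad \mathbf{y}^{\top}\mathbf{1}=1,\ \mathbf{y}\ge \mathbf{0},
\]

which is further equivalent to

\[
\text{minimize}\quad \Vert \mathbf{P}\mathbf{y}\Vert^{2} \quad \text{subject to}\quad \mathbf{y}^{\top}\mathbf{1}=1,\ \mathbf{y}\ge \mathbf{0}, \tag{22}
\]

where P=A−(c,…,c); the equivalence holds because \(\mathbf{y}^{\top}\mathbf{1}=1\) implies \(\mathbf{A}\mathbf{y}-\mathbf{c}=\mathbf{P}\mathbf{y}\). The solution to problem (22) can be found by solving the following least squares problem with only non-negativity constraints:

\[
\text{minimize}\quad \Vert \mathbf{P}\mathbf{y}\Vert^{2}+\vert \mathbf{y}^{\top}\mathbf{1}-1\vert^{2} \quad \text{subject to}\quad \mathbf{y}\ge \mathbf{0}. \tag{23}
\]

By relating the Karush-Kuhn-Tucker conditions for both problems, Dax established that if \({\tilde {\mathbf {y}}}\) solves problem (23), then \({\tilde {\mathbf {y}}}/{\tilde {\mathbf {y}}}^{{\top }}\mathbf {1}\) solves problem (22).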
Problem (23) can be solved by the NNLS algorithm of Lawson and Hanson (1974).
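The transformation can be illustrated with a short Python sketch. It is not the authors' implementation: it assumes that problem (23) minimizes ‖Py‖² + (yᵀ1 − 1)² over y ≥ 0, which is our reading of Dax's transformation, and it uses `scipy.optimize.nnls` as the Lawson-Hanson NNLS solver. The function name `simplex_least_squares` is hypothetical.

```python
import numpy as np
from scipy.optimize import nnls


def simplex_least_squares(A, b, delta=1.0):
    """Minimise ||A x - b||^2 subject to x >= 0 and sum(x) = delta.

    Hypothetical helper: assumes the non-negativity-only problem is
    min ||P y||^2 + (1^T y - 1)^2 over y >= 0, with P = A - (c, ..., c).
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    if delta <= 0:
        raise ValueError("delta must be positive")

    c = b / delta                      # rescaled response, c = b / delta
    P = A - c[:, None]                 # subtract c from every column of A

    # Stack the penalty row so that
    # ||M y - rhs||^2 = ||P y||^2 + (1^T y - 1)^2.
    M = np.vstack([P, np.ones((1, A.shape[1]))])
    rhs = np.concatenate([np.zeros(A.shape[0]), [1.0]])

    y_tilde, _ = nnls(M, rhs)          # non-negative least squares
    total = y_tilde.sum()
    if total <= 0:
        raise RuntimeError("NNLS returned a zero vector")

    # Normalise onto the unit simplex, then undo the scaling y = x / delta.
    return delta * y_tilde / total


# Usage: b already lies on the simplex of mass 2, so it is returned unchanged.
x_hat = simplex_least_squares(np.eye(2), [0.5, 1.5], delta=2.0)
```

Stacking the row of ones under P turns the penalty term into one extra least squares residual, so a standard NNLS routine can be used without modification.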
Cite this article
Wang, Y., Taylor, S.M. Efficient computation of nonparametric survival functions via a hierarchical mixture formulation. Stat Comput 23, 713–725 (2013). https://doi.org/10.1007/s11222-012-9341-9
DOI: https://doi.org/10.1007/s11222-012-9341-9