Abstract
A plug-in estimator is proposed for a local measure of variance explained by regression, termed the correlation curve in Doksum et al. (J Am Stat Assoc 89:571–582, 1994), consisting of a two-step spline–kernel estimator of the conditional variance function and a local quadratic estimator of the first derivative of the mean function. The estimator is oracally efficient in the sense that it is as efficient as an infeasible correlation estimator with the variance function known. As a consequence of the oracle efficiency, a smooth simultaneous confidence band (SCB) is constructed around the proposed correlation curve estimator and shown to be asymptotically correct. Simulated examples illustrate the versatility of the proposed oracle SCB, confirming the asymptotic theory. Application to data from the 1995 British Family Expenditure Survey has found marginally significant evidence for a local version of Engel’s law, i.e., that food budget share and household real income are inversely related (Hamilton in Am Econ Rev 91:619–630, 2001).
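The plug-in idea can be sketched numerically. The snippet below is a minimal illustration, not the paper's estimator: it assumes the Doksum et al. (1994) definition \(\rho (x)=\sigma _{1}\mu ^{\prime }(x)/\{\sigma _{1}^{2}\mu ^{\prime }(x)^{2}+\sigma ^{2}(x)\}^{1/2}\) with \(\sigma _{1}^{2}=\mathrm{Var}(X)\), replaces the two-step spline–kernel variance estimator with a simple Nadaraya–Watson smoother of squared residuals, and all function names, kernels, and bandwidths are our own choices.

```python
import numpy as np

def quartic(u):
    """Quartic (biweight) kernel, supported on [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, (15.0 / 16.0) * (1.0 - u ** 2) ** 2, 0.0)

def local_quadratic(x0, X, Y, h):
    """Weighted LS fit of a quadratic in (X - x0); returns (mu(x0), mu'(x0))."""
    w = quartic((X - x0) / h)
    Z = np.column_stack([np.ones_like(X), X - x0, (X - x0) ** 2])
    beta = np.linalg.solve(Z.T @ (Z * w[:, None]), Z.T @ (w * Y))
    return beta[0], beta[1]

def correlation_curve(xgrid, X, Y, h1, h2):
    """Plug-in correlation curve rho(x) = s1*mu'(x)/sqrt(s1^2*mu'(x)^2 + sig2(x))."""
    s1sq = np.var(X)
    # residuals from local quadratic mean fits at the design points
    fits = np.array([local_quadratic(xi, X, Y, h1)[0] for xi in X])
    resid2 = (Y - fits) ** 2
    rho = []
    for x0 in xgrid:
        _, slope = local_quadratic(x0, X, Y, h1)
        w = quartic((X - x0) / h2)
        sig2 = np.sum(w * resid2) / np.sum(w)  # Nadaraya-Watson variance smoother
        rho.append(np.sqrt(s1sq) * slope / np.sqrt(s1sq * slope ** 2 + sig2))
    return np.array(rho)

# toy model: mu(x) = x^2, sigma(x) = 0.5, X ~ U(0,1); then rho(0.5) = 0.5 exactly
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 3000)
Y = X ** 2 + 0.5 * rng.standard_normal(3000)
rho_hat = correlation_curve(np.array([0.25, 0.5, 0.75]), X, Y, h1=0.5, h2=0.2)
```

Because the toy mean function is an exact quadratic, the local quadratic slope estimate is unbiased for \(\mu ^{\prime }(x)\), making the plug-in estimate easy to sanity-check.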


References
Aerts M, Claeskens G (1997) Local polynomial estimators in multiparameter likelihood models. J Am Stat Assoc 92:1536–1545
Bickel PJ, Rosenblatt M (1973) On some global measures of the deviations of density function estimates. Ann Stat 1:1071–1095
Blundell R, Chen X, Kristensen D (2007) Semi-nonparametric IV estimation of shape invariant Engel curves. Econometrica 75:1613–1670
Cao G, Wang L, Li Y, Yang L (2016) Oracle-efficient confidence envelopes for covariance functions in dense functional data. Stat Sin 26:359–383
Cao G, Yang L, Todem D (2012) Simultaneous inference for the mean function based on dense functional data. J Nonparametr Stat 24:359–377
Cai L, Yang L (2015) A smooth confidence band for variance function. TEST 24:632–655
Carroll RJ, Ruppert D (1988) Transformations and weighting in regression. Chapman and Hall, London
Claeskens G, Van Keilegom I (2003) Bootstrap confidence bands for regression curves and their derivatives. Ann Stat 31:1852–1884
Cleveland WS (1979) Robust locally weighted regression and smoothing scatterplots. J Am Stat Assoc 74:829–836
de Boor C (2001) A practical guide to splines. Springer, New York
Doksum K, Blyth S, Bradlow E, Meng X, Zhao H (1994) Correlation curves as local measures of variance explained by regression. J Am Stat Assoc 89:571–582
Eubank RL, Speckman PL (1993) Confidence bands in nonparametric regression. J Am Stat Assoc 88:1287–1301
Fan J (1993) Local linear regression smoothers and their minimax efficiency. Ann Stat 21:196–216
Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman and Hall, London
Fan J, Yao Q (1998) Efficient estimation of conditional variance functions in stochastic regression. Biometrika 85:645–660
Gasser T, Müller HG, Mammitzsch V (1984) Estimating regression functions and their derivatives with kernel methods. Scand J Stat 11:171–185
Gu L, Wang L, Härdle W, Yang L (2014) A simultaneous confidence corridor for varying coefficient regression with sparse functional data. TEST 23:806–843
Gu L, Yang L (2015) Oracally efficient estimation for single-index link function with simultaneous band. Electron J Stat 9:1540–1561
Hall P (1991) Edgeworth expansions for nonparametric density estimators, with applications. Statistics 22:215–232
Hall P (1992) Effect of bias estimation on coverage accuracy of bootstrap confidence intervals for a probability density. Ann Stat 20:675–694
Hall P, Titterington MD (1988) On confidence bands in nonparametric density estimation and regression. J Multivar Anal 27:228–254
Hamilton BW (2001) Using Engel’s law to estimate CPI bias. Am Econ Rev 91:619–630
Härdle W (1989) Asymptotic maximal deviation of M-smoothers. J Multivar Anal 29:163–179
Leadbetter MR, Lindgren G, Rootzén H (1983) Extremes and related properties of random sequences and processes. Springer, New York
Ma S, Yang L, Carroll R (2012) A simultaneous confidence band for sparse longitudinal regression. Stat Sin 22:95–122
R Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Ruppert D, Wand MP (1994) Multivariate locally weighted least squares regression. Ann Stat 22:1346–1370
Silverman BW (1986) Density estimation for statistics and data analysis. Chapman and Hall, London
Song Q, Liu R, Shao Q, Yang L (2014) A simultaneous confidence band for dense longitudinal regression. Commun Stat Theory Methods 43:5195–5210
Song Q, Yang L (2009) Spline confidence bands for variance functions. J Nonparametr Stat 21:589–609
Stapleton J (2009) Linear statistical models, 2nd edn. Wiley, Hoboken
Stone CJ (1977) Consistent nonparametric regression. Ann Stat 5:595–645
Stone CJ (1980) Optimal rates of convergence for nonparametric estimators. Ann Stat 8:1348–1360
Tusnády G (1977) A remark on the approximation of the sample DF in the multidimensional case. Period Math Hung 8:53–55
Wang J, Cheng F, Yang L (2013) Smooth simultaneous confidence bands for cumulative distribution functions. J Nonparametr Stat 25:395–407
Wang J, Liu R, Cheng F, Yang L (2014) Oracally efficient estimation of autoregressive error distribution with simultaneous confidence band. Ann Stat 42:654–668
Wang J, Wang S, Yang L (2016) Simultaneous confidence bands for the distribution function of a finite population and its superpopulation. TEST 25:692–709
Wang J, Yang L (2009) Polynomial spline confidence bands for regression curves. Stat Sin 19:325–342
Wu W, Zhao Z (2007) Inference of trends in time series. J R Stat Soc Ser B 69:391–410
Xia Y (1998) Bias-corrected confidence bands in nonparametric regression. J R Stat Soc Ser B 60:797–811
Zhao Z, Wu W (2008) Confidence bands in nonparametric time series regression. Ann Stat 36:1854–1878
Zheng S, Liu R, Yang L, Härdle W (2016) Statistical inference for generalized additive models: simultaneous confidence corridors and variable selection. TEST 25:607–626
Zheng S, Yang L, Härdle W (2014) A smooth simultaneous confidence corridor for the mean of sparse functional data. J Am Stat Assoc 109:661–673
Acknowledgements
This work has been supported in part by Jiangsu Key-Discipline Program ZY107992, National Natural Science Foundation of China award 11371272, and Research Fund for the Doctoral Program of Higher Education of China award 20133201110002. The authors thank two Reviewers, Editor-in-Chief Ana Militino, Prof. Qin Shao, and participants at the First PKU-Tsinghua Colloquium On Statistics for helpful comments.
Appendix
Throughout this section, for any function \(g\left( u\right) \), define \( \left\| g\right\| _{\infty }=\sup \nolimits _{u\in \mathcal {I} _{n}}\left| g\left( u\right) \right| \). For any vector \(\xi \), denote by \(\left\| \xi \right\| \) its Euclidean norm and by \(\left\| \xi \right\| _{\infty }\) the largest absolute value of its elements. We use C to denote any positive constant in the generic sense. A random sequence \(\left\{ X_{n}\right\} \) that is bounded in probability is denoted \(X_{n}=\mathcal {O}_{p}\left( 1\right) \), while \(X_{n}=o_{p}\left( 1\right) \) denotes convergence to 0 in probability. A sequence of random functions that is \(o_{p}\) or \( \mathcal {O}_{p}\) uniformly over \(x\in \mathcal {I}_{n}\) is denoted \(u_{p}\) or \(U_{p}\), respectively.
Next, we state the strong approximation theorem of Tusnády (1977), which will be used later in the proofs of Lemmas 4 and 5.
Let \(U_{1},\ldots ,U_{n}\) be i.i.d. random variables on the 2-dimensional unit square with \(\mathbb {P}\left( U_{i}<\mathbf {t}\right) =\lambda \left( \mathbf {t} \right) ,\mathbf {0}\le \mathbf {t}\le \mathbf {1}\), where \(\mathbf {t}=\left( t_{1},t_{2}\right) \) and \(\mathbf {1}=\left( 1,1\right) \) are 2-dimensional vectors and \(\lambda \left( \mathbf {t}\right) =t_{1}t_{2}\). The empirical distribution function is \(F_{n}^{u}\left( \mathbf {t}\right) =n^{-1}\sum _{i=1}^{n}I_{\left\{ U_{i}<\mathbf {t}\right\} }\) for \(\mathbf {0}\le \mathbf {t}\le \mathbf {1}\).
Lemma 1
The 2-dimensional Brownian bridge \(B\left( \mathbf {t}\right) \) is defined by \(B\left( \mathbf {t}\right) =W\left( \mathbf {t}\right) -\lambda \left( \mathbf {t}\right) W\left( \mathbf {1} \right) \) for \(\mathbf {0\le t\le 1}\), where \(W\left( \mathbf {t}\right) \) is a 2-dimensional Wiener process. Then there is a version \(B_{n}\left( \mathbf {t}\right) \) of \(B\left( \mathbf {t}\right) \) such that
holds for all x, where \(C, K, \lambda \) are positive constants.
The Rosenblatt transformation for bivariate continuous \(\left( X,\varepsilon \right) \) is
then \(\left( X^{*},\varepsilon ^{*}\right) \) has uniform distribution on \(\left[ a,b\right] ^{2}\); therefore,
with \(F_{n}\left( x,\varepsilon \right) \) denoting the empirical distribution of \(\left( X,\varepsilon \right) \). Lemma 1 implies that there exists a version \(B_{n}\) of 2 -dimensional Brownian bridge such that
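For intuition, the transformation can be checked numerically in a toy model where X and \(\varepsilon \) are independent, so that the conditional CDF factorizes and \(M(x,\varepsilon )=\left( F_{X}(x),F_{\varepsilon }(\varepsilon )\right) \); the particular distributions and function names below are illustrative choices of ours.

```python
import numpy as np

# Toy check of the Rosenblatt transformation: with X independent of eps,
# M(x, e) = (F_X(x), F_eps(e)) maps the pair to independent Uniform(0, 1)'s.
rng = np.random.default_rng(1)
n = 100_000
X = rng.uniform(2.0, 5.0, n)      # covariate on [a, b] = [2, 5]
eps = rng.exponential(1.0, n)     # regression error

U1 = (X - 2.0) / 3.0              # F_X for Uniform(2, 5)
U2 = 1.0 - np.exp(-eps)           # F_eps for Exponential(1)

def ks_uniform(U):
    """Kolmogorov-Smirnov-type distance of a sample from Uniform(0, 1)."""
    Us = np.sort(U)
    k = np.arange(1, len(Us) + 1)
    return np.max(np.abs(Us - k / len(Us)))
```

Both transformed coordinates are then close to uniform, and nearly uncorrelated, at the \(n^{-1/2}\) scale guaranteed by the DKW inequality.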
Lemma 2
Under Assumptions (A2) and (A5), there exists \(\alpha _{1}>0\) such that the sequence \(D_{n}=n^{\alpha _{1}}\) satisfies
For such a sequence \(\left\{ D_{n}\right\} \),
Lemma 3
Under Assumptions (A1)–(A5), as \(n\rightarrow \infty \),
Proof
From the definition of \(\tilde{\rho }_{\mathrm {LQ}}(x)\) in (15), a Taylor series expansion, and \(\hat{\beta }\left( x\right) -\beta \left( x\right) =U_{p}\left( n^{-1/2}h_{1}^{-3/2}\log ^{1/2}n+h_{1}^{2}\right) \), one has
Write \(\mathbf {Y}\) as the sum of a signal vector \(\varvec{\mu }=\left\{ \mu \left( X_{1}\right) ,\ldots ,\mu \left( X_{n}\right) \right\} ^{\scriptstyle {T}}\) and a noise vector \(\mathbf {E}=\left\{ \sigma \left( X_{1}\right) \varepsilon _{1},\ldots ,\sigma \left( X_{n}\right) \varepsilon _{n}\right\} ^{\scriptstyle {T}}\),
The local quadratic estimator \(\hat{\beta }\left( x\right) \) has a noise and bias error decomposition
in which the bias term I(x) and noise term II(x) are
where \(e_{k}\), \(k=0,1,2\), are as defined in (8), \(\mathbf {X}\) in (9), \(\mathbf {W}\) in (10), and \(\varvec{\mu }\) and \(\mathbf {E}\) in (34). Standard arguments from kernel smoothing theory yield that
in which \(\mu _{3}\left( K^{*}\right) =\int v^{3}K^{*}\left( v\right) \mathrm{d}v\). Likewise,
Putting together (33), (37) and (38) completes the proof of the lemma.
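The constant \(\mu _{3}\left( K^{*}\right) \) can be checked numerically via the equivalent-kernel representation of local polynomial estimators (Fan and Gijbels 1996): for the local quadratic first-derivative estimator, \(K^{*}(u)=e_{1}^{T}S^{-1}\left( 1,u,u^{2}\right) ^{T}K(u)\) with \(S_{jk}=\int u^{j+k}K(u)\mathrm{d}u\). The quartic base kernel below is an illustrative choice of ours, not necessarily the paper's.

```python
import numpy as np

# Equivalent kernel K* of the local quadratic first-derivative estimator
# (Fan and Gijbels 1996); the quartic base kernel K is an assumed choice.
u = np.linspace(-1.0, 1.0, 20001)
du = u[1] - u[0]
K = (15.0 / 16.0) * (1.0 - u ** 2) ** 2

integ = lambda f: f.sum() * du  # Riemann sum; K vanishes smoothly at +/- 1

S = np.array([[integ(u ** (j + k) * K) for k in range(3)] for j in range(3)])
row = np.linalg.solve(S, np.array([0.0, 1.0, 0.0]))  # e_1^T S^{-1}, S symmetric
Kstar = (row[0] + row[1] * u + row[2] * u ** 2) * K

# Moment conditions: int u^q K*(u) du = 1{q = 1} for q = 0, 1, 2, while
# mu_3(K*) = int u^3 K*(u) du is the constant driving the bias term.
moments = [integ(u ** q * Kstar) for q in range(4)]
```

For a symmetric base kernel, \(K^{*}(u)=uK(u)/\mu _{2}(K)\), so \(\mu _{3}\left( K^{*}\right) =\mu _{4}(K)/\mu _{2}(K)\); for the quartic kernel this equals \(1/3\).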
Now from Lemma 3, one can rewrite (32) as
in which the process
Next, define four stochastic processes that either approximate one another in probability uniformly over \(\mathcal {I}_{n}\) or have exactly the same distribution over \(\mathcal {I}_{n}\). More precisely, with \(D_{n}\) defined in Lemma 2 and \(B_{n}\) in Lemma 1, let
where
and satisfies that
Lemma 4
Under Assumptions (A2)–(A5), as \(n\rightarrow \infty ,\)
where, for \(x\in \mathcal {I}_{n},\)
Proof
Using the notation of Lemma 1, the processes Y(x) defined in (40) and \(Y^{D}(x)\) can be written as
The tail part \(Y\left( x\right) -Y^{D}\left( x\right) \) is bounded uniformly over \(\mathcal {I}_{n}\) by
By (31) in Lemma 2 and the Borel–Cantelli lemma, the first term in Eq. (47) is \(\mathcal {O}_{a.s.}\left( n^{-a}\right) \) for any \(a>0\), for instance \(a=100\), while the second term in Eq. (48) is bounded by
Lemma 5
Under Assumptions (A2)–(A5), as \(n\rightarrow \infty ,\)
Proof
First, \(\left| Y^{D}\left( x\right) -Y_{0}\left( x\right) \right| \) can be written as
which becomes the following via integration by parts
Next, from the strong approximation result in Eq. (30) and the first condition in Lemma 2, \( \sup \nolimits _{x\in \mathcal {I}_{n}}\left| Y^{D}\left( x\right) -Y_{0}\left( x\right) \right| \) is bounded by
thus completing the proof of the lemma.
Lemma 6
Under Assumptions (A2)–(A5), as \(n\rightarrow \infty ,\)
Proof
Based on the Rosenblatt transformation \(M(x,\varepsilon )\) defined in Eq. (29) and according to Lemma 1, the term \(\left| Y_{0}\left( x\right) -Y_{1}\left( x\right) \right| \) is bounded by
The next lemma expresses the distribution of \(Y_{1}\left( x\right) \) in terms of one-dimensional Brownian motion.
Lemma 7
The process \(Y_{1}\left( x\right) \) has the same distribution as \(Y_{2}\left( x\right) \) over \(x\in \mathcal {I}_{n}.\)
Proof
By definitions, \(Y_{1}\left( x\right) \) defined in (42) and \(Y_{2}\left( x\right) \) in (43) are Gaussian processes with zero mean and unit variance. They have the same covariance functions as
Hence, according to Itô’s Isometry Theorem, they have the same distribution.
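In generic form (with the exact integrands of \(Y_{1}\) and \(Y_{2}\) as in (42) and (43)), the isometry states that for deterministic \(f,g\in L^{2}\),

```latex
\mathbb{E}\left[ \int f(v)\,\mathrm{d}W(v)\int g(v)\,\mathrm{d}W(v)\right]
  =\int f(v)\,g(v)\,\mathrm{d}v ,
```

so any two centered Gaussian processes of the form \(x\mapsto \int f_{x}(v)\,\mathrm{d}W(v)\) with identical \(\int f_{x}(v)f_{x^{\prime }}(v)\,\mathrm{d}v\) share all finite-dimensional distributions, hence the same distribution over \(\mathcal {I}_{n}\).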
Lemma 8
Under Assumptions (A2)–(A5), as \(n\rightarrow \infty ,\)
Proof
By the aforementioned condition in Lemma 2 and Eq. (44), \(\sup \nolimits _{x\in \mathcal {I}_{n}}\left| Y_{2}\left( x\right) -Y_{3}\left( x\right) \right| \) is almost surely bounded by
where the term \(III\left( x\right) \) is bounded by
and the term \(IV\left( x\right) \) is bounded by
Putting together the above, one obtains that
completing the proof of this lemma.
Proof of Proposition 1
The absolute maximum of \(\left\{ Y_{3}\left( x\right) ,x\in \mathcal {I}_{n}\right\} \) is the same as that of
For the process \(\xi \left( x\right) =\int K^{*}(v-x)\mathrm{d}W_{n}\left( v\right) \), \(x\in \left[ ah_{1}^{-1}+1, bh_{1}^{-1}-1\right] \), the correlation function is
which implies that
Define next a Gaussian process \(\varsigma \left( t\right) ,0\le t\le T=T_{n}=\left( b-a\right) /h_{1}-2,\)
which is stationary with mean zero and variance one, and covariance function
with \(C=C_{K^{*\prime }}/2C_{K^{*}}\). Then applying Theorems 11.1.5 and 12.3.5 of Leadbetter et al. (1983), one has for \(h_{1}\rightarrow 0\) or \( T\rightarrow \infty \),
where \(a_{T}=\left( 2\log T\right) ^{1/2}\) and \(b_{T}=a_{T}+a_{T}^{-1}\log \left\{ \sqrt{C\left( K^{*}\right) }/\left( 2\pi \right) \right\} \). Note that for \( a_{h_{1}},b_{h_{1}}\) defined in (21), as \(n\rightarrow \infty \),
Hence, applying Slutsky’s Theorem twice, one obtains that
converges in distribution to the same limit as \(a_{T}\left\{ \sup \nolimits _{t\in \left[ 0,T\right] }\left| \varsigma \left( t\right) \right| -b_{T}\right\} \). Thus,
Next applying Lemma 8 and Slutsky’s Theorem, \(\forall z\in \mathbb {R},\)
Furthermore, applying Lemma 7 and Slutsky’s Theorem, the limiting distribution (50) is the same as
Finally, applying Lemmas 1 to 6 and Slutsky’s Theorem, one obtains
By taking \(1-\alpha =e^{-2e^{-z}}\) for \(\alpha \in \left( 0,1\right) \), (51) above implies that
Thus, an infeasible SCB for \(\rho \left( x\right) \) over \(\mathcal {I}_{n}\) is
which establishes Proposition 1.
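In practice, the band's critical value follows from inverting \(1-\alpha =e^{-2e^{-z}}\). The sketch below assumes the classical Leadbetter et al. (1983) form \(b_{T}=a_{T}+a_{T}^{-1}\log \left\{ \sqrt{C\left( K^{*}\right) }/\left( 2\pi \right) \right\} \) and a placeholder kernel constant; it illustrates the computation rather than reproducing the \(a_{h_{1}},b_{h_{1}}\) of (21).

```python
import numpy as np

def z_alpha(alpha):
    """Gumbel critical value: solves 1 - alpha = exp(-2 exp(-z))."""
    return -np.log(-0.5 * np.log(1.0 - alpha))

def scb_quantile(alpha, T, C_Kstar=1.0):
    """Sketch of the SCB width adjustment; C_Kstar stands in for the
    kernel-dependent constant C(K*) (an assumption, not the paper's value)."""
    aT = np.sqrt(2.0 * np.log(T))
    bT = aT + np.log(np.sqrt(C_Kstar) / (2.0 * np.pi)) / aT
    return bT + z_alpha(alpha) / aT
```

For \(\alpha =0.05\) the Gumbel value is \(z_{\alpha }\approx 3.66\), noticeably larger than the pointwise normal quantile 1.96, reflecting the simultaneity of the band.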
Proof of Theorem 1
Applying a Taylor expansion to \(\hat{\rho }_{\mathrm {LQ}}(x)-\tilde{\rho }_{\mathrm {LQ}}\left( x\right) \), its asymptotic order is determined by the slower of the rates at which \(\hat{\sigma }_{1}^{2}-\sigma _{1}^{2}\) and \(\hat{\sigma }_{\mathrm {SK}}^{2}(x)-\sigma ^{2}\left( x\right) \) vanish. While \(\hat{\sigma }_{1}^{2}-\sigma _{1}^{2}=\mathcal {O}_{p}\left( n^{-1/2}\right) \), \(\sup \nolimits _{x\in \left[ a+h_{2},b-h_{2}\right] }\left| \hat{\sigma }_{\mathrm {SK}}^{2}(x)-\sigma ^{2}\left( x\right) \right| \) is of order \(\mathcal {O}_{p}\left( n^{-1/2}h_{2}^{-1/2}\log ^{1/2}n\right) \) according to Cai and Yang (2015), and of order \(o_{p}\left( n^{-1/2}h_{1}^{-3/2}\log ^{-1/2}n\right) \) by applying (16). As (17) entails that \(\mathcal {I}_{n}\subset \left[ a+h_{2},b-h_{2}\right] \) for large enough n, one has
and thus \(\sup \nolimits _{x\in \mathcal {I}_{n}}\left| \hat{\rho }_{\mathrm {LQ}}(x)-\tilde{\rho }_{\mathrm {LQ}}\left( x\right) \right| =o_{p}\left( n^{-1/2}h_{1}^{-3/2}\log ^{-1/2}n\right) \). This completes the proof of the theorem.
Proof of Theorem 2
Proposition 1, Theorem 1, and repeated applications of Slutsky’s Theorem entail that
which yields the oracle SCB for \(\rho \left( x\right) \) over \(\mathcal {I} _{n} \) in Theorem 2.
Zhang, Y., Yang, L. A smooth simultaneous confidence band for correlation curve. TEST 27, 247–269 (2018). https://doi.org/10.1007/s11749-017-0543-5