Nonparametric trending regression with cross-sectional dependence

https://doi.org/10.1016/j.jeconom.2012.01.005

Abstract

Panel data, whose series length T is large but whose cross-section size N need not be, are assumed to have a common time trend of unknown form. The model includes additive, unknown, individual-specific components and allows for spatial or other cross-sectional dependence and/or heteroscedasticity. A simple smoothed nonparametric trend estimate is shown to be dominated by an estimate which exploits the availability of cross-sectional data. Asymptotically optimal bandwidth choices are justified for both estimates. Feasible optimal bandwidths, and feasible optimal trend estimates, are asymptotically justified, finite sample performance of the latter being examined in a Monte Carlo study. Potential extensions are discussed.

Introduction

Much econometric modelling of nonstationary time series employs deterministic trending functions that are polynomial, indeed frequently linear. However the penalties of mis-specifying parametric functions are well appreciated, and nonparametric modelling is increasingly widely accepted, at least in samples of reasonable size. The inability of polynomials to satisfactorily globally approximate general functions of time deters study of polynomial functions whose order increases slowly with sample size, and rather leads one to consider the possibility of a smooth trend mapped into the unit interval and approximated by a smoothed kernel regression. For example, Starica and Granger (2005) employed this approach in modelling series of stock prices. There is a huge literature on such fixed-design nonparametric regression, principally in the setting of a single time series.

Here we are concerned with panel data, where N series of length T have a common, nonparametric, time trend but also additive, fixed, individual effects, for which we have to correct before being able to form a trend estimate. We assume an asymptotic framework in which T is large, but not necessarily N, so that the cross-sectional mean at a given time point is not necessarily consistent for the trend, hence the recourse to smoothed nonparametric regression. A major feature of the paper is concern for possible cross-sectional correlation and/or heteroscedasticity. These influence the asymptotic variance of our trend estimate, and thence also the mean squared error and consequent optimal rules for bandwidth choice. The availability of cross-sectional data enables us to propose a trend estimate, based on the generalized least squares principle, that reduces the asymptotic variance. This estimate, along with its asymptotic variance (and that of the original trend estimate), depends on the cross-sectional covariance matrix. In general this is not wholly known, and possibly not known at all. Using residuals from the fitted trend, we consistently estimate its elements, so as to obtain a feasible improved trend estimate, and a consistent estimate of its variance, as well as feasible optimal bandwidths that are asymptotically equivalent to the infeasible versions. These results are valid with N remaining fixed as T increases, and they continue to hold if N is also allowed to increase, in which case there is a faster rate of convergence, and in this latter situation our results hold irrespective of whether or not the covariance matrix is finitely parameterized. Peter Phillips has made seminal contributions to research in both panel data and nonparametric estimation, among other areas.

Section 2 describes the basic model. In Section 3 we present a simple trend estimate and its mean squared error properties. Improved estimation is discussed in Section 4. In Section 5 optimal bandwidths are reported. Section 6 suggests estimates of the cross-sectional covariance matrix, with asymptotic justification for their insertion in the optimal bandwidths and improved trend estimates. Section 7 suggests some directions for further research. Proof details may be found in two appendices.

Section snippets

Panel data nonparametric model

We observe $y_{it}$, $i=1,\dots,N$, $t=1,\dots,T$, generated by $y_{it}=\alpha_i+\beta_t+x_{it}$, where the $\alpha_i$ and $\beta_t$ are unknown constants, and the $x_{it}$ are unobservable zero-mean random variables, uncorrelated and homoscedastic across time, but possibly correlated and heteroscedastic over the cross section. Thus we impose

Assumption 1

For all $i,t$, $E(x_{it})=0$; for all $i,j,t$ there exist finite constants $\omega_{ij}$ such that $E(x_{it}x_{jt})=\omega_{ij}$; and for all $i,j,t,u$, $E(x_{it}x_{ju})=0$, $t\neq u$.

Our focus is on estimating the time trend. Superficially this is represented in

Simple trend estimation

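We introduce a kernel function $k(u)$, $-\infty<u<\infty$, satisfying $\int_{-\infty}^{\infty}k(u)\,du=1$, and a positive scalar bandwidth $h=h_T$. Then with the abbreviation $k_{t\tau}=k\left(\frac{T\tau-t}{Th}\right)$, define the estimate $\tilde\beta(\tau)=\sum_{t=1}^{T}k_{t\tau}\bar y_{At}\big/\sum_{t=1}^{T}k_{t\tau}$, $\tau\in(0,1)$.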
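As a rough illustration of the estimate just defined, the following Python sketch computes the kernel-weighted average from an N×T array. It assumes that ȳ_At denotes the cross-sectional average of y_it over i at time t (its definition falls in the part of Section 2 not reproduced here) and that the individual effects are normalized so that they average out across i. The Gaussian kernel, the function names and the simulated data are illustrative choices, not taken from the paper.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, which integrates to one over the real line."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def simple_trend_estimate(y, tau, h, kernel=gaussian_kernel):
    """Kernel-weighted average of cross-sectional means at rescaled time tau.

    y   : array of shape (N, T), with y[i-1, t-1] = y_it
    tau : point in (0, 1) at which the trend is estimated
    h   : positive bandwidth
    """
    y = np.asarray(y, dtype=float)
    n_units, n_periods = y.shape
    t = np.arange(1, n_periods + 1)
    # Assumption: ybar_At is the cross-sectional mean at each time point.
    ybar_At = y.mean(axis=0)
    # Kernel weights k_{t,tau} = k((T*tau - t) / (T*h)).
    k = kernel((n_periods * tau - t) / (n_periods * h))
    return float(np.sum(k * ybar_At) / np.sum(k))


# Noise-free example: sinusoidal trend plus individual effects summing to zero.
T = 500
t = np.arange(1, T + 1)
beta = np.sin(2.0 * np.pi * t / T)
alpha = np.array([-1.0, 0.0, 1.0])
y = alpha[:, None] + beta[None, :]
estimate = simple_trend_estimate(y, tau=0.25, h=0.05)
```

Even without noise the estimate differs slightly from the true trend value at τ = 0.25, which reflects smoothing bias; a smaller h reduces that bias at the cost of higher variance when the data are noisy.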

Important measures of goodness of nonparametric estimates, which lead to optimal choices of bandwidth $h$, are mean squared error, i.e. $\mathrm{MSE}\{\tilde\beta(\tau)\}=E\{\tilde\beta(\tau)-\beta(\tau)\}^2$, and mean integrated squared error, i.e. $\mathrm{MISE}\{\tilde\beta\}=\int_0^1E\{\tilde\beta(u)-\beta(u)\}^2\,du$.

To approximate these we require conditions on $\beta$, $k$ and $h$.

Assumption 2

β(τ) is

Improved trend estimation

Improved estimation of the trend requires it to be identified in a different way from that in Section 2, in particular to shift its location. Consider the representation $y_{it}=\alpha_i^{(w)}+\beta_t^{(w)}+x_{it}$, where the bracketed superscript $w$ represents a vector $w=(w_1,\dots,w_N)'$ of weights, such that $w'\ell=1$, $w'\alpha^{(w)}=0$, where $\ell$ is an $N\times 1$ vector of 1’s. This represents a generalization of (2.1), (2.5), in which $w=(1/N,\dots,1/N)'$. It is convenient to write (4.1), for $i=1,\dots,N$, in $N$-dimensional column vector form as yt=α(w)+βt(
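The role of the weight vector can be illustrated numerically. The excerpt does not show the improved estimate itself, so the sketch below relies only on a standard fact: among weights that sum to one, the choice proportional to Ω⁻¹ℓ (with ℓ the N×1 vector of ones and Ω the cross-sectional covariance matrix) minimizes the variance w'Ωw of the weighted cross-sectional average of the errors. That this is the weighting behind the paper's improved estimate is an assumption, suggested by the Introduction's reference to the generalized least squares principle. The function names and the example matrix are hypothetical.

```python
import numpy as np


def gls_weights(omega):
    """Weights minimizing w' Omega w subject to the weights summing to one."""
    omega = np.asarray(omega, dtype=float)
    ones = np.ones(omega.shape[0])
    v = np.linalg.solve(omega, ones)  # Omega^{-1} * ones, without forming the inverse
    return v / (ones @ v)


def weighted_variance(w, omega):
    """Variance w' Omega w of the weighted cross-sectional average of the errors."""
    w = np.asarray(w, dtype=float)
    return float(w @ np.asarray(omega, dtype=float) @ w)


# Hypothetical positive definite covariance matrix with heteroscedasticity
# and cross-sectional correlation.
omega = np.array([[1.0, 0.5, 0.0],
                  [0.5, 2.0, 0.3],
                  [0.0, 0.3, 4.0]])

w_equal = np.full(3, 1.0 / 3.0)  # equal weights, the special case w = (1/N, ..., 1/N)
w_gls = gls_weights(omega)

var_equal = weighted_variance(w_equal, omega)
var_gls = weighted_variance(w_gls, omega)
```

Comparing `var_gls` with `var_equal` shows how much variance is saved by down-weighting noisy or highly correlated units; with Ω proportional to the identity matrix the two sets of weights coincide.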

Optimal bandwidth choice

A key question in implementing either β̃ or ρ̃ is the choice of bandwidth h. Choices that are optimal in the sense of minimizing asymptotic MSE or MISE are conventional. The following theorem differs only from well known results in indicating the dependence of the optimal choices on Ω and N, and so again no proof is given.

Theorem 3

Let Assumptions 1-5 hold. The $h$ minimizing asymptotic MSE and MISE of $\tilde\beta$ are respectively $h_{\beta,\mathrm{MSE}}(\tau)=\{\kappa\,\ell'\Omega\ell/(TN^2\chi^2\zeta(\tau))\}^{1/5}$, hβ
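A minimal sketch of the MSE-optimal bandwidth for the simple estimate follows. It reads the theorem's expression as h = {κ ℓ'Ωℓ / (T N² χ² ζ(τ))}^{1/5}, with ℓ the N×1 vector of ones, and assumes the conventional definitions κ = ∫k(u)²du, χ = ∫u²k(u)du and ζ(τ) = β''(τ)², none of which appear in this excerpt. Under those assumptions the asymptotic MSE is κ ℓ'Ωℓ/(N²Th) + h⁴χ²ζ(τ)/4, and the formula is its minimizer. The Gaussian-kernel constants and all function names are illustrative.

```python
import numpy as np

# Constants for the standard normal kernel (an illustrative choice):
# kappa = integral of k(u)^2, chi = integral of u^2 k(u).
KAPPA_GAUSS = 1.0 / (2.0 * np.sqrt(np.pi))
CHI_GAUSS = 1.0


def asymptotic_mse(h, kappa, chi, zeta_tau, omega, T):
    """Assumed asymptotic MSE: variance term plus squared-bias term."""
    omega = np.asarray(omega, dtype=float)
    N = omega.shape[0]
    ones = np.ones(N)
    variance = kappa * (ones @ omega @ ones) / (N ** 2 * T * h)
    squared_bias = 0.25 * h ** 4 * chi ** 2 * zeta_tau
    return float(variance + squared_bias)


def mse_optimal_bandwidth(kappa, chi, zeta_tau, omega, T):
    """Closed-form minimizer of asymptotic_mse with respect to h."""
    omega = np.asarray(omega, dtype=float)
    N = omega.shape[0]
    ones = np.ones(N)
    ratio = kappa * (ones @ omega @ ones) / (T * N ** 2 * chi ** 2 * zeta_tau)
    return float(ratio ** 0.2)


# Example with hypothetical inputs: four uncorrelated units, T = 1000,
# and a squared second derivative of the trend equal to 1 at the point of interest.
omega = np.eye(4)
h_opt = mse_optimal_bandwidth(KAPPA_GAUSS, CHI_GAUSS, 1.0, omega, T=1000)
```

The bandwidth shrinks at rate T^{-1/5} and, through ℓ'Ωℓ/N², also reflects the strength of cross-sectional dependence: stronger positive correlation raises the variance of the cross-sectional mean and therefore calls for more smoothing.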

Feasible optimal bandwidth choice and trend estimation

In practice the optimal bandwidths of the previous section cannot be computed. The constants κ and χ are trivially calculated, but ζ(τ) and ξ are unknown. Discussion of their estimation can be found in the nonparametric smoothing literature, see e.g. Gasser et al. (1991), and there is nothing about our setting to require additional treatment here, apart from the improved estimation possible by averaging over the cross section. More notable is the need to approximate the partly or wholly unknown
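The Introduction notes that the elements of the cross-sectional covariance matrix are estimated consistently from residuals of the fitted trend. A minimal sketch of one natural version of that idea is given below: subtract a fitted common trend, remove each unit's time average to eliminate the individual effect, and take the sample second-moment matrix of what remains. This unrestricted estimate is an assumption made for illustration; the paper's own estimates (which may exploit a parametric structure for Ω) are not shown in this excerpt, and the function name and simulated data are hypothetical.

```python
import numpy as np


def estimate_omega(y, trend_hat):
    """Sample covariance of residuals across the cross section.

    y         : array of shape (N, T)
    trend_hat : array of shape (T,), fitted common trend at t = 1, ..., T
    """
    y = np.asarray(y, dtype=float)
    trend_hat = np.asarray(trend_hat, dtype=float)
    resid = y - trend_hat[None, :]
    # Demeaning over time removes the individual-specific effect of each unit.
    resid = resid - resid.mean(axis=1, keepdims=True)
    return resid @ resid.T / y.shape[1]


# Simulated example with a known covariance matrix. The true trend stands in
# for a fitted trend to keep the example short.
rng = np.random.default_rng(0)
omega_true = np.array([[1.0, 0.5],
                       [0.5, 2.0]])
T = 20000
chol = np.linalg.cholesky(omega_true)
x = chol @ rng.standard_normal((2, T))  # errors independent over time
trend = np.sin(2.0 * np.pi * np.arange(1, T + 1) / T)
alpha = np.array([-1.0, 1.0])
y = alpha[:, None] + trend[None, :] + x
omega_hat = estimate_omega(y, trend)
```

In practice the fitted trend would come from a preliminary kernel estimate, and the resulting matrix could then be inserted into the bandwidth formulae and the improved trend estimate.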

Monte Carlo study of finite sample performance

As always when large sample asymptotic results are presented, the issue of finite-sample relevance arises. In the present case, one interesting question is the extent to which $\hat\rho(\tau)$ matches the efficiency of $\tilde\rho(\tau)$, and whether it is actually better than $\tilde\beta(\tau)$, given the sampling error in estimating $\Omega$. We study this question by Monte Carlo simulations in the case where $\Omega$ has the factor structure (4.19).

In (2.1), we thus take $x_{it}=b_i\eta_i+a\varepsilon_{it}$, where the $\eta_i$ and $\varepsilon_{it}$ have mean zero and variance 1, and

Further directions for research

  1. Our asymptotic variance formulae for $\hat\beta(\tau)$ and $\hat\rho(\tau)$ appear also in central limit theorems, under some additional conditions; indeed one could develop joint central limit theorems for both $\tilde\beta(\tau)$ and $\tilde\rho(\tau)$ at finitely many, $r$, fixed frequencies $\tau_1,\dots,\tau_r$, with asymptotic independence across the $\tau_i$. When the bias is negligible relative to the standard deviation, the convergence rate will be $(Th)^{1/2}$ when $N$ is fixed, and faster if $N$ is allowed to increase with $T$. We could also develop a central limit

Acknowledgements

This research was supported by ESRC Grant RES-062-23-0036. I am grateful for the helpful comments of a referee and Jungyoon Lee, and to the latter also for carrying out the simulations.

References (10)

  • P.M. Robinson, Asymptotic properties of nonparametric regression with spatial data, Journal of Econometrics (2011)
  • J. Bai et al., Determining the number of factors in approximate factor models, Econometrica (2002)
  • J.K. Benedetti, On the nonparametric estimation of regression functions, Journal of the Royal Statistical Society, Series B (1977)
  • T. Gasser et al., A flexible and fast method for automatic smoothing, Journal of the American Statistical Association (1991)
  • J.D. Hart et al., Kernel regression estimation using repeated measurements data, Journal of the American Statistical Association (1986)
There are more references available in the full text version of this article.
