Large panels with common factors and spatial correlation☆
Introduction
Over the past few years there has been a growing literature, both empirical and theoretical, on econometric analysis of panel data models with cross-sectionally dependent error processes. Such cross-correlations can arise for a variety of reasons, such as omitted common factors, spatial spill-overs, and interactions within socioeconomic networks. Conditioning on variables specific to the cross-section units alone does not deliver cross-section error independence, an assumption required by the standard literature on panel data models. In the presence of such dependence, conventional panel estimators such as fixed or random effects can result in misleading inference and even inconsistent estimators (Phillips and Sul, 2003). Further, conventional panel estimators may be inconsistent if regressors are correlated with unobserved common factors that might be causing the error cross-section dependence (Andrews, 2005).
Currently, there are two main strands in the literature for dealing with error cross-section dependence in panels where $N$ is large relative to $T$, namely the residual multifactor and the spatial econometric approaches. The multifactor approach assumes that the cross-dependence can be characterized by a finite number of unobserved common factors, possibly due to economy-wide shocks that affect all units, albeit with different intensities. Under this framework, the error term is a linear combination of a few common time-specific effects with heterogeneous factor loadings plus an idiosyncratic (individual-specific) error term. Estimation of a panel with such a multifactor residual structure can be addressed by using statistical techniques commonly adopted in factor analysis, such as maximum likelihood (Robertson and Symons, 2000, Robertson and Symons, 2007) and principal components procedures (Coakley et al., 2002, Bai, 2009). Recently, Pesaran (2006) has suggested an estimation method, referred to as Common Correlated Effects (CCE), that consists of approximating the linear combinations of the unobserved factors by cross-section averages of the dependent and explanatory variables and then running standard panel regressions augmented with these cross-section averages. An advantage of this approach is that it yields consistent estimates under a variety of situations, such as serial correlation in errors, unit roots in the factors and possible contemporaneous dependence of the observed regressors with the unobserved factors (Coakley et al., 2006, Kapetanios and Pesaran, 2007, Kapetanios et al., 2011).
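The CCE idea described above can be illustrated with a minimal sketch of a mean group variant: each unit's regression is augmented with an intercept and the cross-section averages of the dependent and explanatory variables, and the unit-specific slopes are then averaged. The function name `cce_mean_group`, the array layout, and the simulated one-factor design are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

def cce_mean_group(y, X):
    """Sketch of a CCE mean group estimator (illustrative, not the authors' code).

    y : array of shape (N, T), dependent variable
    X : array of shape (N, T, k), unit-specific regressors
    Returns the average slope vector (k,) and the unit slopes (N, k).
    """
    N, T, k = X.shape
    # Augmentation terms: intercept plus cross-section averages of y and X
    H = np.column_stack([np.ones(T), y.mean(axis=0), X.mean(axis=0)])
    # Residual-maker matrix that partials out the augmentation terms
    M = np.eye(T) - H @ np.linalg.pinv(H)
    b_i = np.empty((N, k))
    for i in range(N):
        # Unit-by-unit least squares on the partialled-out data
        b_i[i] = np.linalg.lstsq(M @ X[i], M @ y[i], rcond=None)[0]
    return b_i.mean(axis=0), b_i

# Hypothetical design: one unobserved factor that drives both x and y
rng = np.random.default_rng(0)
N, T = 100, 50
f = rng.standard_normal(T)
gamma = 1.0 + 0.2 * rng.standard_normal(N)   # factor loadings in the errors
Gamma = 1.0 + 0.2 * rng.standard_normal(N)   # factor loadings in the regressor
beta = 1.0 + 0.1 * rng.standard_normal(N)    # heterogeneous slopes, mean 1
x = Gamma[:, None] * f + rng.standard_normal((N, T))
y = beta[:, None] * x + gamma[:, None] * f + rng.standard_normal((N, T))

b_mg, b_i = cce_mean_group(y, x[:, :, None])
print(b_mg)
```

Because the regressor is correlated with the factor in this design, a regression that ignores the factor would generally be biased, whereas the augmented regressions should recover an average slope close to 1.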
The spatial approach assumes that the structure of the cross-section correlation is related to location and distance among units, defined according to a pre-specified metric. Proximity need not be measured in terms of physical space, but can be defined using other types of metrics, such as economic (Conley, 1999, Pesaran et al., 2004), policy, or social distance (Conley and Topa, 2002). Hence, cross-section correlation is represented by means of a spatial process, which explicitly relates each unit to its neighbours (Whittle, 1954). Estimation of panels with spatially correlated errors can be based on maximum likelihood (ML) techniques (Lee, 2004), or on the generalized method of moments (GMM) (Kelejian and Prucha, 1999, Lee, 2007, Kelejian and Prucha, 2009). Recently, non-parametric methods based on heteroskedasticity and autocorrelation consistent estimators applied to spatial models have also been proposed (Conley, 1999, Kelejian and Prucha, 2007, Bester et al., 2009).
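As a concrete, hedged illustration of a spatial process that relates each unit to its neighbours, the sketch below generates spatial autoregressive (SAR) errors $e_t = \rho W e_t + u_t$. The circular neighbourhood structure, the row-normalised weights matrix `W`, and the value of `rho` are all assumptions chosen for simplicity, not the specification used in the paper.

```python
import numpy as np

def circular_weights(N):
    """Row-normalised weights: each unit's neighbours are the units to its left and right."""
    if N < 3:
        raise ValueError("need at least 3 units for two distinct neighbours")
    W = np.zeros((N, N))
    idx = np.arange(N)
    W[idx, (idx - 1) % N] = 0.5
    W[idx, (idx + 1) % N] = 0.5
    return W

def sar_errors(W, rho, T, rng):
    """Draw an (N, T) array of SAR errors by solving (I - rho W) e_t = u_t for each t."""
    N = W.shape[0]
    u = rng.standard_normal((N, T))
    return np.linalg.solve(np.eye(N) - rho * W, u)

rng = np.random.default_rng(1)
W = circular_weights(20)
e = sar_errors(W, rho=0.4, T=5000, rng=rng)
# Sample correlation between two adjacent units
corr_neighbours = np.corrcoef(e[0], e[1])[0, 1]
print(corr_neighbours)
```

With a positive `rho`, adjacent units end up positively correlated even though the underlying shocks `u` are independent, which is the kind of dependence that ML, GMM, or spatial HAC methods are designed to handle.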
In this paper we build on the existing literature and consider a general panel data model where error cross-section dependence is due to unobserved common factors and/or spatial dependence, while also allowing the errors to be serially correlated. We focus on estimation and inference procedures that are robust to the presence of various forms of cross-sectional and temporal dependencies in the error processes. Robust methods are needed because the source and extent of error cross-section dependence are often unknown. The error cross-section dependence can take many different forms and its nature could differ at micro and macro levels. For instance, at a micro level, individual consumption behaviour can be influenced by economy-wide factors, such as changes in taxation and interest rates, and by local neighbourhood effects such as keeping up with the Joneses (Cowan et al., 2004). In macroeconomics, several studies have argued that business cycle fluctuations could be the result of both strategic interactions and aggregate technological shocks (Cooper and Haltiwanger, 1996). Our econometric specification, by allowing for the presence of both sources of contemporaneous error correlations, is sufficiently general and includes the models proposed in the literature as special cases.
We focus on estimation of slope coefficients under a number of different specifications. Initially, we concentrate on a panel data model without unobserved factors where the errors are spatially dependent and possibly serially correlated, and derive the asymptotic distribution of the mean group and pooled estimators, under alternative assumptions regarding the slope coefficients. In the presence of heterogeneous slopes, we show that the non-parametric approach advanced by Pesaran (2006) continues to be applicable and can be used to obtain standard errors that are robust to both spatial and serial error correlations. However, in the case of homogeneous slopes the CCE procedure will not be applicable. In this case we propose a non-parametric variance matrix estimator that adapts the heteroskedasticity and autocorrelation consistent (HAC) procedure of Newey and West (1987) to allow for spatial effects along the lines recently advanced by Kelejian and Prucha (2007). We refer to this variance estimator as the spatial heteroskedasticity and autocorrelation consistent (SHAC) estimator. We then consider the more general case where the error term in the panel data model is composed of a multifactor structure and a spatial process, and show that Pesaran’s CCE approach continues to be valid and yields consistent estimates of the slope coefficients and their standard errors. We also show how to obtain consistent estimates of the errors in the panel to be used in tests of cross-section independence, and for further analysis of the underlying spatial processes.
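For the heterogeneous-slope case, a non-parametric variance estimator for the mean group estimator can be built from the dispersion of the unit-specific slope estimates alone. The sketch below assumes the commonly used form $\hat{V} = \frac{1}{N(N-1)} \sum_i (\hat{b}_i - \hat{b}_{MG})(\hat{b}_i - \hat{b}_{MG})'$; the function name `mean_group` and the toy input are illustrative, and the exact estimator used in the paper may differ in detail.

```python
import numpy as np

def mean_group(b_i):
    """Mean group estimate and a non-parametric variance estimate.

    b_i : array of shape (N, k) holding unit-specific slope estimates.
    Returns (b_mg, V), with V the estimated (k, k) variance of b_mg.
    """
    N = b_i.shape[0]
    b_mg = b_i.mean(axis=0)
    dev = b_i - b_mg
    # Only the cross-section dispersion of the slopes is used;
    # no model for the error correlation structure is required.
    V = dev.T @ dev / (N * (N - 1))
    return b_mg, V

# Toy input: three units, one slope each
b_mg, V = mean_group(np.array([[1.0], [2.0], [3.0]]))
se = np.sqrt(np.diag(V))
print(b_mg, se)
```

The design choice worth noting is that the estimator never touches the residuals, which is why it does not need to take a stand on the form of the spatial or serial error correlation.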
Using Monte Carlo techniques, we investigate the small sample performance of the estimators under various patterns of error cross-section dependence, with and without error serial correlation, under both heterogeneous and homogeneous slopes. We examine the performance of the alternative estimators when the errors only display spatial dependence, when they are subject to unobserved common factors as well as spatial dependence, and in the case where the source of cross-section dependence changes over time. Our results indicate that the mean group and pooled estimators with robust standard errors do work well under certain regularity conditions outlined in our theorems. However, under slope homogeneity or in the presence of unobserved common factors these estimators fail to provide the correct inference. The results also document the tendency of tests based on HAC-type standard errors to over-reject the null hypothesis in small samples, even when the error cross-section dependence is purely spatial. In contrast, our Monte Carlo experiments clearly show that augmenting panel regressions with cross-section averages, as formulated by the CCE procedure, eliminates the effects of all forms of spatial and temporal correlations, irrespective of whether these are due to spatial and/or unobserved common factors. The small sample properties of CCE estimators do not seem to be affected by the heterogeneity assumptions on slope coefficients, or by the presence of error serial correlations. It is this level of robustness of the CCE estimator which particularly commends it for use in empirical analysis.
The plan of the remainder of the paper is as follows: Section 2 sets out a panel regression model with unobserved common factors and general spatial and temporal error processes. Section 3 develops the asymptotic distribution of the mean group and pooled estimators in the presence of spatial error dependence and error serial correlation. Section 4 considers the more general case where the errors also contain unobserved common factors, and establishes the validity of the CCE estimators for this class of models. Consistent estimation of the residuals from such models is considered in Section 5, where the necessary identification conditions are stated. Section 6 describes the Monte Carlo experiments and reports the results. Section 7 ends with some concluding remarks.
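Notation: $\lambda_1(\mathbf{A}) \geq \dots \geq \lambda_n(\mathbf{A})$ are the eigenvalues of a matrix $\mathbf{A} \in \mathbb{M}^{n \times n}$, where $\mathbb{M}^{n \times n}$ is the space of real $n \times n$ matrices. $\mathbf{A}^{+}$ denotes a generalized inverse of $\mathbf{A}$. The column norm of $\mathbf{A}$ is $\|\mathbf{A}\|_1 = \max_{1 \leq j \leq n} \sum_{i=1}^{n} |a_{ij}|$. The row norm of $\mathbf{A}$ is $\|\mathbf{A}\|_\infty = \max_{1 \leq i \leq n} \sum_{j=1}^{n} |a_{ij}|$. The Euclidean norm of $\mathbf{A}$ is $\|\mathbf{A}\|_2 = [\mathrm{Tr}(\mathbf{A}\mathbf{A}')]^{1/2}$. $K$ is used for a fixed positive constant. $(N,T) \xrightarrow{j} \infty$ denotes $N$ and $T$ tending to infinity jointly but in no particular order.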
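The matrix norms named in the notation can be checked numerically. The sketch below assumes the standard definitions: the column norm is the maximum absolute column sum, the row norm is the maximum absolute row sum, and the Euclidean norm of a matrix is taken to be $[\mathrm{Tr}(\mathbf{A}\mathbf{A}')]^{1/2}$ (the Frobenius norm). The example matrix is arbitrary.

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

# Column norm: largest absolute column sum
col_norm = np.linalg.norm(A, 1)
# Row norm: largest absolute row sum
row_norm = np.linalg.norm(A, np.inf)
# Euclidean (Frobenius) norm: square root of trace(A A')
euc_norm = np.linalg.norm(A, "fro")

# The same quantities computed directly from the definitions
col_direct = np.abs(A).sum(axis=0).max()
row_direct = np.abs(A).sum(axis=1).max()
euc_direct = np.sqrt(np.trace(A @ A.T))

print(col_norm, row_norm, euc_norm)
```

Bounded row and column norms of a spatial weights matrix are a common way of limiting the strength of cross-section dependence, which is presumably why both norms are defined here.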
Section snippets
Heterogeneous panels with unobserved common factors and spatial error correlation
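We begin with a general specification where the dependent variable is a function of a set of individual-specific regressors, a linear combination of common observed and unobserved factors, and includes errors that are serially and spatially correlated. Let $y_{it}$ be the observation on the $i$th cross-section unit at time $t$ for $i = 1, 2, \dots, N$; $t = 1, 2, \dots, T$, and suppose that it is generated as $$y_{it} = \boldsymbol{\alpha}_i' \mathbf{d}_t + \boldsymbol{\beta}_i' \mathbf{x}_{it} + \boldsymbol{\gamma}_i' \mathbf{f}_t + e_{it}, \qquad (1)$$ where $\mathbf{d}_t$ is a vector of observed common effects, and $\mathbf{x}_{it}$ is a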
Estimating panels with spatial error correlation
The literature on spatial econometrics typically considers the problem of spatial dependence under strong assumptions of homogeneity and temporal independence. Only recently has a strand of this literature considered the incorporation of unobserved heterogeneity in spatial panel data models, where $N$ is usually assumed to be large relative to $T$. Baltagi et al. (2003) and Kapoor et al. (2007) have focused on ML and GMM estimation of panels where the error term is the sum of an
Estimating panels with unobserved common factors and spatial error correlation
We now turn to the estimation of the slope coefficients in the context of panels with both common factors and spatial error dependence. We restrict our attention to the CCE approach since, as compared to other existing methods, it is simple to apply and has been shown to be robust to the choice of $m$ (the number of common factors), the temporal dynamics of unobserved common factors, and the idiosyncratic error. The idea underlying this approach is that, as far as estimation of the slope
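To make the setting of this section concrete, the following sketch applies a pooled CCE-type estimator to simulated data whose errors combine one common factor with a spatial autoregressive component. Everything here is an illustrative assumption rather than the paper's design: the function name `cce_pooled`, the equal weighting of units, the circular weights matrix, and the parameter values.

```python
import numpy as np

def cce_pooled(y, X):
    """Sketch of a pooled CCE estimator with equal weights across units.

    y : (N, T) dependent variable; X : (N, T, k) regressors.
    """
    N, T, k = X.shape
    # Intercept plus cross-section averages of y and X
    H = np.column_stack([np.ones(T), y.mean(axis=0), X.mean(axis=0)])
    M = np.eye(T) - H @ np.linalg.pinv(H)
    A = np.zeros((k, k))
    c = np.zeros(k)
    for i in range(N):
        MX = M @ X[i]
        A += X[i].T @ MX   # accumulates X_i' M X_i
        c += MX.T @ y[i]   # accumulates X_i' M y_i
    return np.linalg.solve(A, c)

rng = np.random.default_rng(0)
N, T, rho = 60, 40, 0.4

# Hypothetical circular neighbourhood: left and right neighbours, row-normalised
W = np.zeros((N, N))
idx = np.arange(N)
W[idx, (idx - 1) % N] = 0.5
W[idx, (idx + 1) % N] = 0.5
spatial = np.linalg.solve(np.eye(N) - rho * W, rng.standard_normal((N, T)))

f = rng.standard_normal(T)                   # unobserved common factor
gamma = 1.0 + 0.5 * rng.standard_normal(N)   # loadings in the errors
Gamma = 1.0 + 0.5 * rng.standard_normal(N)   # loadings in the regressor
x = Gamma[:, None] * f + rng.standard_normal((N, T))
y = 1.0 * x + gamma[:, None] * f + spatial   # homogeneous slope equal to 1

b_p = cce_pooled(y, x[:, :, None])
print(b_p)
```

In this design the spatial component is independent of the regressor, so it affects precision rather than consistency, while the factor is handled by the cross-section averages.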
Residuals from CCE regression
We now consider the consistent estimation of regression errors in model (1). Estimation of these errors is needed for computing tests of error cross-section independence, or when the objects of interest are the coefficients of the spatial process. Before continuing, without loss of generality, we specify some further assumptions on the observed and unobserved common factors. In particular: Assumption 11 , for , and the vector of observed common factors, , is distributed
Monte Carlo design
This section provides Monte Carlo evidence on the small sample properties of our estimators, under a range of assumptions on the stochastic process generating the error terms. The study comprises three sets of experiments. In the first set, we consider a panel where the error term is generated by a SAR process with no common factors. In the second set, we assume that the error process is the orthogonal sum of a factor structure and a spatial process, and allow the dependent variable
Concluding remarks
The main aim of this paper has been to consider estimation of a panel regression model under a number of different specifications of cross-section error correlations, such as spatial and/or common factor models. We have derived the asymptotic distributions of the mean group and pooled estimators for a panel regression model where the source of error cross-section dependence is purely spatial or results from omitted unobserved factors, or both. In each case we have distinguished between panels
References (48)
- et al. Testing panel data regression models with spatial error correlation. Journal of Econometrics (2003)
- et al. Unobserved heterogeneity in panel time series. Computational Statistics and Data Analysis (2006)
- GMM estimation with cross sectional dependence. Journal of Econometrics (1999)
- et al. An unbalanced spatial panel data approach to US state tax competition. Economics Letters (2005)
- et al. Panels with nonstationary multifactor error structures. Journal of Econometrics (2011)
- et al. Panel data models with spatially correlated error components. Journal of Econometrics (2007)
- et al. HAC estimation in a spatial framework. Journal of Econometrics (2007)
- GMM and 2SLS estimation of mixed regressive, spatial autoregressive models. Journal of Econometrics (2007)
- et al. Estimation of spatial autoregressive panel data models with fixed effects. Journal of Econometrics (2010)
- et al. Estimating long-run relationships from dynamic heterogeneous panels. Journal of Econometrics (1995)
- Maximum likelihood factor analysis with rank-deficient sample covariance matrices. Journal of Multivariate Analysis
- Cross section regression with common shocks. Econometrica
- Spatial Econometrics: Methods and Models
- Panel data models with interactive fixed effects. Econometrica
- Student’s t-test for Gaussian scale mixtures. Journal of Mathematical Sciences
- Matrix Mathematics: Theory, Facts, and Formulas with Application to Linear Systems Theory
- A Course in Probability Theory
- Socio-economic distance and spatial patterns in unemployment. Journal of Applied Econometrics
- Evidence on macroeconomic complementarities. The Review of Economics and Statistics
- Waves in consumption with interdependence among consumers. Canadian Journal of Economics
☆ We are grateful to the Editor (Cheng Hsiao), an Associate Editor and three anonymous referees, Badi Baltagi, Alexander Chudik and George Kapetanios for helpful comments and suggestions. Elisa Tosetti acknowledges financial support from ESRC (Ref. no. RES-061-25-0317).