Testing multivariate economic restrictions using quantiles: The example of Slutsky negative semidefiniteness

https://doi.org/10.1016/j.jeconom.2015.07.004

Abstract

This paper is concerned with testing a core economic restriction, negative semidefiniteness of the Slutsky matrix. We consider a system of nonseparable structural equations with infinite dimensional unobservables, and employ quantile regression methods because they allow us to utilize the entire distribution of the data. Difficulties arise because the restriction involves several equations, while the quantile is a univariate concept. We establish that we may use quantiles of linear combinations of the dependent variable, develop a new empirical process based test that applies kernel quantile estimators, and investigate its finite and large sample behavior. Finally, we apply all concepts to Canadian microdata.

Introduction

Economic theory yields strong implications for the actual behavior of individuals. In the standard utility maximization model, for instance, economic theory places strong restrictions on individual responses to changes in prices and wealth, the so-called integrability constraints. These restrictions are inherently restrictions at the individual level: They have to hold for every preference ordering and every single individual, at any price-wealth combination. Other than obeying these restrictions, the individuals’ idiosyncratic preference orderings may differ substantially. Indeed, standard parametric cross section mean regression methods applied to consumer demand data often exhibit an $R^2$ between 0.1 and 0.2. Today, the consensus is that the majority of the unexplained variation is precisely due to unobserved preference heterogeneity. For this reason, the literature has become increasingly interested in exploiting all the information about unobserved heterogeneity contained in the data, in particular using the quantiles of the dependent variable.

To lay out our model, let y denote the $(L-1)$-vector of quantities demanded. At this stage, we have already imposed the adding up constraint (i.e., out of L goods we have deleted the last). Let p denote the L vector of prices, and x denote income (total expenditure). For every individual, define the cost function C(p,u) to give the minimum cost to attain utility level u facing the L-vector of prices p, and given income (more precisely, total outlay) x. The Slutsky negative semidefiniteness restriction arises from the fact that the cost function is concave in prices, and hence its matrix of second derivatives with respect to prices is negative semidefinite (nsd, henceforth). For brevity, we will sometimes equate negative semidefiniteness with “rationality”, even though it is only one facet of rationality in this setup with a linear budget constraint.
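As a purely illustrative sketch of what the restriction means for a single consumer, the snippet below builds the Slutsky matrix by finite differences, using the standard textbook form S = D_p y + (∂y/∂x) y', and checks that the quadratic form b'Sb is nonpositive over a grid of unit vectors b. The Cobb-Douglas demand function, the parameter values, and all function names are assumptions made for this example; they are not taken from the paper.

```python
import math

def demand(p, x, a=(0.3, 0.5)):
    """Hypothetical Cobb-Douglas demand for the first L-1 = 2 of L = 3 goods."""
    return [a[j] * x / p[j] for j in range(2)]

def slutsky(p, x, h=1e-5):
    """Central-difference Slutsky matrix S[j][k] = dy_j/dp_k + (dy_j/dx) * y_k."""
    y = demand(p, x)
    dy_dx = [(u - l) / (2 * h)
             for u, l in zip(demand(p, x + h), demand(p, x - h))]
    S = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(2):
        p_up = list(p)
        p_dn = list(p)
        p_up[k] += h
        p_dn[k] -= h
        y_up, y_dn = demand(p_up, x), demand(p_dn, x)
        for j in range(2):
            S[j][k] = (y_up[j] - y_dn[j]) / (2 * h) + dy_dx[j] * y[k]
    return S

def max_quadratic_form(S, n_dirs=360):
    """Largest value of b'Sb over a grid of unit vectors b in R^2."""
    best = -math.inf
    for i in range(n_dirs):
        t = 2 * math.pi * i / n_dirs
        b = (math.cos(t), math.sin(t))
        q = sum(b[j] * S[j][k] * b[k] for j in range(2) for k in range(2))
        best = max(best, q)
    return best

S = slutsky(p=[1.0, 2.0], x=10.0)
worst = max_quadratic_form(S)  # negative here, so S is nsd on the grid
```

For this utility-maximizing consumer the largest quadratic form is negative, as the theory requires. The difficulty addressed in the paper is that such an individual-level matrix is never observed directly in a heterogeneous population.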

Obviously, this hypothesis has to hold for any preference ordering u. However, we do not observe the individual’s preference ordering u, and only observe a K dimensional vector of household covariates (denoted q). Specifically, we assume that we have n iid observations on individuals from an underlying heterogeneous population characterized by random variables U, Y, X, P, Q which have a nondegenerate joint distribution $F_{U,Y,X,P,Q}$.

The question of interest is now as follows: What can we learn from the observable part of this distribution, i.e., $F_{Y,X,P,Q}$, about whether the Slutsky matrix is negative semidefinite across a heterogeneous population, for all values of (p,x,u)? In Hoderlein (2011), we consider testing negative semidefiniteness in such a setting with mean and second moment regressions only. However, these lower order moment regressions have the disadvantage that they use only one feature of $F_{Y,X,P,Q}$, and not the entire distribution. Therefore, in this paper we propose to exploit the distributional information by using all the α-quantiles of the conditional distribution of observables, which (with varying α) employ all the information that may be obtained from the data about the economic hypothesis of interest.

There are two immediate difficulties now, and solving them is the major innovation this paper introduces. The first is how to relate a specific economic property in the (unobservable) world of nonseparable functions to observable regression quantiles. The second one is how to use quantiles in systems of equations. The solution for the second difficulty is to consider linear combinations of the dependent variable, i.e., $Y(b) = b'Y$ for all vectors b of unit length, and to consider the respective conditional α-quantiles of this quantity. This can be thought of as an analogue to the Cramér–Wold device, and is a strategy that is feasible more generally, e.g., when testing omission of variables. As b and α vary, we exploit the entire distribution of observables.
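The idea of scanning over directions b and quantile levels α can be sketched in a few lines. The example below is a simplified illustration only: it computes unconditional empirical quantiles of b'Y on simulated two-dimensional data, whereas the paper works with conditional quantiles given the regressors. The data, the grid of directions, and the function names are hypothetical.

```python
import math
import random

def empirical_quantile(values, alpha):
    """Smallest order statistic whose empirical CDF value is at least alpha."""
    s = sorted(values)
    idx = max(0, math.ceil(alpha * len(s)) - 1)
    return s[idx]

def directional_quantile(Y, b, alpha):
    """alpha-quantile of the scalar projection b'Y over the sample Y."""
    proj = [sum(bj * yj for bj, yj in zip(b, y)) for y in Y]
    return empirical_quantile(proj, alpha)

# Simulated bivariate data (an assumption for illustration, not the paper's data).
rng = random.Random(0)
Y = [(rng.gauss(0, 1), rng.gauss(0, 2)) for _ in range(500)]

alphas = (0.25, 0.5, 0.75)
# Eight unit vectors spread evenly over the circle.
directions = [(math.cos(2 * math.pi * i / 8), math.sin(2 * math.pi * i / 8))
              for i in range(8)]

# One scalar quantile per (direction, alpha) pair.
table = {(i, a): directional_quantile(Y, b, a)
         for i, b in enumerate(directions) for a in alphas}
```

Each entry of `table` is an ordinary univariate quantile, which is what makes the approach workable for a system of equations: the multivariate problem is reduced to a family of scalar problems indexed by (b, α).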

The solution to the first of these two difficulties obviously involves identifying assumptions. To this end, since we are dealing with nonseparable models, we require full conditional independence, i.e., we require that $U \perp (P,X) \mid Q$, or versions of this assumption that control for endogeneity. These assumptions are versions of the “selection on observables” assumptions in the treatment effect literature. Essentially they require that, in every subpopulation defined by Q=q, preferences as well as prices and income be independently distributed. Although endogeneity is not relevant for our application, our treatment covers the control function approach to handle endogeneity in nonseparable models discussed in Altonji and Matzkin (2005), Imbens and Newey (2009) or Hoderlein (2011), by simply adding endogeneity controls V to the set of household control variables Q. From now on, we denote by W the set of all observable right hand side variables, i.e., (P,X,Q), and potentially in addition V, if we are controlling for endogeneity.

Under this assumption and some regularity conditions, our first main contribution is as follows: We establish that the rationality hypothesis in the underlying population has a testable implication on the distribution of the data, specifically, on the conditional quantiles of linear combinations of the dependent variables. Consequently, we can test a null hypothesis in the underlying (unobservable) heterogeneous population model in the sense that a rejection of the testable implication leads also to a rejection of the original null hypothesis. While this procedure controls size, though likely conservative, it may suffer from low power: If we do not reject the implication, we cannot conclude that the data could not have been generated by some other, nonrational mechanism. This is the price we pay for being completely general, as the only material assumption that we require to relate the observable object and the underlying heterogeneous population is the conditional independence assumption $U \perp (P,X) \mid Q$, and no other material assumption on the functional form of demand or their distribution enters the model. In particular, we have not assumed any monotonicity or triangularity assumption; there can be infinitely many unobservables, and they can enter in arbitrarily complicated form; in a sense, every individual can have its own nonparametric utility function, and hence this framework is closer to a random functions setup. Assessing and testing the validity of the related weak axiom of revealed preferences in a similar random utility function setup has been investigated in Hoderlein and Stoye (2014), and Kitamura and Stoye (2014).

Our second main contribution is proposing a quantile regression based nonparametric test statistic. Specifically, we apply the sample counterparts principle to obtain a nonparametric test statistic of the testable implication, and derive its large sample properties. We show weak convergence of a corresponding standardized stochastic process to a Gaussian process and obtain an asymptotically valid hypothesis test. Moreover, we propose a bootstrap version of our test statistic. To avoid the generation of bootstrap observations under the null, we adapt the well known idea of residual bootstrap for our specific model and use a centered version of the stochastic process. Nonparametric tests involving quantiles are surprisingly scant, and we list the closest references in the following paragraph. Specifically, in a system of equations setup we are the first to propose a quantile based test of an economic hypothesis, and to implement such a test using real world data.
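To give a rough sense of what a kernel-based conditional quantile estimate looks like, the sketch below inverts a Nadaraya-Watson estimate of the conditional distribution function at a point w0. This is a generic textbook construction under stated assumptions (scalar regressor, Gaussian kernel, fixed bandwidth, simulated data); it is not claimed to be the exact estimator, standardization, or bootstrap scheme used in the paper, and all names are hypothetical.

```python
import math
import random

def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the weights."""
    return math.exp(-0.5 * u * u)

def kernel_conditional_quantile(y, w, w0, alpha, bandwidth):
    """Invert a kernel-weighted empirical CDF of y given w = w0 at level alpha."""
    weights = [gaussian_kernel((wi - w0) / bandwidth) for wi in w]
    total = sum(weights)
    cum = 0.0
    # Walk through the observations in increasing order of y and stop
    # once the accumulated normalized kernel weight reaches alpha.
    for yi, ki in sorted(zip(y, weights)):
        cum += ki / total
        if cum >= alpha:
            return yi
    return max(y)  # guard against rounding when alpha is close to 1

# Simulated data (assumption): y = 2 w + noise, so the conditional
# median of y given w = 0.5 is 1.0.
rng = random.Random(1)
w = [rng.uniform(0, 1) for _ in range(2000)]
y = [2.0 * wi + rng.gauss(0, 0.1) for wi in w]

q_lo = kernel_conditional_quantile(y, w, w0=0.5, alpha=0.1, bandwidth=0.05)
q_med = kernel_conditional_quantile(y, w, w0=0.5, alpha=0.5, bandwidth=0.05)
q_hi = kernel_conditional_quantile(y, w, w0=0.5, alpha=0.9, bandwidth=0.05)
```

In a setting like the one described above, `y` would be replaced by a projection b'Y and the estimate would be evaluated over a range of b and α at a fixed point of the regressors. Bandwidth choice matters in practice and is left arbitrary here.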

Our test is a pointwise test, meaning that it holds locally for $W = w_0$. The main reason for this is that we aim at a more detailed picture of possible rejections, providing a better description of the rationality of the population (e.g., one possible outcome is that negative semidefiniteness is rejected for 20% of the population, that is, at 20% of the representative positions at which the test is evaluated).

Literature: Testing the key integrability constraints that arise out of utility maximization dates back at least to the early work of Stone (1954), and has spurred the extensive research on (parametric) flexible functional form demand systems (e.g., the Translog, cf. Jorgenson et al. (1982), and the Almost Ideal, cf. Deaton and Muellbauer (1980)). Nonparametric analysis of some derivative constraints was performed by Stoker (1989) and Härdle et al. (1991), but none of these has its focus on modeling unobserved heterogeneity. More closely related to our approach is Lewbel (2001), who analyzes integrability constraints in a purely exogenous setting, but neither uses distributional information nor suggests or implements an actual test. An alternative method for checking some integrability constraints is revealed preference analysis, see Blundell et al. (2003), and references therein. An approach that combines revealed preference arguments with a demand function structure is Blundell et al. (2011). As a side result, that paper develops a test of the weak axiom of revealed preferences, but in contrast to our paper, it assumes a scalar unobservable that enters monotonically in a single equation setup.

While our approach extends earlier work on demand systems, it is very much a blueprint for testing all kinds of economic hypotheses in systems of equations. Due to the nonseparable framework we employ, our approach extends the recent work on nonseparable models, in particular Hoderlein (2011), Hoderlein and Mammen (2007), Imbens and Newey (2009), and Matzkin (2003). When it comes to dealing with unobserved heterogeneity, there are two strands in this literature: The first assumes triangularity and monotonicity in the unobservables (Chesher, 2003, Matzkin, 2003). The triangularity and monotonicity assumptions are, however, rather implausible for consumer demand, because in general the multivariate demand function is a nonmonotonic function of an infinite dimensional unobservable (the individual’s preference ordering), and all equations depend on this object.

Hence we follow the second route. Extending earlier work in Hoderlein (2011), Hoderlein and Mammen (2007) establish the interpretation of the derivative of the conditional quantile (a scalar valued function!) if there is more than one unobservable. The upshot of this work is that in a world with many different sources of unobserved heterogeneity, at best conditional average effects are identified, see also Altonji and Matzkin (2005) and Imbens and Newey (2009).

In the statistics literature, the closest work we are aware of includes the testing procedures proposed by Zheng (1998), Sun (2006), Escanciano and Velasco (2010), and Dette et al. (2011), which all consider some versions of tests on regression quantiles, but there is no clear direct relationship. Finally, Wolak (1991) considers testing of inequality constraints in nonlinear parametric models. These tests are very different from ours due to the general difference between testing in parametric and nonparametric environments.

In this paper we work with quantiles of univariate linear combinations of the multivariate observations. In the literature several different approaches to define quantiles of multivariate random variables have been suggested, see Barnett (1976), Serfling (2002) and Koenker (2005), for overviews, and Hallin et al. (2010) and Belloni and Winkler (2011) for more recent approaches.

Structure of the paper: The exposition of this paper is as follows. In the next section, we introduce our model formally, state some assumptions, and derive and discuss the main identification result. In the third section, we propose a nonparametric test for the economic hypothesis of Slutsky nsd based on the principle of sample counterparts, analyze its large sample behavior and propose a bootstrap procedure to derive the critical values. We investigate the performance of the bootstrap procedure for moderate sample sizes in a simulation study in Section 4. In the fifth section, we apply these concepts to Canadian expenditure data. The results are affirmative as far as the validity of the integrability conditions is concerned and demonstrate the advantages of our framework. A summary and an outlook conclude this paper, while the Appendix contains regularity assumptions, proofs, graphs and summary statistics.

Section snippets

Assumptions and main results

Our model of consumer demand in a heterogeneous population consists of several building blocks. As is common in consumer demand, we assume that, for a fixed preference ordering, there is a causal relationship between physical quantities, a real valued random L-vector denoted by Y, and regressors of economic importance, namely prices P and total expenditure X, real valued random vectors of length L and 1, respectively, stemming from utility maximization subject to a linear budget constraint. More

Null hypothesis and testable implications

Our aim is to derive a hypothesis test for the original null hypothesis H02 as defined before, i.e., negative semidefiniteness of the Slutsky matrix in a heterogeneous population. Theorem 1 provides us with a testable implication of H02, denoted by H03 below. Throughout the section we assume we have observed independent data $(Y_i, P_i, X_i)$, $i = 1, \dots, n$, with the same distribution as $(Y, P, X) \in \mathbb{R}^{L-1} \times \mathbb{R}^{L-1} \times \mathbb{R}$. We do not treat the additional conditioning on $Z_i$ as standard nonparametric results extend

Monte Carlo experiments

To analyze the finite sample performance of our test and to get a feeling for the behavior of the test in our application, we simulate data from a joint distribution that has similar features at least in terms of observables. We specify the DGP to be a linear random coefficients specification, which is arguably the most straightforward model of a heterogeneous population, and choose the distributions of coefficients such that under the null the entire population is rational, while under the

Empirical implementation

In this section we discuss all matters pertaining to the empirical implementation: More specifically, following the framework outlined in the second section, we discuss the implementation of the resulting test statistic (in exactly the form introduced in the third section). The section starts with a description of the data, and then provides details of the methodology before presenting and discussing the results.

Summary and outlook

Rationality of economic agents is the central paradigm of economics. Yet, within this paradigm individuals can vary widely in their actual behavior; only the qualitative properties of individual behavior are constrained, but not the heterogeneity across individuals. Indeed, in many data sets there are large differences in observed consumer choices even for individuals who are equal in terms of their observed household covariates, like age, gender, educational background, etc.

One of the core

Acknowledgments

The authors have received helpful comments from the co-editor Han Hong, an anonymous associate editor and two anonymous referees, Andrew Chesher, Roger Koenker, Dennis Kristensen, Arthur Lewbel, Rosa Matzkin, Ulrich Mueller, Whitney Newey and Azeem Shaik, as well as seminar participants at Boston College, Princeton and the Conference on Nonparametrics and Shape Constraints at Northwestern University. We are particularly indebted to Krishna Pendakur for providing us with the data. We would also

References (48)

  • J.C. Escanciano et al. Specification tests of parametric dynamic conditional quantiles. J. Econometrics (2010)
  • B. Haag et al. Testing and imposing Slutsky symmetry. J. Econometrics (2009)
  • S. Hoderlein. How many consumers are rational. J. Econometrics (2011)
  • K. Pendakur. Taking prices seriously in the measurement of inequality. J. Pub. Econ. (2002)
  • J. Altonji et al. Cross section and panel data estimators for nonseparable models with endogenous regressors. Econometrica (2005)
  • V. Barnett. The ordering of multivariate data. J. Roy. Statist. Soc. Ser. A (1976)
  • A. Belloni et al. On multivariate quantiles under partial orders. Ann. Statist. (2011)
  • Y. Benjamini et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B (1995)
  • R. Blundell et al. Nonparametric Engel curves and revealed preference. Econometrica (2003)
  • Blundell, R., Kristensen, D., Matzkin, R., 2011. Stochastic Demand and Revealed Preference, unpublished...
  • R. Blundell et al. What do we learn about consumer demand patterns from micro data? Amer. Econom. Rev. (1993)
  • Browning, M., Thomas, L., 1999. Prices for the FAMEX, Working...
  • A. Chesher. Identification in nonseparable models. Econometrica (2003)
  • A. Deaton et al. An almost ideal demand system. Amer. Econom. Rev. (1980)
  • H. Dette et al. Comparing conditional quantile curves. Scand. J. Stat. (2011)
  • S. Duduit et al. Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statist. Sinica (2002)
  • M. Hallin et al. Multivariate quantiles and multiple output regression quantiles: From L1 optimization to halfspace depth. Ann. Statist. (2010)
  • W. Härdle et al. Empirical evidence on the law of demand. Econometrica (1991)
  • J. Hausman et al. Nonparametric estimation of exact consumers surplus and deadweight loss. Econometrica (1995)
  • Y. Hochberg. A sharper Bonferroni procedure for multiple tests of significance. Biometrika (1988)
  • S. Hoderlein et al. Identification of marginal effects in nonseparable models without monotonicity. Econometrica (2007)
  • S. Hoderlein et al. Identification and estimation of marginal effects in nonseparable, nonmonotonic models. Econom. J. (2009)
  • S. Hoderlein et al. Revealed preferences in a heterogeneous population. Rev. Econom. Stat. (2014)
  • S. Holm. A simple sequentially rejective multiple test procedure. Scand. J. Stat. (1979)