
Abstract

We present an expository, general analysis of valid post-selection or post-regularization inference about a low-dimensional target parameter in the presence of a very high-dimensional nuisance parameter that is estimated using selection or regularization methods. Our analysis provides a set of high-level conditions under which inference for the low-dimensional parameter based on testing or point estimation methods will be regular despite selection or regularization biases occurring in the estimation of the high-dimensional nuisance parameter. A key element is the use of so-called immunized or orthogonal estimating equations that are locally insensitive to small mistakes in the estimation of the high-dimensional nuisance parameter. As an illustration, we analyze affine-quadratic models and specialize these results to a linear instrumental variables model with many regressors and many instruments. We conclude with a review of other developments in post-selection inference and note that many can be viewed as special cases of the general encompassing framework of orthogonal estimating equations provided in this article.
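
As a sketch of what orthogonality means here (the notation below is ours and is meant only to illustrate the property described in the abstract), write the estimating equation for the target parameter α as M(α, η) = E[ψ(W; α, η)], where W denotes the data and η the high-dimensional nuisance parameter. The estimating equations are orthogonal, or immunized, when the Gateaux derivative of M with respect to the nuisance parameter vanishes at the true values (α₀, η₀):

\[
\partial_{\eta} M(\alpha_0, \eta_0)[\eta - \eta_0]
  := \left.\frac{\partial}{\partial r}\, M\bigl(\alpha_0,\ \eta_0 + r(\eta - \eta_0)\bigr)\right|_{r=0} = 0 ,
\]

so that small errors in estimating η have only a second-order effect on estimation of and inference about α.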
