Similarity, dissimilarity and exceptionality: generalizing Gini’s transvariation to measure “differentness” in many distributions

Published in: METRON

Abstract

Following the work of Gini, Dagum and Tukey, this paper extends Gini’s Transvariation measure for comparing two distributions to the simultaneous comparison of many distributions. In so doing, it develops measures of absolute and relative similarity, dissimilarity and exceptionality together with techniques for assessing particular aspects of variations across those distributions. These techniques are exemplified in a study of differences between the income distributions of males and females drawn from Metis, Inuit, North American Indian and Non-Aboriginal constituencies in Canada in the first decade of the twenty-first century. While the distributions were becoming increasingly similar (interpreted as improving equality of opportunity), this was occurring primarily at the center of the distribution. At the extremes, the distributions were diverging, suggesting that such improvements in equality of opportunity were not for all.

Notes

  1. The constituency principle avers that the goodness of alternative states should be judged from the point of view of some identified constituency of individuals who alone are judged to be the relevant and interested parties to the outcome of the comparison exercise.

  2. Note also the close relationship to Hellinger Distance [39, 49]:

    $$\begin{aligned} H^{2}(f,g) &= \frac{1}{2}\int \left( \sqrt{f(x)}-\sqrt{g(x)} \right)^{2}dx = 1-\int \sqrt{f(x)g(x)}\,dx \\ &= 1-\int \sqrt{\frac{f(x)}{g(x)}}\,g(x)\,dx = \int \left( 1-\sqrt{\frac{f(x)}{g(x)}} \right) g(x)\,dx \\ &= E_{g} \left( 1-\sqrt{\frac{f(x)}{g(x)}} \right) \end{aligned}$$

  3. This should not be confused with standardizing by a location measure; see, for example, [28].

  4. It is interesting to note that Tukey [57] proposed visualizing distributions in terms of a “Rootgram” since it increased the visual importance of low frequency outcomes and muted the importance of high frequency outcomes. Thus rescaling can be seen as viewing distributions relative to the Rootgram of the target distribution standardized to make it conformable with a regular probability distribution. See also [37].

  5. A similar algebra can readily demonstrate the relationship in the discrete paradigm.

  6. Yalonetzky [60], in comparing the Overlap with the Pearson measure, can be construed as comparing the performance of the two moment estimators.

  7. For this to be the case f(x) and g(x) have to be such that:

    $$\begin{aligned} E_g \left( \left( \frac{f(x)}{g(x)} \right)^{2} \right) -2E_g \left( \frac{f(x)}{g(x)} \right) = E_f \left( \left( \frac{g(x)}{f(x)} \right)^{2} \right) -2E_f \left( \frac{g(x)}{f(x)} \right) \end{aligned}$$

    A sufficient condition for this is that f(x) and g(x) are reflections of each other around some point \(x^{*}\), so that \(f(x^{*}+\delta) = g(x^{*}-\delta)\) for all \(\delta\), since then for every \(\frac{f(x_1)}{g(x_1)}\) there is a corresponding \(\frac{g(x_2)}{f(x_2)}\) of equal value with the same importance weight in the calculation.

  8. Dagum [14, 15] first mooted the idea of extending Gini’s Transvariation to many distributions in a discretized paradigm.

  9. For example, in the equality of opportunity literature, the concern is with inequality of outcome distributions across inheritance classes, where interest may lie in differences from the outcome distribution of the median inheritance class, or from that of the richest or poorest inheritance class.

  10. All of the above indices can be shown to satisfy piecewise continuity, scale invariance, scale independence and normalization axioms for inequality indices. See for example [42, 54, 55].

  11. Also related here is the Integrated Squared Difference approach [38].

  12. This is the condition used in the finance literature for risk loving behavior.

  13. These measures can be shown to be the average polarization distance between all pairs of observations [3].

  14. In this paper, we maintain the terminology used by Statistics Canada in their codebooks to identify Aboriginal individuals belonging to these three groups.

  15. These policies include, for example, the Affordable Housing Initiative (2001–2007), which helped fund the “construction and renovation of affordable housing units” [9, p. 67] and the Urban Aboriginal Strategy aimed at improving the socio-economic status of urban Aboriginals [9]. Other policies include Aboriginal Head Start in Urban and Northern Communities [41], Aboriginal Head Start on Reserve, and the First Nations and Inuit Child Care Initiative [40].

  16. As noted by Feir and Hancock [25], some caution is required when using Aboriginal data from the Censuses and the NHS. First, a number of reserves in Canada are not enumerated. Second, the structure of the ethnic origin question has changed a number of times. Finally, there is the potential impact of intra-generational ethnic mobility. Each of these plausibly causes exogenous variation in the size and characteristics of the Aboriginal population.

  17. The incomes reported in the 2001, 2006, and 2011 surveys are based on individual earnings in 2000, 2005, and 2010, respectively.
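
The identities in Notes 2 and 7 can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes normal densities on a finite grid (the means, variances and grid limits are illustrative choices, not values from the paper), and the function names `normal_pdf`, `integrate`, `hellinger_forms` and `note7_sides` are hypothetical. It evaluates the three expressions for \(H^{2}(f,g)\) in Note 2, and both sides of the condition in Note 7 for a pair of densities that are reflections of each other and for a pair that are not.

```python
import numpy as np


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated on the grid x."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def integrate(y, x):
    """Trapezoidal rule for samples y on the grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)


def hellinger_forms(f, g, x):
    """Three expressions for H^2(f, g) that Note 2 states are equal."""
    direct = 0.5 * integrate((np.sqrt(f) - np.sqrt(g)) ** 2, x)
    via_affinity = 1.0 - integrate(np.sqrt(f * g), x)
    # E_g(1 - sqrt(f/g)), written as an integral against g
    as_expectation = integrate((1.0 - np.sqrt(f / g)) * g, x)
    return direct, via_affinity, as_expectation


def note7_sides(f, g, x):
    """Left- and right-hand sides of the condition in Note 7."""
    r = f / g
    lhs = integrate((r ** 2 - 2.0 * r) * g, x)          # E_g(r^2) - 2 E_g(r)
    rhs = integrate(((1.0 / r) ** 2 - 2.0 / r) * f, x)  # E_f(r^-2) - 2 E_f(r^-1)
    return lhs, rhs


# Illustrative grid and densities (assumptions, not taken from the paper).
x = np.linspace(-12.0, 12.0, 20001)
f = normal_pdf(x, -1.0, 1.0)
g = normal_pdf(x, 1.0, 1.0)    # mirror image of f around x* = 0
f0 = normal_pdf(x, 0.0, 1.0)
h = normal_pdf(x, 0.0, 1.2)    # same centre as f0, different spread

print("Note 2, three forms of H^2:", hellinger_forms(f, g, x))
print("Note 7, reflected pair (lhs, rhs):", note7_sides(f, g, x))
print("Note 7, non-reflected pair (lhs, rhs):", note7_sides(f0, h, x))
```

For the reflected pair the two sides of the Note 7 condition should agree up to integration error, whereas for the second pair they need not. The grid is kept wide so that the density ratios, which grow quickly in the tails, are still weighted by a density that has not underflowed to zero.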

References

  1. Anderson, T., Goodman, L.: Statistical inference about Markov chains. Ann. Math. Stat. 28(1), 89–110 (1957)

  2. Anderson, G., Linton, O., Whang, Y.-J.: Nonparametric estimation and inference about the overlap of two distributions. J. Econom. 171(1), 1–23 (2012)

  3. Anderson, G.: Polarization measurement and inference in many dimensions when subgroups can not be identified. Economics: The Open-Access, Open-Assessment E-Journal. 5(2011–11), 1–19 (2011). doi:10.5018/economics-ejournal.ja.2011-11

  4. Anderson, G.J., Leo, T.W.: On providing a complete ordering of non-combinable alternative prospects. Mimeo, University of Toronto Discussion Paper (2016)

  5. Anderson, G., Post, T., Whang, Y.-J.: Somewhere between Utopia and Dystopia: choosing from multiple incomparable prospects. University of Toronto Economics Discussion paper (2017)

  6. Anderson, G.J., Ge, Y., Leo, T.W.: Distributional overlap: simple, multivariate, parametric and non-parametric tests for alienation, convergence and general distributional difference issues. Econom. Rev. 29, 247–275 (2009)

  7. Ballester, C., Vorsatz, M.: Random-walk-based segregation measures. Rev. Econ. Stat. 96(3), 383–401 (2014). doi:10.1162/REST_a_00399

  8. Blackorby, C., Bossert, W., Donaldson, D.: Quasi orderings and population ethics. Soc. Choice Welf. 13, 129–150 (1996)

  9. Bonesteel, S.: Canada’s relationship with Inuit: a history of policy and program development. In: Anderson, E. (ed.) Prepared for Indian and Northern Affairs Canada (2006)

  10. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth International Group, Belmont (1984)

  11. Broome, J.: The welfare economics of population. Oxf. Econ. Pap. 48, 177–193 (1996)

  12. Calò, D.: On a transvariation based measure of group separability. J. Classif. 23, 143–167 (2006)

  13. D’Ambrosio, C., Bossert, W., La Ferrara, E.: A generalized index of fractionalization. Economica 78, 723–750 (2011)

  14. Dagum, C.: Nonparametric and Gaussian Transvariation Theory: Its Economic Applications. Econometric Research Program, Princeton University, Research Memorandum 99 (1968)

  15. Dagum, C.: Multivariate Transvariation Theory Among Several Distributions and Its Economic Applications. Econometric Research Program, Princeton University, Research Memorandum 100 (1968)

  16. Dasgupta, P.: Lives and wellbeing. Soc. Choice Welf. 5, 103–126 (1988)

  17. Dasgupta, P.: Savings and fertility: ethical issues. Philos. Public Aff. 23, 99–127 (1994)

  18. Deutsch, J., Silber, J.: Analyzing the Impact of Income Sources on Changes in Polarization. Mimeo, New York (2008)

  19. Deutsch, J., Silber, J.: On the Decomposition of Income Polarization by Population Subgroups. Mimeo, New York (2008)

  20. Duclos, J.-Y., Esteban, J., Ray, D.: Polarization: concepts, measurement, estimation. Econometrica 72, 1737–1772 (2004)

  21. Echenique, F., Fryer Jr., R.G.: A measure of segregation based on social interactions. Q. J. Econ. 122(2), 441–485 (2007)

  22. Esteban, J., Ray, D.: Polarization, fractionalization and conflict. J. Peace Res. 45, 163–182 (2008)

  23. Esteban, J., Schneider, G.: Polarization and conflict: theoretical and empirical issues (introduction to the special issue). J. Peace Res. 45, 131–141 (2008)

  24. Fearon, J., Laitin, D.: Ethnicity, insurgency, and civil war. Am. Polit. Sci. Rev. 97, 75–90 (2003)

  25. Feir, D., Hancock, R.L.A.: Answering the call: a guide to reconciliation for quantitative social scientists. Can. Public Policy 42(3), 350–365 (2016)

  26. Fields, G.S.: Income mobility. In: Blume, L., Durlauf, S. (eds.) The new Palgrave dictionary of economics. Palgrave-MacMillan (2008)

  27. Flückiger, Y., Silber, J.: The Measurement of Segregation in the Labour Force. Springer Science and Business Media, Berlin (2012)

  28. Foster, J., Shneyerov, A.: Path independent inequality measures. J. Econ. Theory 91, 199–222 (2000)

  29. Foster, J., Greer, J., Thorbecke, E.: A class of decomposable poverty measures. Econometrica 52, 761–766 (1984)

  30. Frankel, D.M., Volij, O.: Measuring school segregation. J. Econ. Theory 146(1), 1–38 (2011)

  31. Georgescu-Roegen, N.: The Entropy Law and the Economic Process. Harvard University Press, Cambridge (1971)

  32. Gentzkow, M., Shapiro, J.M.: Ideological segregation online and offline. Q. J. Econ. 126(4), 1799–1839 (2011)

  33. Gentzkow, M., Shapiro, J.M., Taddy, M.: Measuring Polarization in High-Dimensional Data: Method and Application to Congressional Speech. Mimeo, New York (2015)

  34. Gini, C.: Il concetto di transvariazione e le sue prime applicazioni. Giornale degli Economisti e Rivista di Statistica. 5, 1–35 (1916)

  35. Gini, C.: Transvariazione. Libreria Goliardica, Rome (1959)

  36. Goldthorpe, J.H.: Social Mobility and Class Structure in Modern Britain, 2nd edn. Clarendon Press, Oxford (1987)

  37. Handcock, M., Morris, M.: Relative Distribution Methods in the Social Sciences. Statistics for Social Science and Public Policy. Springer, Berlin (1999)

  38. Hall, P., Yatchew, A.: Unified approach to testing functional hypotheses in semiparametric contexts. J. Econom. 127, 225–252 (2005)

  39. Hogg, R., Tanis, E.: Probability and Statistical Inference. Prentice Hall, Upper Saddle River (1997)

  40. Inuit Tapiriit Kanatami [ITK]: Assessing the Impact of the First Nations and Inuit Child Care Initiative Across Inuit Nunangat. Inuit Tapiriit Kanatami (2014)

  41. Kay-Raining Bird, E.: Health, education, language, dialect, and culture in First Nations, Inuit, and Metis communities in Canada: an overview. Can. J. Speech Lang. Pathol. Audiol. 35(2), 110–124 (2011)

  42. Kobus, M., Miłoś, P.: Inequality decomposition by population subgroups for ordinal data. J. Health Econ. 31, 15–21 (2012)

  43. Mele, A.: Poisson indices of segregation. Reg. Sci. Urban Econ. 43, 65–85 (2013)

  44. National Aboriginal Economic Development Board: The Aboriginal Economic Benchmarking Report (2012)

  45. National Aboriginal Economic Development Board: The Aboriginal Economic Progress Report 2015 (2015)

  46. Pearson, K.: On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philos. Mag. Ser. 5 50(302), 157–175 (1900)

  47. Pendakur, K., Pendakur, R.: Aboriginal income disparity in Canada. Can. Public Policy 37(1), 61–83 (2011)

  48. Penney, C.: Aboriginal data as a result of changes to the 2011 Census of population. Strategic Research, Aboriginal Affairs and Northern Development Canada (2013). https://www.aadnc-aandc.gc.ca/DAM/DAM-INTER-HQ-AI/STAGING/texte-text/rs_re_brief_2011NHS_print_1380028890150_eng.pdf

  49. Pollard, D.E.: A User’s Guide to Measure Theoretic Probability. Cambridge University Press, Cambridge (2002)

  50. Ramos, X., Van de Gaer, D.: Empirical approaches to inequality of opportunity: principles, measures, and evidence. Discussion Paper Series, Forschungsinstitut zur Zukunft der Arbeit, No. 6672 (2014)

  51. Reynal-Querol, M.: Ethnicity, political systems, and civil wars. J. Confl. Resolut. 46, 29–54 (2002)

  52. Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948)

  53. Shorrocks, A.F.: Income mobility and the Markov assumption. Econ. J. 86, 566–578 (1976)

  54. Shorrocks, A.F.: The measurement of mobility. Econometrica 46, 1013–1024 (1978)

  55. Shorrocks, A.F.: Income inequality and income mobility. J. Econ. Theory 19, 376–393 (1978)

  56. Theil, H.: Economics and Information Theory. North Holland, Amsterdam (1967)

  57. Tukey, J.W.: Exploratory Data Analysis. Addison-Wesley, Reading (1977)

  58. Van de Gaer, D., Schokkaert, E., Martinez, M.: Three meanings of intergenerational mobility. Economica 68(272), 519–537 (2001)

  59. Weinstein, J.: Quiet Revolution West: The Rebirth of Metis Nationalism. Fifth House Publishers, Calgary (2007)

  60. Yalonetzky, G.: A comparison between the Pearson-based dissimilarity index and the multiple-group overlap index. OPHI discussion paper series 16a (2010)

  61. Yitzhaki, S.: Economic distance and overlapping of distributions. J. Econom. 61, 147–159 (1994)

Corresponding author

Correspondence to Gordon Anderson.

Cite this article

Anderson, G., Linton, O. & Thomas, J. Similarity, dissimilarity and exceptionality: generalizing Gini’s transvariation to measure “differentness” in many distributions. METRON 75, 161–180 (2017). https://doi.org/10.1007/s40300-017-0112-4
