Abstract
In the era of climate change, the distribution of climate variables evolves, and the changes are not limited to the mean value. Consequently, clustering algorithms based on central tendency can produce misleading results when used to summarize spatial and/or temporal patterns. We present a novel approach to the spatial clustering of time series based on quantiles, using a Bayesian framework that incorporates a spatial dependence layer based on a Markov random field. The proposal is tested in a series of simulations and then applied to the sea surface temperature of the Mediterranean Sea, one of the first seas to be affected by climate change.
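The motivation for working with quantiles rather than means can be illustrated with a minimal sketch. It is not the Bayesian model or the Markov random field layer described in the abstract; it only shows, on simulated data, that two series can share nearly the same mean while differing markedly in an upper quantile. The site names, sample size, and distribution parameters are hypothetical, and the quantile is computed by minimizing the standard check loss of quantile regression over the sample values.

```python
import random


def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def empirical_quantile(series, tau):
    """Sample value that minimizes the total check loss at level tau."""
    return min(series, key=lambda q: sum(check_loss(y - q, tau) for y in series))


def mean(series):
    return sum(series) / len(series)


# Hypothetical data: two sites with the same mean but different variability.
rng = random.Random(0)
n = 500
calm = [20.0 + rng.gauss(0.0, 0.5) for _ in range(n)]
volatile = [20.0 + rng.gauss(0.0, 3.0) for _ in range(n)]

# A mean-based comparison sees the two sites as nearly identical...
mean_gap = abs(mean(calm) - mean(volatile))
# ...whereas the 0.9 quantile separates them clearly.
q90_gap = abs(empirical_quantile(calm, 0.9) - empirical_quantile(volatile, 0.9))

print(f"gap in means:         {mean_gap:.2f}")
print(f"gap in 0.9 quantiles: {q90_gap:.2f}")
```

A clustering rule driven only by the mean would tend to group these two sites together, while one driven by the 0.9 quantile would separate them.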
Acknowledgements
The authors would like to thank Noémie Le Carrer for retrieving the dataset.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare.
Code and data availability
Code and data are available upon request from the authors.
About this article
Cite this article
Gaetan, C., Girardi, P. & Musau, V.M. Spatial quantile clustering of climate data. Adv Data Anal Classif (2024). https://doi.org/10.1007/s11634-024-00580-y
DOI: https://doi.org/10.1007/s11634-024-00580-y