Clustering via nonparametric density estimation

Azzalini, Adelchi; Torelli, Nicola

doi:10.1007/s11222-006-9010-y

Published: 03 February 2007

Volume 17, pages 71–80 (2007)

Statistics and Computing


Abstract

Although Hartigan (1975) had already put forward the idea of connecting identification of subpopulations with regions with high density of the underlying probability distribution, the actual development of methods for cluster analysis has largely shifted towards other directions, for computational convenience. Current computational resources allow us to reconsider this formulation and to develop clustering techniques directly in order to identify local modes of the density. Given a set of observations, a nonparametric estimate of the underlying density function is constructed, and subsets of points with high density are formed through suitable manipulation of the associated Delaunay triangulation. The method is illustrated with some numerical examples.
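
The pipeline outlined in the abstract (estimate the density, keep the high-density points, group them through the Delaunay triangulation) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a Gaussian kernel estimate with SciPy's default bandwidth and a single density level set by a hypothetical `quantile` parameter. It also adds a midpoint-density check on each edge, a heuristic introduced here so that long triangulation edges spanning a low-density gap do not join separate groups. The function name `high_density_clusters` and the simulated data are illustrative only.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import Delaunay
from scipy.stats import gaussian_kde


def high_density_clusters(points, quantile=0.5):
    """Label the points whose estimated density exceeds one level.

    points   : (n, d) array of observations
    quantile : fraction of points treated as low density (hypothetical
               tuning parameter, not taken from the paper)
    Returns one integer label per point; -1 marks low-density points.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Nonparametric (Gaussian kernel) density estimate at the observations.
    kde = gaussian_kde(points.T)
    density = kde(points.T)
    level = np.quantile(density, quantile)
    keep = density > level

    # All vertex pairs (edges) of the Delaunay triangulation.
    simplices = Delaunay(points).simplices
    k = simplices.shape[1]
    pairs = np.vstack([simplices[:, [i, j]]
                       for i in range(k) for j in range(i + 1, k)])

    # Keep an edge only if both endpoints are high density and (assumption
    # of this sketch) the density at the edge midpoint is also above level.
    mid = 0.5 * (points[pairs[:, 0]] + points[pairs[:, 1]])
    ok = keep[pairs[:, 0]] & keep[pairs[:, 1]] & (kde(mid.T) > level)
    pairs = pairs[ok]

    # Connected components of the pruned graph form the groups.
    graph = coo_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])),
                       shape=(n, n))
    _, comp = connected_components(graph, directed=False)

    # Renumber the components found among kept points as 0, 1, 2, ...
    labels = np.full(n, -1)
    _, inverse = np.unique(comp[keep], return_inverse=True)
    labels[keep] = inverse
    return labels


# Usage on simulated data: two well-separated Gaussian groups in the plane.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
                 rng.normal(10.0, 1.0, size=(100, 2))])
labels = high_density_clusters(pts)
print(np.unique(labels))
```

Points labelled -1 fall below the chosen density level and are left unassigned in this sketch; allocating them to a group would need a further step that is not shown here.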

References

Aitchison J. 1986. The Statistical Analysis of Compositional Data. Chapman & Hall, London.
Ankerst M., Breunig M.M., Kriegel H.P., and Sander J. 1999. OPTICS: ordering points to identify the clustering structure. In: International Conference on Management of Data (SIGMOD’99), ACM, pp. 49–60.
Barber C.B., Dobkin D.P., and Huhdanpaa H. 1996. The Quickhull algorithm for convex hulls. ACM Trans. Math. Software 22: 469–483.
Bowman A. and Foster P. 1993. Density based exploration of bivariate data. Statistics and Computing 3: 171–177.
Bowman A.W. and Azzalini A. 1997. Applied Smoothing Techniques for Data Analysis. Clarendon Press, Oxford.
Cuevas A., Febrero M., and Fraiman R. 2000. Estimating the number of clusters. Canad. J. Stat. 28: 367–382.
Cuevas A., Febrero M., and Fraiman R. 2001. Cluster analysis: a further approach based on density estimation. Computational Statistics & Data Analysis 36: 441–459.
Devroye L.P. and Wagner T.J. 1980. The strong uniform consistency of kernel density estimates. In: Multivariate Analysis, North-Holland, Vol. 5, pp. 59–77.
Ester M., Kriegel H.P., Sander J., and Xu X. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96), Portland, OR, USA. ACM, pp. 226–231.
Forina M., Armanino C., Lanteri S., and Tiscornia E. 1983. Classification of olive oils from their fatty acid composition. In: H. Martens and H. J. Russwurm (Eds.), Food Research and Data Analysis, Applied Science Publishers, London, pp. 189–214.
Hartigan J.A. 1975. Clustering Algorithms. J. Wiley & Sons, New York.
Hubert L. and Arabie P. 1985. Comparing partitions. Journal of Classification 2: 193–218.
Nadaraya É.A. 1965. On non-parametric estimates of density functions and regression curves. Theory of Probability and its Applications (Transl. Teorija Verojatnostei i ee Primenenija) 10: 186–190.
Okabe A., Boots B.N., and Sugihara K. 1992. Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. J. Wiley & Sons, New York.
R Development Core Team 2004. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
Rosolin T., Azzalini A., and Torelli N. 2003. Detecting clusters via nonparametric density estimation. In: Convegno SIS analisi statistica multivariata per le scienze economico-sociali, le scienze naturali e la tecnologia, Napoli, Italy. Società Italiana di Statistica, RCE edizioni.
Stuetzle W. 2003. Estimating the cluster tree of a density by analyzing the minimal spanning tree of a sample. Journal of Classification 20: 25–47.
Wong M.A. and Lane T. 1983. A kth nearest neighbour clustering procedure. Journal of the Royal Statistical Society, Series B 45: 362–368.

Author information

Authors and Affiliations

Dipartimento di Scienze Statistiche, Università di Padova, Padova, Italy
Adelchi Azzalini
Dipartimento di Scienze Economiche e Statistiche, Università di Trieste, Trieste, Italy
Nicola Torelli

Corresponding author

Correspondence to Adelchi Azzalini.

About this article

Cite this article

Azzalini, A., Torelli, N. Clustering via nonparametric density estimation. Stat Comput 17, 71–80 (2007). https://doi.org/10.1007/s11222-006-9010-y

Issue Date: March 2007
DOI: https://doi.org/10.1007/s11222-006-9010-y