Adaptive density estimation for clustering with Gaussian mixtures
ESAIM: Probability and Statistics, Volume 17 (2013), pp. 698-724.

Gaussian mixture models are widely used to study clustering problems. These model-based clustering methods require an accurate estimation of the unknown data density by Gaussian mixtures. In Maugis and Michel (2009), a penalized maximum likelihood estimator is proposed for automatically selecting the number of mixture components. In the present paper, we consider a collection of univariate densities whose logarithm is locally β-Hölder and which satisfy moment and tail conditions. We show that this penalized estimator is minimax adaptive to the β regularity of such densities in the Hellinger sense.
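
To make the idea of a penalized maximum likelihood estimator concrete, here is a minimal sketch in plain Python. It fits univariate Gaussian mixtures with k = 1, ..., k_max components by EM and keeps the k that minimizes the normalized negative log-likelihood plus a penalty. Everything in it is an illustrative assumption rather than the authors' procedure: the function names `fit_gmm_em` and `select_k`, the quantile-based initialization, the variance floor `min_sigma`, and above all the BIC-like penalty shape `kappa * dim * log(n) / (2 * n)`, which merely stands in for the calibrated penalty studied in the paper.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_gmm_em(data, k, n_iter=100, min_sigma=0.05):
    """EM for a univariate k-component Gaussian mixture.

    Returns (weights, means, sigmas, loglik). The floor min_sigma is an
    assumption that keeps the likelihood bounded.
    """
    n = len(data)
    xs = sorted(data)
    # Deterministic start: means at evenly spaced quantiles, common spread.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    center = sum(xs) / n
    std = math.sqrt(sum((x - center) ** 2 for x in xs) / n)
    sigmas = [max(std / k, min_sigma)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], sigmas[j])
                    for j in range(k)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: weighted updates of proportions, means and spreads.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            sigmas[j] = max(math.sqrt(var), min_sigma)
    loglik = 0.0
    for x in data:
        mix = sum(w * normal_pdf(x, m, s)
                  for w, m, s in zip(weights, means, sigmas))
        loglik += math.log(max(mix, 1e-300))
    return weights, means, sigmas, loglik


def select_k(data, k_max=4, kappa=1.0):
    """Pick k minimizing -loglik/n + pen(k).

    pen(k) = kappa * dim(k) * log(n) / (2n) is a hypothetical BIC-like
    penalty, with dim(k) = 3k - 1 free parameters (k means, k spreads,
    k - 1 proportions). It is not the penalty from the paper.
    """
    n = len(data)
    crits = {}
    for k in range(1, k_max + 1):
        loglik = fit_gmm_em(data, k)[3]
        dim = 3 * k - 1
        crits[k] = -loglik / n + kappa * dim * math.log(n) / (2 * n)
    best_k = min(crits, key=crits.get)
    return best_k, crits
```

Calling `select_k(sample)` on a list of floats returns the selected number of components together with the criterion value for each candidate k. The constant `kappa` is left as a free parameter because the appropriate multiplicative constant is generally not known in advance; data-driven calibration of such constants is the subject of references [1] and [15] below.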

DOI: https://doi.org/10.1051/ps/2012018
Classification: 62G07, 62G20
Keywords: rate adaptive density estimation, Gaussian mixture clustering, Hellinger risk, non asymptotic model selection
@article{PS_2013__17__698_0,
     author = {Maugis-Rabusseau, C. and Michel, B.},
     title = {Adaptive density estimation for clustering with {G}aussian mixtures},
     journal = {ESAIM: Probability and Statistics},
     pages = {698--724},
     publisher = {EDP-Sciences},
     volume = {17},
     year = {2013},
     doi = {10.1051/ps/2012018},
     mrnumber = {3126158},
     language = {en},
     url = {http://www.numdam.org/articles/10.1051/ps/2012018/}
}
Maugis-Rabusseau, C.; Michel, B. Adaptive density estimation for clustering with Gaussian mixtures. ESAIM: Probability and Statistics, Volume 17 (2013), pp. 698-724. doi: 10.1051/ps/2012018. http://www.numdam.org/articles/10.1051/ps/2012018/

[1] J.-P. Baudry, C. Maugis and B. Michel, Slope heuristics: overview and implementation. Stat. Comput. 22 (2011) 455-470. | MR 2865029

[2] L. Birgé, A new lower bound for multiple hypothesis testing. IEEE Trans. Inform. Theory. 51 (2005) 1611-1615. | MR 2241522 | Zbl 1283.62030

[3] W. Cheney and W. Light, A course in approximation theory, vol. 101 of Graduate Studies in Mathematics. Amer. Math. Soc., Providence, RI (2009). | MR 2474372 | Zbl 1167.41001

[4] S. Ghosal, J.K. Ghosh and R.V. Ramamoorthi, Posterior consistency of Dirichlet mixtures in density estimation. Ann. Stat. 27 (1999) 143-158. | MR 1701105 | Zbl 0932.62043

[5] S. Ghosal and A. van der Vaart, Entropy and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Stat. 29 (2001) 1233-1263. | MR 1873329 | Zbl 1043.62025

[6] S. Ghosal and A. van der Vaart, Posterior convergence rates of Dirichlet mixtures at smooth densities. Ann. Stat. 35 (2007) 697-723. | MR 2336864 | Zbl 1117.62046

[7] U. Grenander, Abstract inference. John Wiley and Sons Inc., New York (1981). | MR 599175 | Zbl 0505.62069

[8] T. Hangelbroek and A. Ron, Nonlinear approximation using Gaussian kernels. J. Functional Anal. 259 (2010) 203-219. | MR 2610384 | Zbl 1203.41015

[9] J.A. Hartigan, Clustering algorithms, Probab. Math. Stat. John Wiley and Sons, New York-London-Sydney (1975). | MR 405726 | Zbl 0372.62040

[10] T. Hastie, R. Tibshirani and J. Friedman, The elements of statistical learning, Data mining, inference, and prediction. Statistics. Springer, New York, 2nd edition (2009). | MR 2722294 | Zbl 1273.62005

[11] W. Kruijer, J. Rousseau and A. van der Vaart, Adaptive Bayesian density estimation with location-scale mixtures. Electron. J. Statist. 4 (2010) 1225-1257. | MR 2735885

[12] B. Lindsay, Mixture Models: Theory, Geometry and Applications. IMS, Hayward, CA (1995). | Zbl 1163.62326

[13] P. Massart, Concentration Inequalities and Model Selection. École d'été de Probabilités de Saint-Flour, 2003. Lect. Notes Math. Springer (2007). | MR 2319879 | Zbl 1170.60006

[14] C. Maugis and B. Michel, Adaptive density estimation for clustering with Gaussian mixtures (2011). arXiv:1103.4253v2.

[15] C. Maugis and B. Michel, Data-driven penalty calibration: a case study for Gaussian mixture model selection. ESAIM: PS 15 (2011) 320-339. | Numdam | MR 2870518

[16] C. Maugis and B. Michel, A non asymptotic penalized criterion for Gaussian mixture model selection. ESAIM: PS 15 (2011) 41-68. | Numdam | MR 2870505

[17] G. McLachlan and D. Peel, Finite Mixture Models. Wiley (2000). | MR 1789474 | Zbl 0963.62061

[18] A.B. Tsybakov, Introduction to nonparametric estimation. Statistics. Springer, New York (2009). | MR 2724359 | Zbl 1029.62034

[19] J. Wolfowitz, Minimax estimation of the mean of a normal distribution with known variance. Ann. Math. Stat. 21 (1950) 218-230. | MR 35950 | Zbl 0038.09801