Optimal model selection in density estimation
Annales de l'I.H.P. Probabilités et statistiques, Vol. 48 (2012) no. 3, pp. 884-908.

A penalization procedure for model selection rests on the construction of a shape for the penalty together with the choice of a calibration constant. In this paper we study, for the density estimation problem, penalties obtained by resampling ideal penalties. We show the efficiency of these procedures for estimating the shape of the penalty by proving, for the selected estimators, sharp oracle inequalities without remainder terms; the results hold under weak assumptions on both the unknown density s and the collections of models. These penalties are moreover easy to calibrate, since the asymptotically optimal constant can be computed as a function of the resampling weights. In practice the number of observations is always finite, so we also study the slope heuristic and justify the slope algorithm, which calibrates the leading constant from the data.

In order to calibrate a penalization procedure for model selection, the statistician has to choose a shape for the penalty and a leading constant. In this paper, we study, for the marginal density estimation problem, resampling penalties as general estimators of the shape of an ideal penalty. We prove that the selected estimator satisfies sharp oracle inequalities without remainder terms, under mild assumptions on the marginal density s and the collection of models. We also study the slope heuristic, which yields a data-driven choice of the leading constant in front of the penalty when the complexity of the models is well chosen.
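The slope algorithm mentioned in the abstract can be illustrated with a small numerical sketch. The snippet below is not the paper's implementation: it uses an assumed setting (a Beta-distributed sample, regular histogram models on [0, 1], and the least-squares contrast for density estimation) and the common "dimension jump" variant of the heuristic, in which one finds the minimal constant κ_min beyond which the selected dimension collapses and then penalizes with 2·κ_min·D/n.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.beta(2, 5, size=500)  # sample from an unknown density s on [0, 1]
n = len(x)

dims = np.arange(1, 61)  # candidate models: regular histograms with D bins

def empirical_contrast(D):
    # Least-squares contrast of the histogram estimator on D regular bins:
    # gamma_n(s_hat_D) = ||s_hat_D||^2 - (2/n) sum_i s_hat_D(X_i)
    #                  = -D * sum_j p_hat_j^2   (p_hat_j = bin proportions)
    counts, _ = np.histogram(x, bins=D, range=(0.0, 1.0))
    p = counts / n
    return -D * np.sum(p ** 2)

gamma = np.array([empirical_contrast(D) for D in dims])

# Dimension-jump version of the slope heuristic: for each kappa, select the
# model minimizing gamma + kappa * D / n; kappa_min is located at the largest
# drop of the selected dimension, and the final penalty doubles it.
kappas = np.linspace(0.0, 5.0, 2001)
selected = np.array([dims[np.argmin(gamma + k * dims / n)] for k in kappas])
jump = np.argmax(-np.diff(selected))  # index of the largest dimension drop
kappa_min = kappas[jump + 1]

D_hat = dims[np.argmin(gamma + 2.0 * kappa_min * dims / n)]
```

The selected dimension is non-increasing in κ, so scanning a κ grid and locating the largest drop is a simple, standard way to estimate the minimal penalty before doubling it.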

DOI : 10.1214/11-AIHP425
Classification : 62G07, 62G09
Keywords: density estimation, optimal model selection, resampling methods, slope heuristic
@article{AIHPB_2012__48_3_884_0,
     author = {Lerasle, Matthieu},
     title = {Optimal model selection in density estimation},
     journal = {Annales de l'I.H.P. Probabilit\'es et statistiques},
     pages = {884--908},
     publisher = {Gauthier-Villars},
     volume = {48},
     number = {3},
     year = {2012},
     doi = {10.1214/11-AIHP425},
     mrnumber = {2976568},
     zbl = {1244.62052},
     language = {en},
     url = {http://www.numdam.org/articles/10.1214/11-AIHP425/}
}

[1] H. Akaike. Statistical predictor identification. Ann. Inst. Statist. Math. 22 (1970) 203-217. | MR | Zbl

[2] H. Akaike. Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory (Tsahkadsor, 1971) 267-281. Akadémiai Kiadó, Budapest, 1973. | MR | Zbl

[3] S. Arlot. Resampling and model selection. Ph.D. thesis, Université Paris-Sud 11, 2007.

[4] S. Arlot. Model selection by resampling penalization. Electron. J. Stat. 3 (2009) 557-624. | MR

[5] S. Arlot and P. Massart. Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. 10 (2009) 245-279.

[6] A. Barron, L. Birgé and P. Massart. Risk bounds for model selection via penalization. Probab. Theory Related Fields 113(3) (1999) 301-413. | MR | Zbl

[7] J.-P. Baudry, C. Maugis and B. Michel. Slope heuristics: Overview and implementation. Report INRIA, 2010. Available at http://hal.archives-ouvertes.fr/hal-00461639/fr/.

[8] L. Birgé. Model selection for density estimation with L2-loss. Preprint, 2008.

[9] L. Birgé and P. Massart. From model selection to adaptive estimation. In Festschrift for Lucien Le Cam 55-87. Springer, New York, 1997. | MR | Zbl

[10] L. Birgé and P. Massart. Minimal penalties for Gaussian model selection. Probab. Theory Related Fields 138(1-2) (2007) 33-73. | MR | Zbl

[11] O. Bousquet. A Bennett concentration inequality and its application to suprema of empirical processes. C. R. Math. Acad. Sci. Paris 334(6) (2002) 495-500. | MR | Zbl

[12] F. Bunea, A. B. Tsybakov and M. H. Wegkamp. Sparse density estimation with ℓ1 penalties. In Learning Theory 530-543. Lecture Notes in Comput. Sci. 4539. Springer, Berlin, 2007. | MR | Zbl

[13] A. Célisse. Density estimation via cross-validation: Model selection point of view. Preprint, 2008. Available at arXiv:0811.0802.

[14] D. L. Donoho, I. M. Johnstone, G. Kerkyacharian and D. Picard. Density estimation by wavelet thresholding. Ann. Statist. 24(2) (1996) 508-539. | MR | Zbl

[15] B. Efron. Bootstrap methods: Another look at the jackknife. Ann. Statist. 7(1) (1979) 1-26. | MR | Zbl

[16] I. Gannaz and O. Wintenberger. Adaptive density estimation under dependence. ESAIM Probab. Stat. 14 (2010) 151-172. | Numdam | MR | Zbl

[17] C. Houdré and P. Reynaud-Bouret. Exponential inequalities, with constants, for U-statistics of order two. In Stochastic Inequalities and Applications 55-69. Progr. Probab. 56. Birkhäuser, Basel, 2003. | MR | Zbl

[18] T. Klein and E. Rio. Concentration around the mean for maxima of empirical processes. Ann. Probab. 33(3) (2005) 1060-1077. | MR | Zbl

[19] C. L. Mallows. Some comments on C_p. Technometrics 15 (1973) 661-675. | Zbl

[20] P. Massart. Concentration Inequalities and Model Selection. Lecture Notes in Mathematics 1896. Springer, Berlin, 2007. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6-23, 2003, With a foreword by Jean Picard. | MR | Zbl

[21] P. Rigollet. Adaptive density estimation using the blockwise Stein method. Bernoulli 12(2) (2006) 351-370. | MR | Zbl

[22] P. Rigollet and A. B. Tsybakov. Linear and convex aggregation of density estimators. Math. Methods Statist. 16(3) (2007) 260-280. | MR | Zbl

[23] M. Rudemo. Empirical choice of histograms and kernel density estimators. Scand. J. Stat. 9(2) (1982) 65-78. | MR | Zbl

[24] G. Schwarz. Estimating the dimension of a model. Ann. Statist. 6 (1978) 461-464. | MR | Zbl

[25] M. Stone. Cross-validatory choice and assessment of statistical predictions. J. R. Stat. Soc. Ser. B Stat. Methodol. 36 (1974) 111-147. With discussion by G. A. Barnard, A. C. Atkinson, L. K. Chan, A. P. Dawid, F. Downton, J. Dickey, A. G. Baker, O. Barndorff-Nielsen, D. R. Cox, S. Geisser, D. Hinkley, R. R. Hocking and A. S. Young and with a reply by the authors. | MR | Zbl

[26] M. Talagrand. New concentration inequalities in product spaces. Invent. Math. 126(3) (1996) 505-563. | MR | Zbl
