Adaptive Dantzig density estimation
Annales de l'I.H.P. Probabilités et statistiques, Volume 47 (2011) no. 1, pp. 43-74.

The aim of this paper is to build an estimate of an unknown density as a linear combination of functions of a dictionary. Inspired by Candès and Tao's approach, we propose a minimization of the 1-norm of the coefficients in the linear combination under an adaptive Dantzig constraint coming from sharp concentration inequalities. This allows us to consider a wide class of dictionaries. Under local or global structure assumptions, oracle inequalities are derived. These theoretical results are transposed to the adaptive Lasso estimate naturally associated with our Dantzig procedure. Then, the issue of calibrating these procedures is studied from both theoretical and practical points of view. Finally, a numerical study shows the significant improvement obtained by our procedures when compared with other classical procedures.
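
As a rough illustration of the procedure summarized above, the following Python sketch computes a weighted Lasso estimate of the dictionary coefficients and measures how far it is from satisfying a Dantzig-type constraint |(G λ)_j - β̂_j| ≤ η_j. Several ingredients are assumptions made for the example and are not taken from this page: the cosine dictionary on [0, 1], the Bernstein-type form of the thresholds η_j, the 0.5 scaling of the quadratic term, and all function names. The Dantzig estimate itself minimizes the 1-norm over that feasible set and in general requires a linear programming solver; the sketch only computes the associated Lasso point, which lies in the feasible set by its optimality conditions. For an orthonormal dictionary, both estimates reduce to coordinate-wise soft thresholding.

```python
import math
import random


def cosine_dictionary(size):
    """Illustrative dictionary on [0, 1]: 1, sqrt(2) cos(pi x), sqrt(2) cos(2 pi x), ..."""
    def make(j):
        if j == 0:
            return lambda x: 1.0
        return lambda x: math.sqrt(2.0) * math.cos(math.pi * j * x)
    return [make(j) for j in range(size)]


def gram_matrix(dictionary, grid_size=2000):
    """Approximate G[j][k] = integral of phi_j * phi_k over [0, 1] (midpoint rule)."""
    xs = [(i + 0.5) / grid_size for i in range(grid_size)]
    values = [[phi(x) for x in xs] for phi in dictionary]
    return [[sum(a * b for a, b in zip(vj, vk)) / grid_size for vk in values]
            for vj in values]


def empirical_coefficients(dictionary, sample, sup_norms, gamma):
    """Return (beta_hat, eta): empirical means and ASSUMED Bernstein-type thresholds."""
    n, m = len(sample), len(dictionary)
    rate = gamma * math.log(m) / n
    beta_hat, eta = [], []
    for phi, sup in zip(dictionary, sup_norms):
        vals = [phi(x) for x in sample]
        mean = sum(vals) / n
        var = sum((v - mean) ** 2 for v in vals) / (n - 1)
        beta_hat.append(mean)
        # Illustrative threshold: variance term plus sup-norm term.
        eta.append(math.sqrt(2.0 * var * rate) + 2.0 * sup * rate / 3.0)
    return beta_hat, eta


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, keeping its sign."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def weighted_lasso(gram, beta_hat, eta, sweeps=1000, tol=1e-12):
    """Coordinate descent for min 0.5 l'Gl - beta_hat'l + sum_j eta_j |l_j|."""
    m = len(beta_hat)
    lam = [0.0] * m
    for _ in range(sweeps):
        biggest_move = 0.0
        for j in range(m):
            partial = sum(gram[j][k] * lam[k] for k in range(m) if k != j)
            new = soft_threshold(beta_hat[j] - partial, eta[j]) / gram[j][j]
            biggest_move = max(biggest_move, abs(new - lam[j]))
            lam[j] = new
        if biggest_move < tol:
            break
    return lam


def dantzig_violation(gram, beta_hat, eta, lam):
    """Largest excess of |(G lam)_j - beta_hat_j| over eta_j (<= 0 means feasible)."""
    m = len(lam)
    return max(abs(sum(gram[j][k] * lam[k] for k in range(m)) - beta_hat[j]) - eta[j]
               for j in range(m))
```

A typical call would draw a sample on [0, 1], build the dictionary and its Gram matrix, compute `beta_hat` and `eta` with a constant `gamma` slightly above 1, run `weighted_lasso`, and check with `dantzig_violation` that the result is feasible up to numerical tolerance. The choice of `gamma` is the calibration question the abstract refers to; the value used in any experiment here would be a guess, not a recommendation from the paper.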

DOI: 10.1214/09-AIHP351
Classification: 62G07, 62G05, 62G20
Keywords: calibration, concentration inequalities, Dantzig estimate, density estimation, dictionary, Lasso estimate, oracle inequalities, sparsity
@article{AIHPB_2011__47_1_43_0,
     author = {Bertin, K. and Le Pennec, E. and Rivoirard, V.},
     title = {Adaptive {Dantzig} density estimation},
     journal = {Annales de l'I.H.P. Probabilit\'es et statistiques},
     pages = {43--74},
     publisher = {Gauthier-Villars},
     volume = {47},
     number = {1},
     year = {2011},
     doi = {10.1214/09-AIHP351},
     mrnumber = {2779396},
     zbl = {1207.62077},
     language = {en},
     url = {}
}
Bertin, K.; Le Pennec, E.; Rivoirard, V. Adaptive Dantzig density estimation. Annales de l'I.H.P. Probabilités et statistiques, Volume 47 (2011) no. 1, pp. 43-74. doi: 10.1214/09-AIHP351.

[1] S. Arlot and P. Massart. Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. 10 (2009) 245-279.

[2] M. S. Asif and J. Romberg. Dantzig selector homotopy with dynamic measurements. In Proceedings of SPIE Computational Imaging VII 7246 (2009) 72460E.

[3] P. Bickel, Y. Ritov and A. Tsybakov. Simultaneous analysis of Lasso and Dantzig selector. Ann. Statist. 37 (2009) 1705-1732.

[4] L. Birgé. Model selection for density estimation with L2-loss, 2008. Available at arXiv 0808.1416.

[5] L. Birgé and P. Massart. Minimal penalties for Gaussian model selection. Probab. Theory Related Fields 138 (2007) 33-73.

[6] F. Bunea, A. Tsybakov and M. Wegkamp. Aggregation and sparsity via ℓ1 penalized least squares. In Learning Theory 379-391. Lecture Notes in Comput. Sci. 4005. Springer, Berlin, 2006.

[7] F. Bunea, A. Tsybakov and M. Wegkamp. Sparse density estimation with ℓ1 penalties. In Learning Theory 530-543. Lecture Notes in Comput. Sci. 4539. Springer, Berlin, 2007.

[8] F. Bunea, A. Tsybakov and M. Wegkamp. Sparsity oracle inequalities for the LASSO. Electron. J. Statist. 1 (2007) 169-194.

[9] F. Bunea, A. Tsybakov and M. Wegkamp. Aggregation for Gaussian regression. Ann. Statist. 35 (2007) 1674-1697.

[10] F. Bunea, A. Tsybakov, M. Wegkamp and A. Barbu. SPADES and mixture models. Ann. Statist. (2010). To appear. Available at arXiv 0901.2044.

[11] F. Bunea. Consistent selection via the Lasso for high dimensional approximating regression models. In Pushing the Limits of Contemporary Statistics: Contributions in Honor of J. K. Ghosh 122-137. Inst. Math. Stat. Collect. 3. IMS, Beachwood, OH, 2008.

[12] E. Candès and Y. Plan. Near-ideal model selection by ℓ1 minimization. Ann. Statist. 37 (2009) 2145-2177.

[13] E. Candès and T. Tao. The Dantzig selector: Statistical estimation when p is much larger than n. Ann. Statist. 35 (2007) 2313-2351.

[14] S. Chen, D. Donoho and M. Saunders. Atomic decomposition by basis pursuit. SIAM Rev. 43 (2001) 129-159.

[15] D. Donoho, M. Elad and V. Temlyakov. Stable recovery of sparse overcomplete representations in the presence of noise. IEEE Trans. Inform. Theory 52 (2006) 6-18.

[16] D. Donoho and I. Johnstone. Ideal spatial adaptation via wavelet shrinkage. Biometrika 81 (1994) 425-455.

[17] B. Efron, T. Hastie, I. Johnstone and R. Tibshirani. Least angle regression. Ann. Statist. 32 (2004) 407-499.

[18] A. Juditsky and S. Lambert-Lacroix. On minimax density estimation on ℝ. Bernoulli 10 (2004) 187-220.

[19] K. Knight and W. Fu. Asymptotics for Lasso-type estimators. Ann. Statist. 28 (2000) 1356-1378.

[20] K. Lounici. Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electron. J. Statist. 2 (2008) 90-102.

[21] P. Massart. Concentration inequalities and model selection. Lecture Notes in Math. 1896. Springer, Berlin, 2007. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6-23, 2003.

[22] N. Meinshausen and P. Bühlmann. High-dimensional graphs and variable selection with the Lasso. Ann. Statist. 34 (2006) 1436-1462.

[23] N. Meinshausen and B. Yu. Lasso-type recovery of sparse representations for high-dimensional data. Ann. Statist. 37 (2009) 246-270.

[24] M. Osborne, B. Presnell and B. Turlach. On the Lasso and its dual. J. Comput. Graph. Statist. 9 (2000) 319-337.

[25] M. Osborne, B. Presnell and B. Turlach. A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20 (2000) 389-404.

[26] P. Reynaud-Bouret and V. Rivoirard. Near optimal thresholding estimation of a Poisson intensity on the real line. Electron. J. Statist. 4 (2010) 172-238.

[27] P. Reynaud-Bouret, V. Rivoirard and C. Tuleau. Adaptive density estimation: A curse of support? 2009. Available at arXiv 0907.1794.

[28] R. Tibshirani. Regression shrinkage and selection via the Lasso. J. Roy. Statist. Soc. Ser. B 58 (1996) 267-288.

[29] S. van de Geer. High-dimensional generalized linear models and the Lasso. Ann. Statist. 36 (2008) 614-645.

[30] B. Yu and P. Zhao. On model selection consistency of Lasso estimators. J. Mach. Learn. Res. 7 (2006) 2541-2567.

[31] C. Zhang and J. Huang. The sparsity and bias of the Lasso selection in high-dimensional linear regression. Ann. Statist. 36 (2008) 1567-1594.

[32] H. Zou. The adaptive Lasso and its oracle properties. J. Amer. Statist. Assoc. 101 (2006) 1418-1429.
