On two extensions of the vector quantization scheme
[Sur deux extensions du principe de quantification vectorielle]
Journal de la société française de statistique, Tome 156 (2015) no. 1, pp. 51-75.

Dans cet article, nous présentons des résultats relatifs à deux extensions différentes de la quantification vectorielle et de la question liée de classification par la méthode des centres mobiles. La première partie de l’article concerne la performance théorique de la quantification et du clustering avec des divergences de Bregman ; la seconde est dédiée à des problèmes de sélection de modèle pour les courbes principales. Chaque partie est complétée par quelques illustrations numériques.

In this paper, we present results pertaining to two different extensions of vector quantization and the related question of k - means clustering. The first part of the paper is about the theoretical performance of quantization and clustering with Bregman divergences. The second one is dedicated to model selection issues for principal curves. Some numerical illustrations are provided in each case.

Keywords: quantization, clustering, Bregman divergences, principal curves, model selection
Mot clés : quantification, classification non supervisée, divergences de Bregman, courbes principales, sélection de modèle
@article{JSFS_2015__156_1_51_0,
     author = {Fischer, Aur\'elie},
     title = {On two extensions of the vector quantization scheme},
     journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
     pages = {51--75},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {156},
     number = {1},
     year = {2015},
     mrnumber = {3338240},
     zbl = {1316.62087},
     language = {en},
     url = {http://www.numdam.org/item/JSFS_2015__156_1_51_0/}
}
TY  - JOUR
AU  - Fischer, Aurélie
TI  - On two extensions of the vector quantization scheme
JO  - Journal de la société française de statistique
PY  - 2015
SP  - 51
EP  - 75
VL  - 156
IS  - 1
PB  - Société française de statistique
UR  - http://www.numdam.org/item/JSFS_2015__156_1_51_0/
LA  - en
ID  - JSFS_2015__156_1_51_0
ER  - 
%0 Journal Article
%A Fischer, Aurélie
%T On two extensions of the vector quantization scheme
%J Journal de la société française de statistique
%D 2015
%P 51-75
%V 156
%N 1
%I Société française de statistique
%U http://www.numdam.org/item/JSFS_2015__156_1_51_0/
%G en
%F JSFS_2015__156_1_51_0
Fischer, Aurélie. On two extensions of the vector quantization scheme. Journal de la société française de statistique, Tome 156 (2015) no. 1, pp. 51-75. http://www.numdam.org/item/JSFS_2015__156_1_51_0/

[1] Alber, Y.; Butnariu, D. Convergence of Bregman projection methods for solving consistent convex feasibility problems in reflexive Banach spaces, Journal of Optimization Theory and Applications, Volume 92 (1997), pp. 33-61 | MR | Zbl

[2] Art, D.; Gnanadesikan, R.; Kettenring, J. R. Data-based metrics for cluster analysis, Utilitas Mathematica, Volume 21A (1982), pp. 75-99 | MR | Zbl

[3] Alcorn, T. M.; Hoggar, C. W. Preprocessing of data for character recognition, Marconi Review (1969), pp. 61-81

[4] Akaike, H. Information theory and an extension of the maximum likelihood principle, Proceedings of the 2nd International Symposium on Information Theory (1973), pp. 267-281 | MR | Zbl

[5] Arlot, S.; Massart, P. Data-driven calibration of penalties for least-squares regression, Journal of Machine Learning Research, Volume 10 (2009), pp. 245-279

[6] Alexandrov, A. D.; Reshetnyak, Y. G. General Theory of Irregular Curves, Mathematics and its Applications, Kluwer Academic Publishers, Dordrecht, 1989 | MR | Zbl

[7] Bauschke, H. H.; Borwein, J. M.; Combettes, P. L. Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces, Communications in Contemporary Mathematics, Volume 3 (2001), pp. 615-647 | MR | Zbl

[8] Barron, A.; Birgé, L.; Massart, P. Risk bounds for model selection via penalization, Probability Theory and Related Fields, Volume 113 (1999), pp. 301-413 | MR | Zbl

[9] Biau, G.; Devroye, L.; Lugosi, G. On the performance of clustering in Hilbert spaces, IEEE Transactions on Information Theory, Volume 54 (2008), pp. 781-790 | MR | Zbl

[10] Biau, G.; Fischer, A. Parameter selection for principal curves, IEEE Transactions on Information Theory, Volume 58 (2012), pp. 1924-1939 | MR | Zbl

[11] Banerjee, A.; Guo, X.; Wang, H. On the optimality of conditional expectation as a Bregman predictor, IEEE Transactions on Information Theory, Volume 51 (2005) | MR | Zbl

[12] Birgé, L.; Massart, P. Gaussian model selection, Journal of the European Mathematical Society, Volume 3 (2001), pp. 203-268 | MR | Zbl

[13] Birgé, L.; Massart, P. Minimal penalties for Gaussian model selection, Probability Theory and Related Fields, Volume 138 (2007), pp. 33-73 | MR | Zbl

[14] Birgé, L.; Massart, P. From model selection to adaptive estimation, Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics (Pollard, D.; Torgersen, E.; Yang, G., eds.), Springer, New York, 1997, pp. 55-87 | MR | Zbl

[15] Banerjee, A.; Merugu, S.; Dhillon, I. S.; Ghosh, J. Clustering with Bregman divergences, Journal of Machine Learning Research, Volume 6 (2005), pp. 1705-1749 | MR | Zbl

[16] Baudry, J.-P.; Maugis, C.; Michel, B. Slope heuristics: overview and implementation, Statistics and Computing, Volume 22 (2012), pp. 455-470 | MR | Zbl

[17] Bregman, L. M. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, USSR Computational Mathematics and Mathematical Physics, Volume 7 (1967), pp. 200-217 | MR | Zbl

[18] Brunsdon, C. Path estimation from GPS tracks, Proceedings of the 9th International Conference on GeoComputation, National Centre for Geocomputation, National University of Ireland, Maynooth, Eire (2007)

[19] Cesa-Bianchi, N.; Lugosi, G. Prediction, Learning, and Games, Cambridge University Press, New York, 2006 | MR | Zbl

[20] Caillerie, C.; Michel, B. Model selection for simplicial approximation, Foundations of Computational Mathematics, Volume 11 (2011), pp. 707-731 | MR | Zbl

[21] Csiszár, I. Generalized projections for non-negative functions, Acta Mathematica Hungarica, Volume 68 (1995), pp. 161-185 | MR | Zbl

[22] Deutsch, E. S. Preprocessing for character recognition, Proceedings of the IEE–NPL Conference on Pattern Recognition (1968), pp. 179-190

[23] Engdahl, E. R.; Villaseñor, A. Global seismicity: 1900–1999, International Handbook of Earthquake and Engineering Seismology (Lee, W.H.K.; Kanamori, H.; Jennings, P.C.; Kisslinger, C., eds.), Academic Press, London, 2002, pp. 665-690

[24] Fischer, A. Quantization and clustering with Bregman divergences, Journal of Multivariate Analysis, Volume 101 (2010), pp. 2207-2221 | MR | Zbl

[25] Fischer, A. Selecting the length of a principal curve within a Gaussian model, Electronic Journal of Statistics, Volume 7 (2013), pp. 342-363 | MR | Zbl

[26] Fischer, A. Deux méthodes d’apprentissage non supervisé: synthèse sur la méthode des centres mobiles et présentation des courbes principales, Journal de la Société Française de Statistique, Volume 155 (2014), pp. 2-35 | Numdam | MR | Zbl

[27] Frigyik, B. A.; Srivastava, S.; Gupta, M. R. An introduction to functional derivatives (2008) no. UWEETR-2008-0001 (Technical report)

[28] Frigyik, B. A.; Srivastava, S.; Gupta, M. R. Functional Bregman divergence and Bayesian estimation of distributions, IEEE Transactions on Information Theory, Volume 54 (2008), pp. 5130-5139 | MR | Zbl

[29] Gray, R. M.; Buzo, A.; Gray, A. H.; Matsuyama, Y. Distortion measures for speech processing, IEEE Transactions on Acoustics, Speech and Signal Processing, Volume 28 (1980), pp. 367-376 | Zbl

[30] Gersho, A.; Gray, R. M. Vector Quantization and Signal Compression, Kluwer Academic Publishers, Norwell, 1992 | Zbl

[31] Graf, S.; Luschgy, H. Foundations of Quantization for Probability Distributions, Lecture Notes in Mathematics, Springer-Verlag, Berlin, Heidelberg, 2000 | MR | Zbl

[32] Hastie, T.; Stuetzle, W. Principal curves, Journal of the American Statistical Association, Volume 84 (1989), pp. 502-516 | MR | Zbl

[33] Jones, L.; Byrne, C. General entropy criteria for inverse problems, with applications to data compression, pattern classification, and cluster analysis, IEEE Transactions on Information Theory, Volume 36 (1990) | MR | Zbl

[34] Jordan, M. I.; Nguyen, X.; Wainwright, M. J. Estimating divergence functionals and the likelihood ratio by convex risk minimization, IEEE Transactions on Information Theory, Volume 56 (2010), pp. 5847-5861 | MR | Zbl

[35] Kolmogorov, A. N.; Fomin, S. V. Introductory Real Analysis, Dover Publications, Mineola, 1975 | MR

[36] Kégl, B.; Krzyżak, A.; Linder, T.; Zeger, K. Learning and design of principal curves, IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 22 (2000), pp. 281-297

[37] Kruskal, J. On the shortest spanning subtree of a graph and the traveling salesman problem, Proceedings of the American Mathematical Society, Volume 7 (1956), pp. 48-50 | MR | Zbl

[38] Kégl, B. Principal Curves: Learning, Design, and Applications, Concordia University, Montréal, Québec, Canada (1999) (Ph. D. Thesis)

[39] Lebarbier, E. Detecting multiple change-points in the mean of Gaussian process by model selection, Signal Processing, Volume 85 (2005), pp. 717-736 | Zbl

[40] Lerasle, M. Optimal model selection in density estimation, Annales de l’Institut Henri Poincaré, Volume 48 (2012), pp. 884-908 | MR | Zbl

[41] Linder, T. Learning-theoretic methods in vector quantization, Principles of Nonparametric Learning (Györfi, L., ed.), Springer-Verlag, Wien, 2002 | MR

[42] Lloyd, S. P. Least squares quantization in PCM, IEEE Transactions on Information Theory, Volume 28 (1982), pp. 129-137 | MR | Zbl

[43] Mallows, C. L. Some comments on C p , Technometrics, Volume 15 (1973), pp. 661-675 | Zbl

[44] Massart, P. Concentration Inequalities and Model Selection (Picard, Jean, ed.), Ecole d’Eté de Probabilités de Saint-Flour XXXIII – 2003, Lecture Notes in Mathematics, Springer, Berlin, Heidelberg, 2007 | MR | Zbl

[45] Nielsen, F.; Boissonnat, J.D.; Nock, R Bregman Voronoi diagrams: properties, algorithms and applications (2007) no. 6154 (Technical report) | Zbl

[46] Prim, R. C. Shortest connection networks and some generalizations, Bell System Technology Journal, Volume 36 (1957), pp. 1389-1401

[47] Rockafellar, R. T. Convex Analysis, Princeton University Press, Princeton, New Jersey, 1970 | MR

[48] Saumard, A. The slope heuristics in heteroscedastic regression, Electronic Journal of Statistics, Volume 7 (2013), pp. 1184-1223 | MR | Zbl

[49] Schwarz, G. Estimating the dimension of a model, Annals of Statistics, Volume 6 (1978), pp. 33-73 | MR | Zbl

[50] Strehl, A.; Ghosh, J. Cluster ensembles - A knowledge reuse framework for combining multiple partitions, Journal of Machine Learning Research, Volume 3 (2002), pp. 583-617 | MR | Zbl

[51] Sandilya, S.; Kulkarni, S. R. Principal curves with bounded turn, IEEE Transactions on Information Theory, Volume 48 (2002), pp. 2789-2793 | MR | Zbl

[52] Tarsitano, A Mahalanobis metrics for k -means algorithms, Atti del Convegno intermedio SIS, Napoli (2003)