On two extensions of the vector quantization scheme
[Sur deux extensions du principe de quantification vectorielle]
Journal de la société française de statistique, Tome 156 (2015) no. 1, pp. 51-75.

Dans cet article, nous présentons des résultats relatifs à deux extensions différentes de la quantification vectorielle et de la question liée de classification par la méthode des centres mobiles. La première partie de l’article concerne la performance théorique de la quantification et du clustering avec des divergences de Bregman ; la seconde est dédiée à des problèmes de sélection de modèle pour les courbes principales. Chaque partie est complétée par quelques illustrations numériques.

In this paper, we present results pertaining to two different extensions of vector quantization and the related question of k - means clustering. The first part of the paper is about the theoretical performance of quantization and clustering with Bregman divergences. The second one is dedicated to model selection issues for principal curves. Some numerical illustrations are provided in each case.

Mots clés : quantification, classification non supervisée, divergences de Bregman, courbes principales, sélection de modèle
@article{JSFS_2015__156_1_51_0,
     author = {Fischer, Aur\'elie},
     title = {On two extensions of the vector quantization scheme},
     journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
     pages = {51--75},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {156},
     number = {1},
     year = {2015},
     zbl = {1316.62087},
     mrnumber = {3338240},
     language = {en},
     url = {http://www.numdam.org/item/JSFS_2015__156_1_51_0/}
}
TY  - JOUR
AU  - Fischer, Aurélie
TI  - On two extensions of the vector quantization scheme
JO  - Journal de la société française de statistique
PY  - 2015
DA  - 2015///
SP  - 51
EP  - 75
VL  - 156
IS  - 1
PB  - Société française de statistique
UR  - http://www.numdam.org/item/JSFS_2015__156_1_51_0/
UR  - https://zbmath.org/?q=an%3A1316.62087
UR  - https://www.ams.org/mathscinet-getitem?mr=3338240
LA  - en
ID  - JSFS_2015__156_1_51_0
ER  - 
Fischer, Aurélie. On two extensions of the vector quantization scheme. Journal de la société française de statistique, Tome 156 (2015) no. 1, pp. 51-75. http://www.numdam.org/item/JSFS_2015__156_1_51_0/

[1] Alber, Y.; Butnariu, D. Convergence of Bregman projection methods for solving consistent convex feasibility problems in reflexive Banach spaces, Journal of Optimization Theory and Applications, Volume 92 (1997), pp. 33-61 | MR 1428127 | Zbl 0886.90179

[2] Art, D.; Gnanadesikan, R.; Kettenring, J. R. Data-based metrics for cluster analysis, Utilitas Mathematica, Volume 21A (1982), pp. 75-99 | MR 683903 | Zbl 0501.62050

[3] Alcorn, T. M.; Hoggar, C. W. Preprocessing of data for character recognition, Marconi Review (1969), pp. 61-81

[4] Akaike, H. Information theory and an extension of the maximum likelihood principle, Proceedings of the 2nd International Symposium on Information Theory (1973), pp. 267-281 | MR 483125 | Zbl 0283.62006

[5] Arlot, S.; Massart, P. Data-driven calibration of penalties for least-squares regression, Journal of Machine Learning Research, Volume 10 (2009), pp. 245-279

[6] Alexandrov, A. D.; Reshetnyak, Y. G. General Theory of Irregular Curves, Mathematics and its Applications, Kluwer Academic Publishers, Dordrecht, 1989 | MR 1117220 | Zbl 0691.53002

[7] Bauschke, H. H.; Borwein, J. M.; Combettes, P. L. Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces, Communications in Contemporary Mathematics, Volume 3 (2001), pp. 615-647 | MR 1869107 | Zbl 1032.49025

[8] Barron, A.; Birgé, L.; Massart, P. Risk bounds for model selection via penalization, Probability Theory and Related Fields, Volume 113 (1999), pp. 301-413 | MR 1679028 | Zbl 0946.62036

[9] Biau, G.; Devroye, L.; Lugosi, G. On the performance of clustering in Hilbert spaces, IEEE Transactions on Information Theory, Volume 54 (2008), pp. 781-790 | MR 2444554 | Zbl 1304.62088

[10] Biau, G.; Fischer, A. Parameter selection for principal curves, IEEE Transactions on Information Theory, Volume 58 (2012), pp. 1924-1939 | MR 2932874 | Zbl 1365.62262

[11] Banerjee, A.; Guo, X.; Wang, H. On the optimality of conditional expectation as a Bregman predictor, IEEE Transactions on Information Theory, Volume 51 (2005) | MR 2246384 | Zbl 1284.94025

[12] Birgé, L.; Massart, P. Gaussian model selection, Journal of the European Mathematical Society, Volume 3 (2001), pp. 203-268 | MR 1848946 | Zbl 1037.62001

[13] Birgé, L.; Massart, P. Minimal penalties for Gaussian model selection, Probability Theory and Related Fields, Volume 138 (2007), pp. 33-73 | MR 2288064 | Zbl 1112.62082

[14] Birgé, L.; Massart, P. From model selection to adaptive estimation, Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics (Pollard, D.; Torgersen, E.; Yang, G., eds.), Springer, New York, 1997, pp. 55-87 | MR 1462939 | Zbl 0920.62042

[15] Banerjee, A.; Merugu, S.; Dhillon, I. S.; Ghosh, J. Clustering with Bregman divergences, Journal of Machine Learning Research, Volume 6 (2005), pp. 1705-1749 | MR 2249870 | Zbl 1190.62117

[16] Baudry, J.-P.; Maugis, C.; Michel, B. Slope heuristics: overview and implementation, Statistics and Computing, Volume 22 (2012), pp. 455-470 | MR 2865029 | Zbl 1322.62007

[17] Bregman, L. M. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, USSR Computational Mathematics and Mathematical Physics, Volume 7 (1967), pp. 200-217 | MR 215617 | Zbl 0186.23807

[18] Brunsdon, C. Path estimation from GPS tracks, Proceedings of the 9th International Conference on GeoComputation, National Centre for Geocomputation, National University of Ireland, Maynooth, Eire (2007)

[19] Cesa-Bianchi, N.; Lugosi, G. Prediction, Learning, and Games, Cambridge University Press, New York, 2006 | MR 2409394 | Zbl 1114.91001

[20] Caillerie, C.; Michel, B. Model selection for simplicial approximation, Foundations of Computational Mathematics, Volume 11 (2011), pp. 707-731 | MR 2859953 | Zbl 1231.62123

[21] Csiszár, I. Generalized projections for non-negative functions, Acta Mathematica Hungarica, Volume 68 (1995), pp. 161-185 | MR 1320794 | Zbl 0837.62006

[22] Deutsch, E. S. Preprocessing for character recognition, Proceedings of the IEE–NPL Conference on Pattern Recognition (1968), pp. 179-190

[23] Engdahl, E. R.; Villaseñor, A. Global seismicity: 1900–1999, International Handbook of Earthquake and Engineering Seismology (Lee, W.H.K.; Kanamori, H.; Jennings, P.C.; Kisslinger, C., eds.), Academic Press, London, 2002, pp. 665-690

[24] Fischer, A. Quantization and clustering with Bregman divergences, Journal of Multivariate Analysis, Volume 101 (2010), pp. 2207-2221 | MR 2671211 | Zbl 1201.62080

[25] Fischer, A. Selecting the length of a principal curve within a Gaussian model, Electronic Journal of Statistics, Volume 7 (2013), pp. 342-363 | MR 3020424 | Zbl 1337.62074

[26] Fischer, A. Deux méthodes d’apprentissage non supervisé: synthèse sur la méthode des centres mobiles et présentation des courbes principales, Journal de la Société Française de Statistique, Volume 155 (2014), pp. 2-35 | MR 3211751 | Zbl 1316.62086

[27] Frigyik, B. A.; Srivastava, S.; Gupta, M. R. An introduction to functional derivatives (2008) no. UWEETR-2008-0001 (Technical report)

[28] Frigyik, B. A.; Srivastava, S.; Gupta, M. R. Functional Bregman divergence and Bayesian estimation of distributions, IEEE Transactions on Information Theory, Volume 54 (2008), pp. 5130-5139 | MR 2589887 | Zbl 1319.62137

[29] Gray, R. M.; Buzo, A.; Gray, A. H.; Matsuyama, Y. Distortion measures for speech processing, IEEE Transactions on Acoustics, Speech and Signal Processing, Volume 28 (1980), pp. 367-376 | Zbl 0524.94011

[30] Gersho, A.; Gray, R. M. Vector Quantization and Signal Compression, Kluwer Academic Publishers, Norwell, 1992 | Zbl 0782.94001

[31] Graf, S.; Luschgy, H. Foundations of Quantization for Probability Distributions, Lecture Notes in Mathematics, Springer-Verlag, Berlin, Heidelberg, 2000 | MR 1764176 | Zbl 0951.60003

[32] Hastie, T.; Stuetzle, W. Principal curves, Journal of the American Statistical Association, Volume 84 (1989), pp. 502-516 | MR 1010339 | Zbl 0679.62048

[33] Jones, L.; Byrne, C. General entropy criteria for inverse problems, with applications to data compression, pattern classification, and cluster analysis, IEEE Transactions on Information Theory, Volume 36 (1990) | MR 1043277 | Zbl 0731.62016

[34] Jordan, M. I.; Nguyen, X.; Wainwright, M. J. Estimating divergence functionals and the likelihood ratio by convex risk minimization, IEEE Transactions on Information Theory, Volume 56 (2010), pp. 5847-5861 | MR 2808937 | Zbl 1366.62071

[35] Kolmogorov, A. N.; Fomin, S. V. Introductory Real Analysis, Dover Publications, Mineola, 1975 | MR 377445

[36] Kégl, B.; Krzyżak, A.; Linder, T.; Zeger, K. Learning and design of principal curves, IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 22 (2000), pp. 281-297

[37] Kruskal, J. On the shortest spanning subtree of a graph and the traveling salesman problem, Proceedings of the American Mathematical Society, Volume 7 (1956), pp. 48-50 | MR 78686 | Zbl 0070.18404

[38] Kégl, B. Principal Curves: Learning, Design, and Applications (1999) (Ph. D. Thesis)

[39] Lebarbier, E. Detecting multiple change-points in the mean of Gaussian process by model selection, Signal Processing, Volume 85 (2005), pp. 717-736 | Zbl 1148.94403

[40] Lerasle, M. Optimal model selection in density estimation, Annales de l’Institut Henri Poincaré, Volume 48 (2012), pp. 884-908 | MR 2976568 | Zbl 1244.62052

[41] Linder, T. Learning-theoretic methods in vector quantization, Principles of Nonparametric Learning (Györfi, L., ed.), Springer-Verlag, Wien, 2002 | MR 1987659

[42] Lloyd, S. P. Least squares quantization in PCM, IEEE Transactions on Information Theory, Volume 28 (1982), pp. 129-137 | MR 651807 | Zbl 0504.94015

[43] Mallows, C. L. Some comments on C p , Technometrics, Volume 15 (1973), pp. 661-675 | Zbl 0269.62061

[44] Massart, P. Concentration Inequalities and Model Selection (Picard, Jean, ed.), Ecole d’Eté de Probabilités de Saint-Flour XXXIII – 2003, Lecture Notes in Mathematics, Springer, Berlin, Heidelberg, 2007 | MR 2319879 | Zbl 1170.60006

[45] Nielsen, F.; Boissonnat, J.D.; Nock, R Bregman Voronoi diagrams: properties, algorithms and applications (2007) no. 6154 (Technical report) | Zbl 1302.68292

[46] Prim, R. C. Shortest connection networks and some generalizations, Bell System Technology Journal, Volume 36 (1957), pp. 1389-1401

[47] Rockafellar, R. T. Convex Analysis, Princeton University Press, Princeton, New Jersey, 1970 | MR 274683

[48] Saumard, A. The slope heuristics in heteroscedastic regression, Electronic Journal of Statistics, Volume 7 (2013), pp. 1184-1223 | MR 3056072 | Zbl 1337.62083

[49] Schwarz, G. Estimating the dimension of a model, Annals of Statistics, Volume 6 (1978), pp. 33-73 | MR 468014 | Zbl 0379.62005

[50] Strehl, A.; Ghosh, J. Cluster ensembles - A knowledge reuse framework for combining multiple partitions, Journal of Machine Learning Research, Volume 3 (2002), pp. 583-617 | MR 1991087 | Zbl 1084.68759

[51] Sandilya, S.; Kulkarni, S. R. Principal curves with bounded turn, IEEE Transactions on Information Theory, Volume 48 (2002), pp. 2789-2793 | MR 1930347 | Zbl 1062.62506

[52] Tarsitano, A Mahalanobis metrics for k -means algorithms, Atti del Convegno intermedio SIS, Napoli (2003)