Let $X$ be a random element in a metric space $(\mathcal{F},d)$, and let $Y$ be a random variable with value $0$ or $1$. $Y$ is called the class, or the label, of $X$. Let $(X_i,Y_i)_{1\le i\le n}$ be an observed i.i.d. sample having the same law as $(X,Y)$. The problem of classification is to predict the label of a new random element $X$. The $k$-nearest neighbor classifier is the following simple rule: look at the $k$ nearest neighbors of $X$ in the training sample and choose $0$ or $1$ for its label according to the majority vote. When $(\mathcal{F},d)=(\mathbb{R}^d,\|\cdot\|)$, Stone (1977) proved the universal consistency of this classifier: its probability of error converges to the Bayes error, whatever the distribution of $(X,Y)$. We show in this paper that this result is no longer valid in general metric spaces. However, if $(\mathcal{F},d)$ is separable and if some regularity condition is assumed, then the $k$-nearest neighbor classifier is weakly consistent.
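The majority-vote rule described in the abstract can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the names `knn_classify` and `sup_dist` are hypothetical, the metric is passed in as an arbitrary function so that the sample points may live in any metric space, and the two tie-breaking conventions (distance ties by sample order, vote ties toward label 0) are assumptions that the abstract does not specify.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def knn_classify(
    x: T,
    sample: Sequence[T],
    labels: Sequence[int],
    k: int,
    dist: Callable[[T, T], float],
) -> int:
    """Predict the label (0 or 1) of x by majority vote among its k nearest neighbors."""
    if len(sample) != len(labels):
        raise ValueError("sample and labels must have the same length")
    if not 1 <= k <= len(sample):
        raise ValueError("k must be between 1 and the sample size")
    # Rank sample indices by distance to x. sorted() is stable, so distance
    # ties are broken by sample order (an assumed convention).
    order = sorted(range(len(sample)), key=lambda i: dist(x, sample[i]))
    votes_for_one = sum(labels[i] for i in order[:k])
    # A strict majority is required for label 1; vote ties (possible when k
    # is even) fall back to label 0 (again an assumed convention).
    return 1 if 2 * votes_for_one > k else 0


def sup_dist(f: Sequence[float], g: Sequence[float]) -> float:
    """Sup-norm distance between two curves sampled on a common grid."""
    return max(abs(a - b) for a, b in zip(f, g))


# Toy training sample: each point is a curve evaluated at two grid points.
sample = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
labels = [0, 0, 1, 1, 1]
```

Calling `knn_classify((0.05, 0.1), sample, labels, 3, sup_dist)` looks at the three closest curves in sup-norm and returns the majority label among them. Because only `dist` touches the structure of the points, the same function works for vectors, discretized curves, or any other objects on which a metric is available.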
Classification: 62H30
Keywords: classification, consistency, nonparametric statistics
@article{PS_2006__10__340_0,
  author = {C\'erou, Fr\'ed\'eric and Guyader, Arnaud},
  title = {Nearest neighbor classification in infinite dimension},
  journal = {ESAIM: Probability and Statistics},
  pages = {340--355},
  publisher = {EDP-Sciences},
  volume = {10},
  year = {2006},
  doi = {10.1051/ps:2006014},
  mrnumber = {2247925},
  language = {en},
  url = {http://www.numdam.org/articles/10.1051/ps:2006014/}
}
TY - JOUR
AU - Cérou, Frédéric
AU - Guyader, Arnaud
TI - Nearest neighbor classification in infinite dimension
JO - ESAIM: Probability and Statistics
PY - 2006
DA - 2006///
SP - 340
EP - 355
VL - 10
PB - EDP-Sciences
UR - http://www.numdam.org/articles/10.1051/ps:2006014/
UR - https://www.ams.org/mathscinet-getitem?mr=2247925
UR - https://doi.org/10.1051/ps:2006014
DO - 10.1051/ps:2006014
LA - en
ID - PS_2006__10__340_0
ER -
Cérou, Frédéric; Guyader, Arnaud. Nearest neighbor classification in infinite dimension. ESAIM: Probability and Statistics, Volume 10 (2006), pp. 340-355. doi: 10.1051/ps:2006014. http://www.numdam.org/articles/10.1051/ps:2006014/
[1] On the kernel rule for function classification. Submitted (2003). | Zbl 1100.62066
[2] On the kernel rule for function classification. IEEE Trans. Inform. Theory, to appear (2005). | MR 2235289
[3] Nearest neighbor pattern classification. IEEE Trans. Inform. Theory IT-13 (1967) 21-27. | Zbl 0154.44505
[4] Nonparametric regression estimation when the regressor takes its values in a metric space. Submitted (2001). | Zbl 1020.62034
[5] On the almost everywhere convergence of nonparametric regression function estimates. Ann. Statist. 9 (1981) 1310-1319. | Zbl 0477.62025
[6] On the strong universal consistency of nearest neighbor regression function estimates. Ann. Statist. 22 (1994) 1371-1385. | Zbl 0817.62038
[7] A probabilistic theory of pattern recognition. Applications of Mathematics (New York) 31. Springer-Verlag, New York (1996). | MR 1383093 | Zbl 0853.68150
[8] Measure theory and fine properties of functions. Studies in Advanced Mathematics. CRC Press, Boca Raton, FL (1992). | MR 1158660 | Zbl 0804.28001
[9] Geometric measure theory. Die Grundlehren der mathematischen Wissenschaften, Band 153. Springer-Verlag New York Inc., New York (1969). | MR 257325 | Zbl 0176.00801
[10] Gaussian measures and the density theorem. Comment. Math. Univ. Carolin. 22 (1981) 181-193. | Zbl 0459.28015
[11] Dimension of metrics and differentiation of measures, in General topology and its relations to modern analysis and algebra, V (Prague, 1981), Sigma Ser. Pure Math. 3, Heldermann, Berlin (1983) 565-568. | Zbl 0502.28002
[12] Differentiation of measures on Hilbert spaces, in Measure theory, Oberwolfach 1981 (Oberwolfach, 1981), Lect. Notes Math. 945, Springer, Berlin (1982) 194-207. | Zbl 0495.28010
[13] Stone, Consistent nonparametric regression. Ann. Statist. 5 (1977) 595-645. With discussion and a reply by the author. | Zbl 0366.62051