Prédiction randomisée de suites individuelles

Lugosi, Gábor

Lugosi, Gábor

Journal de la Société française de statistique, Tome 147 (2006) no. 1, pp. 5-37.

Export
Comment citer

@article{JSFS_2006__147_1_5_0,
     author = {Lugosi, G\'abor},
     title = {Pr\'ediction randomis\'ee de suites individuelles},
     journal = {Journal de la Soci\'et\'e fran\c{c}aise de statistique},
     pages = {5--37},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {147},
     number = {1},
     year = {2006},
     language = {fr},
     url = {http://www.numdam.org/item/JSFS_2006__147_1_5_0/}
}

TY  - JOUR
AU  - Lugosi, Gábor
TI  - Prédiction randomisée de suites individuelles
JO  - Journal de la Société française de statistique
PY  - 2006
SP  - 5
EP  - 37
VL  - 147
IS  - 1
PB  - Société française de statistique
UR  - http://www.numdam.org/item/JSFS_2006__147_1_5_0/
LA  - fr
ID  - JSFS_2006__147_1_5_0
ER  -

%0 Journal Article
%A Lugosi, Gábor
%T Prédiction randomisée de suites individuelles
%J Journal de la Société française de statistique
%D 2006
%P 5-37
%V 147
%N 1
%I Société française de statistique
%U http://www.numdam.org/item/JSFS_2006__147_1_5_0/
%G fr
%F JSFS_2006__147_1_5_0

Lugosi, Gábor. Prédiction randomisée de suites individuelles. Journal de la Société française de statistique, Tome 147 (2006) no. 1, pp. 5-37. http://www.numdam.org/item/JSFS_2006__147_1_5_0/

Bibliographie
Cité par

[1] Auer P.(2002). Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3 :397-422. | MR | Zbl

[2] Auer P., Cesa-Bianchi N., Freund Y. et Schapire R.(2002). The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32 :48-77. | MR | Zbl

[3] Auer P. et Long P.M.(1999). Structural results for online learning models with and without queries. Machine Learning, 36 :147-181. | Zbl

[4] Awerbuch B. et Kleinberg R. D.(2004). Adaptive routing with end-to-end feedback : distributed learning and geometric approaches. In Proceedings of the 36th ACM Symposium on the Theory of Computing, pages 45-53. ACM Press. | MR | Zbl

[5] Azuma K.(1967). Weighted sums of certain dependent random variables. Tohoku Mathematical Journal, 68 :357-367. | MR | Zbl

[6] Baños A.(1968). On pseudo-games. Annals of Mathematical Statistics, 39 :1932-1945. | MR | Zbl

[7] Berry D.A. et Fristedt B.(1985). Bandit Problems. Chapman and Hall. | MR

[8] Blackwell D.(1956). An analog of the minimax theorem for vector payoffs. Pacific Journal of Mathematics, 6 :1-8. | MR | Zbl

[9] Blackwell D.(1956). Controlled random walks. In Proceedings of the International Congress of Mathematicians, 1954, volume III, pages 336-338. North-Holland. | MR | Zbl

[10] Bousquet O. et Warmuth M.(2002). Tracking a small set of experts by mixing past posteriors. Journal of Machine Learning Research, 3 :363-396. | MR | Zbl

[11] Cesa-Bianchi N.(1999). Analysis of two gradient-based algorithms for on-line regression. Journal of Computer and System Sciences, 59(3) :392-411. | MR | Zbl

[12] Cesa-Bianchi N., Freund Y., Haussler D., Helmbold D.P., Schapire R. et Warmuth M.(1997). How to use expert advice. Journal of the ACM, 44(3) :427-485. | MR | Zbl

[13] Cesa-Bianchi N. et Lugosi G.(1999). On prediction of individual sequences. Annals of Statistics, 27 :1865-1895. | MR | Zbl

[14] Cesa-Bianchi N. et Lugosi G.(2003). Potential-based algorithms in on-line prediction and game theory. Machine Learning, 51 :239-261. | Zbl

[15] Cesa-Bianchi N. et Lugosi G.(2006). Prediction, Learning, and Games. Cambridge University Press, New York. | MR | Zbl

[16] Cesa-Bianchi N., Lugosi G. et Stoltz G.(2004). Minimizing regret with label efficient prediction. IEEE Transactions on Information Theory, 51 :2152-2162. | MR | Zbl

[17] Cesa-Bianchi N., Lugosi G. et Stoltz G.(2004). Regret minimization under partial monitoring. Prépublication DMA-04-18, Ecole normale supérieure, Paris.

[18] Cover T.(1965). Behavior of sequential predictors of binary sequences. In Proceedings of the 4th Prague Conference on Information Theory, Statistical Decision Functions, Random Processes, pages 263-272. Maison d'édition de l'Académie des sciences de Tchécoslovaquie, Prague. | MR

[19] Feder M., Merhav N. et Gutman M.(1992). Universal prediction of individual sequences. IEEE Transactions on Information Theory, 38 :1258-127. | MR | Zbl

[20] Foster D.(1991). Prediction in the worst-case. Annals of Statistics, 19 :1084-1090. | MR | Zbl

[21] Foster D. et Vohra R.(1998). Asymptotic calibration. Biometrika, 85 :379-390. | MR | Zbl

[22] Foster D. et Vohra R.(1999). Regret in the on-line decision problem. Games and Economic Behavior, 29 :7-36. | MR | Zbl

[23] Freedman D.A.(1975). On tail probabilities for martingales. Annals of Probability, 3 :100-118. | MR | Zbl

[24] Fudenberg D. et Levine D.K.(1998). The Theory of Learning in Games. MIT Press. | MR | Zbl

[25] Gittins J.C.(1989). Multi-Armed Bandit Allocation Indices. Wiley-lnterscience series in Systems and Optimization. Wiley. | MR | Zbl

[26] György A., Linder T. et Lugosi G.(2004). Efficient algorithms and minimax bounds for zero-delay lossy source coding. IEEE Transactions on Signal Processing, 52 :2337-2347.

[27] György A., Linder T. et Lugosi G.(2004). A "follow the perturbed leader"-type algorithm for zero-delay quantization of individual sequences. In Data Compression Conference, pages 342-351.

[28] György A., Linder T. et Lugosi G.(2005). Tracking the best of many experts. In Proceedings of the 18th Annual Conference on Learning Theory, pages 204-216. | Zbl

[29] György A., Linder T., Lugosi G. et Ottucsâk Gy.(2005). The on-line shortest path bandit problem. Manuscrit.

[30] Hannan G.(1957). Approximation to Bayes risk in repeated play. Contributions to the theory of games, 3 :97-139. | Zbl

[31] Hart S. et Mas-Colell A.(2000). A simple adaptive procedure leading to correlated equilibrium. Econometrica, 68 :1127-1150. | MR | Zbl

[32] Hart S. et Mas-Colell A.(2001) A general class of adaptive strategies. Journal of Economic Theory, 98 :26-54. | MR | Zbl

[33] Hart S. et Mas-Colell A.(2002). A reinforcement procedure leading to correlated equilibrium. In Economic Essays : A Festschrift for Werner Hildenbrand, pages 181-200. Springer. | MR | Zbl

[34] Haussler D., Kivinen J. et Warmuth M.(1998). Sequential prediction of individual sequences under general loss functions. IEEE Transactions on Information Theory, 44 :1906-1925, 1998. | MR | Zbl

[35] Helmbold D.P., Littlestone N. et Long P.M.(2000). Apple tasting. Information and Computation, 161(2) :85-139. | MR | Zbl

[36] Helmbold D.P. et Panizza S.(1997). Some label efficient learning results. In Proceedings of the 10th Annual Conference on Computational Learning Theory, pages 218-230. ACM Press.

[37] Helmbold D.P. et Schapire R.(1997). Predicting nearly as well as the best pruning of a decision tree. Machine Learning, 27(1) :51-68.

[38] Helmbold D.P. et Warmuth M.(1998). Tracking the best expert. Machine Learning, 32(2) :151-178. | Zbl

[39] Hoeffding W.(1963). Probability inequalities for sums of bounded random variables. Journal ofthe American Statistical Association, 58 :13-30. | MR | Zbl

[40] Hutter M. et Poland J.(2004). Prediction with expert advice by following the perturbed and penalized leader. Prépublication IDSIA-20-04, Istituto Dalle Molle di Studi sull'Tntelligenza Artificiale, Suisse.

[41] Kalai A. et Vempala S.(2003). Efficient algorithms for the online decision problem. In Proceedings of the 16th Annual Conference on Learning Theory, pages 26-40. Springer.

[42] Lai T.L. et Robbins H.(1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6 :4-22. | MR | Zbl

[43] Lempel A. et Ziv J.(1976). On the complexity of an individual sequence. IEEE Transactions on Information Theory, 22 :75-81. | MR | Zbl

[44] Littlestone N. et Warmuth M.(1994). The weighted majority algorithm. Information and Computation, 108 :212-261. | MR | Zbl

[45] Mannor S. et Shimkin N.(2003). On-line learning with imperfect monitoring. In Proceedings of the 16th Annual Conference on Learning Theory, pages 552-567. Springer. | Zbl

[46] Mcmahan H. B. et Blum A.(2004). Online geometric optimization in the bandit setting against an adaptive adversary. In Proceedings of the 17th Annual Conference on Learning Theory, pages 109-123. Springer, 2004. | MR | Zbl

[47] Megiddo N.(1980). On repeated games with incomplete information played by non-Bayesian players. International Journal of Game Theory, 9 :157-167. | MR | Zbl

[48] Merhav N. et Feder M.(1993). Universal schemes for sequential decision from individual data sequences. IEEE Transactions on Information Theory, 39.1280-1292. | MR | Zbl

[49] Merhav N. et Feder M.(1998). Universal prediction. IEEE Transactions on Information Theory, 44 :2124-2147. | MR | Zbl

[50] Mertens J.-F., Sorin S. et Zamir S.(1994). Repeated games. core Discussion paper, numéros 9420, 9421, 9422, Louvain-la-Neuve, Belgique.

[51] Neveu J.(1975). Discrete Parameter Martingales. North-Holland. | MR

[52] Piccolboni A. et Schindelhauer C.(2001). Discrete prediction games with arbitrary feedback and loss. In Proceedings of the 14th Annual Conference on Computational Learning Theory, pages 208-223. | MR | Zbl

[53] Robbins H.(1952). Some aspects of the sequential design of experiments. Bullettin of the American Mathematical Society, 55 :527-535. | MR | Zbl

[54] Rustichini A.(1999). Minimizing regret : The general case. Games and Economic Behavior, 29 :224-243. | MR | Zbl

[55] Stoltz G.(2005). Information incomplète et regret interne en prédiction de suites individuelles. Thèse de doctorat, Université Paris-Sud, Orsay.

[56] Takimoto E. et Warmuth M.(2004). Path kernels and multiplicative updates. Journal of Machine Learning Research, 4 :773-818. | MR | Zbl

[57] Vovk V.(1990). Aggregating strategies. In Proceedings of the 3rd Annual Workshop on Computational Learning Theory, pages 372-383.

[58] Vovk V.(1998). A game of prediction with expert advice. Journal of Computer and System Sciences, 56 :153-173. | MR | Zbl

[59] Vovk V.(1999). Derandomizing stochastic prediction strategies. Machine Learning, 35 :247-282. | Zbl

[60] Vovk V.(2001). Competitive on-line statistics. International Statistical Review, 69 :213-248. | Zbl

[61] Weissman T. et Merhav N.(2001). Universal prediction of binary individual sequences in the presence of noise. IEEE Transactions on Information Theory, 47 :2151-2173. | MR | Zbl

[62] Weissman T., Merhav N. et Somekh-Baruch (2001). Twofold universal prediction schemes for achieving the finite state predictability of a noisy individual binary sequence. IEEE Transactions on Information Theory, 47 :1849-1866. | MR | Zbl

[63] Ziv J.(1978). Coding theorems for individual sequences. IEEE Transactions on Information Theory, 24 :405-412. | MR | Zbl

[64] Ziv J.(1980). Distortion-rate theory for individual sequences. IEEE Transactions on Information Theory, 26 :137-143. | MR | Zbl

[65] Ziv J. et Lempel A.(1977). A universal algorithm for sequential data-compression. IEEE Transactions on Information Theory, 23 :337-343. | MR | Zbl