Agrégation séquentielle de prédicteurs : méthodologie générale et applications à la prévision de la qualité de l’air et à celle de la consommation électrique
[Sequential aggregation of predictors: General methodology and application to air-quality forecasting and to the prediction of electricity consumption]
Journal de la société française de statistique, Volume 151 (2010) no. 2, pp. 66-106.

This paper is an extended written version of the talk I delivered at the “XLe Journées de Statistique” in Ottawa, 2008, on being awarded the Marie-Jeanne Laurent-Duhamel prize. It surveys some fundamental results, as well as some more recent ones, in the field of sequential prediction of individual sequences with expert advice. It then applies the stated general methodology in two empirical studies: the first on air-quality forecasting and the second on the prediction of electricity consumption. Most results mentioned in the paper are based on joint works with Yannig Goude (EDF R&D) and Vivien Mallet (INRIA), together with some students whom we co-supervised for their M.Sc. theses: Marie Devaine, Sébastien Gerchinovitz and Boris Mauricette.

Cet article fait suite à la conférence que j’ai eu l’honneur de donner lors de la réception du prix Marie-Jeanne Laurent-Duhamel, dans le cadre des XLe Journées de Statistique à Ottawa, en 2008. Il passe en revue les résultats fondamentaux, ainsi que quelques résultats récents, en prévision séquentielle de suites arbitraires par agrégation d’experts. Il décline ensuite la méthodologie ainsi décrite sur deux jeux de données, l’un pour un problème de prévision de qualité de l’air, l’autre pour une question de prévision de consommation électrique. La plupart des résultats mentionnés dans cet article reposent sur des travaux en collaboration avec Yannig Goude (EDF R&D) et Vivien Mallet (INRIA), ainsi qu’avec les stagiaires de master que nous avons co-encadrés : Marie Devaine, Sébastien Gerchinovitz et Boris Mauricette.

Mots clés : Agrégation séquentielle, prévision avec experts, suites individuelles, prévision de la qualité de l’air, prévision de la consommation électrique
Keywords: Sequential aggregation of predictors, prediction with expert advice, individual sequences, air-quality forecasting, prediction of electricity consumption
@article{JSFS_2010__151_2_66_0,
     author = {Stoltz, Gilles},
     title = {Agr\'egation s\'equentielle de pr\'edicteurs~: m\'ethodologie g\'en\'erale et applications \`a la pr\'evision de la qualit\'e de l{\textquoteright}air et \`a celle de la consommation \'electrique},
     journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
     pages = {66--106},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {151},
     number = {2},
     year = {2010},
     mrnumber = {2741099},
     zbl = {1316.62169},
     language = {fr},
     url = {http://www.numdam.org/item/JSFS_2010__151_2_66_0/}
}
TY  - JOUR
AU  - Stoltz, Gilles
TI  - Agrégation séquentielle de prédicteurs : méthodologie générale et applications à la prévision de la qualité de l’air et à celle de la consommation électrique
JO  - Journal de la société française de statistique
PY  - 2010
SP  - 66
EP  - 106
VL  - 151
IS  - 2
PB  - Société française de statistique
UR  - http://www.numdam.org/item/JSFS_2010__151_2_66_0/
LA  - fr
ID  - JSFS_2010__151_2_66_0
ER  - 
%0 Journal Article
%A Stoltz, Gilles
%T Agrégation séquentielle de prédicteurs : méthodologie générale et applications à la prévision de la qualité de l’air et à celle de la consommation électrique
%J Journal de la société française de statistique
%D 2010
%P 66-106
%V 151
%N 2
%I Société française de statistique
%U http://www.numdam.org/item/JSFS_2010__151_2_66_0/
%G fr
%F JSFS_2010__151_2_66_0
Stoltz, Gilles. Agrégation séquentielle de prédicteurs : méthodologie générale et applications à la prévision de la qualité de l’air et à celle de la consommation électrique. Journal de la société française de statistique, Volume 151 (2010) no. 2, pp. 66-106. http://www.numdam.org/item/JSFS_2010__151_2_66_0/
