Indices de Sobol généralisés aux variables dépendantes : tests de performance de l’algorithme HOGS couplé à plusieurs estimateurs paramétriques
Journal de la société française de statistique, Tome 158 (2017) no. 1, pp. 68-89.

The “Hierarchically Orthogonal Gram–Schmidt” (HOGS) algorithm (Chastaing et al., 2015) estimates generalized Sobol indices for models with dependent inputs, explicitly quantifying the part of the model sensitivity that is due to correlations. HOGS builds a meta-model for each variable of interest by projection onto a functional basis chosen to suit the computation of the indices. The projection coefficients are obtained with the ordinary least squares (OLS) estimator or with the penalized regression methods Lasso, Ridge and Elastic Net (EN). Four case studies are proposed: three toy models that make it possible to investigate how HOGS works and its numerical properties, and the LNAS (Log-Normal Allocation and Senescence) model, dedicated to the complex dynamics of plant growth. Several HOGS configurations and the accuracy of the meta-model are assessed by means of a consistency index. The interpretation of the Sobol indices is illustrated with LNAS. It appears that HOGS-OLS is the best-performing method when computational resources are not limiting. Otherwise, the question of parameter estimation under a sparsity assumption shows that: i) EN is more robust but computationally more costly than Lasso, ii) the basis constructed by HOGS is too large, which creates artificial sparsity. A modification of HOGS has been proposed to reduce the dimension of the basis.
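
As a rough illustration of the quantities discussed in the abstract, the sketch below fits an additive polynomial meta-model by OLS and splits each first-order generalized index into a structural part, Var(f_j)/Var(Y), and a correlative part, the sum of Cov(f_j, f_k) over k ≠ j divided by Var(Y), following the covariance-based definition used in the references on dependent inputs ([6], [20]). It is not the HOGS procedure: there are no interaction terms and no Gram–Schmidt orthogonalization step, and the function name `generalized_sobol_additive`, the monomial basis and the toy model are assumptions made for this example only.

```python
import numpy as np


def generalized_sobol_additive(x, y, degree=3):
    """First-order generalized Sobol indices from an additive OLS meta-model.

    Simplified illustration only: one polynomial component per input,
    no interaction terms, no hierarchical orthogonalization.
    """
    n_samples, n_inputs = x.shape

    # One block of centered monomials (x_j, x_j^2, ...) per input.
    blocks = []
    for j in range(n_inputs):
        powers = np.column_stack([x[:, j] ** d for d in range(1, degree + 1)])
        blocks.append(powers - powers.mean(axis=0))
    design = np.hstack(blocks)

    # OLS estimate of the projection coefficients.
    coef, *_ = np.linalg.lstsq(design, y - y.mean(), rcond=None)

    # Evaluate each additive component f_j on the sample.
    components = np.column_stack(
        [blocks[j] @ coef[j * degree:(j + 1) * degree] for j in range(n_inputs)]
    )

    # Covariance matrix of the components (same normalization as y.var()).
    cov = np.atleast_2d(np.cov(components, rowvar=False, ddof=0))
    var_y = y.var()

    structural = np.diag(cov) / var_y                          # Var(f_j) / Var(Y)
    correlative = (cov.sum(axis=1) - np.diag(cov)) / var_y     # sum_k Cov(f_j, f_k) / Var(Y)
    return structural, correlative, structural + correlative


# Hypothetical toy model: Y = X1 + X2 with correlated Gaussian inputs.
rng = np.random.default_rng(0)
cov_x = np.array([[1.0, 0.5], [0.5, 1.0]])
x = rng.multivariate_normal(np.zeros(2), cov_x, size=20_000)
y = x[:, 0] + x[:, 1]

structural, correlative, total = generalized_sobol_additive(x, y)
print("structural :", structural)
print("correlative:", correlative)
print("total      :", total)
```

For this toy model Var(Y) = 3, so the population values are 1/3 for each structural part and 1/6 for each correlative part; the sample estimates should fall close to these. Swapping the `np.linalg.lstsq` call for a penalized solver would be the natural place to try the Lasso, Ridge or EN variants mentioned above.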

Keywords: sensitivity analysis, HDMR, HOGS, penalized regression
@article{JSFS_2017__158_1_68_0,
     author = {Sainte-Marie, Julien and Viaud, Gautier and Courn\`ede, Paul-Henry},
     title = {Indices de {Sobol} g\'en\'eralis\'es aux variables d\'ependantes~: tests de performance de l{\textquoteright}algorithme {HOGS} coupl\'e \`a plusieurs estimateurs param\'etriques},
     journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
     pages = {68--89},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {158},
     number = {1},
     year = {2017},
     zbl = {1361.49021},
     language = {fr},
     url = {http://www.numdam.org/item/JSFS_2017__158_1_68_0/}
}
Sainte-Marie, Julien; Viaud, Gautier; Cournède, Paul-Henry. Indices de Sobol généralisés aux variables dépendantes : tests de performance de l’algorithme HOGS couplé à plusieurs estimateurs paramétriques. Journal de la société française de statistique, Tome 158 (2017) no. 1, pp. 68-89. http://www.numdam.org/item/JSFS_2017__158_1_68_0/

[1] Bezanson, Jeff; Edelman, Alan; Karpinski, Stefan; Shah, Viral B. Julia: A Fresh Approach to Numerical Computing, CoRR, Volume abs/1411.1607 (2014) | Zbl 1356.68030

[2] Caniou, Yann Global sensitivity analysis for nested and multiscale modelling (2012) (Ph. D. Thesis)

[3] Chen, Yuting; Cournède, Paul-Henry Data assimilation to reduce uncertainty of crop model prediction with convolution particle filtering, Ecological Modelling, Volume 290 (2014), pp. 165-177

[4] Champion, Magali; Chastaing, Gaëlle; Gadat, Sebastien; Prieur, Clémentine L2-Boosting for sensitivity analysis with dependent inputs, Statistica Sinica, Volume 25 (2015) no. 4, pp. 1477-1502 | MR 3409077 | Zbl 1377.62156

[5] Cournède, P.-H.; Chen, Y.; Wu, Q.; Baey, C.; Bayol, B. Development and Evaluation of Plant Growth Models: Methodology and Implementation in the PyGMAlion platform, Mathematical Modelling of Natural Phenomena, Volume 8 (2013) no. 4, pp. 112-130 | MR 3110487 | Zbl 1281.92001

[6] Chastaing, Gaëlle; Gamboa, Fabrice; Prieur, Clémentine Generalized Hoeffding-Sobol decomposition for dependent variables - application to sensitivity analysis, Electronic Journal of Statistics, Volume 6 (2012), pp. 2420-2448 | MR 3020270 | Zbl 1334.62098

[7] Chastaing, G.; Gamboa, F.; Prieur, C. Generalized Sobol sensitivity indices for dependent variables: numerical methods, Journal of Statistical Computation and Simulation, Volume 85 (2015) no. 7, pp. 1306-1333 | MR 3306799 | Zbl 07183136

[8] Chastaing, G. Indices de Sobol généralisés pour variables dépendantes (2013) (Ph. D. Thesis)

[9] Friedman, J.; Hastie, T.; Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, Volume 33 (2010) no. 1, pp. 1-22 (February)

[10] Hoeffding, Wassily A Class of Statistics with Asymptotically Normal Distribution, The Annals of Mathematical Statistics, Volume 19 (1948) no. 3, pp. 293-325 | MR 26294 | Zbl 0032.04101

[11] Hooker, Giles Generalized Functional ANOVA Diagnostics for High-Dimensional Functions of Dependent Variables, Journal of Computational and Graphical Statistics, Volume 16 (2007) no. November, pp. 709-732 | MR 2351087

[12] Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome The elements of statistical learning: data mining, inference and prediction, Springer, 2009 | MR 2722294

[13] Kucherenko, S.; Tarantola, S.; Annoni, P. Estimation of global sensitivity indices for models with dependent variables, Computer Physics Communications, Volume 183 (2012) no. 4, pp. 937-946 | MR 2880927 | Zbl 1261.62062

[14] Letort, V.; Mahe, P.; Cournède, P.-H.; de Reffye, P.; Courtois, B. Quantitative Genetics and Functional-Structural Plant Growth Models: Simulation of Quantitative Trait Loci Detection for Model Parameters and Application to Potential Yield Optimization, Annals of Botany, Volume 101 (2008) no. 8, pp. 951-963

[15] Lecoeur, Jérémie; Poiré-Lassus, Richard; Christophe, Angélique; Pallas, Benoît; Casadebaig, Pierre; Debaeke, Philippe; Vear, Felicity; Guilioni, Lydie Quantifying physiological determinants of genetic variation for yield potential in sunflower. SUNFLO: a model-based analysis, Functional Plant Biology, Volume 38 (2011), pp. 246-259

[16] Li, G.; Rabitz, H. General formulation of HDMR component functions with independent and correlated variables, Journal of Mathematical Chemistry, Volume 50 (2012) no. 1, pp. 99-130 | MR 2873036 | Zbl 1320.62232

[17] Li, Genyuan; Rabitz, Herschel Analytical HDMR formulas for functions expressed as quadratic polynomials with a multivariate normal distribution, Journal of Mathematical Chemistry, Volume 52 (2014) no. 8, pp. 2052-2073 | MR 3240952 | Zbl 1395.62211

[18] Li, Genyuan; Rey-de-Castro, Roberto; Rabitz, Herschel D-MORPH regression for modeling with fewer unknown parameters than observation data, Journal of Mathematical Chemistry, Volume 50 (2012) no. 7, pp. 1747-1764 | MR 2950955 | Zbl 1314.62154

[19] Li, Genyuan; Rosenthal, Carey; Rabitz, Herschel High Dimensional Model Representations, The Journal of Physical Chemistry A, Volume 105 (2001) no. 33, pp. 7765-7777

[20] Li, Genyuan; Rabitz, Herschel; Yelvington, Paul E.; Oluwole, Oluwayemisi O.; Bacon, Fred; Kolb, Charles E.; Schoendorf, Jacqueline Global sensitivity analysis for systems with independent and/or correlated inputs, The Journal of Physical Chemistry A, Volume 114 (2010) no. 19, pp. 6022-6032

[21] Matsumoto, Makoto; Nishimura, Takuji Mersenne Twister: A 623-dimensionally Equidistributed Uniform Pseudo-random Number Generator, ACM Trans. Model. Comput. Simul., Volume 8 (1998) no. 1, pp. 3-30 | Zbl 0917.65005

[22] Sobol, I. Sensitivity analysis for non-linear mathematical models, Mathematical Modeling and Computational Experiment, Volume 1 (1993), pp. 407-414 | MR 1335161 | Zbl 1039.65505

[23] Saltelli, Andrea; Ratto, Marco; Andres, Terry; Campolongo, Francesca; Cariboni, Jessica; Gatelli, Debora; Saisana, Michaela; Tarantola, Stefano Global sensitivity analysis: the primer, John Wiley & Sons, 2008 | MR 2382923 | Zbl 1161.00304

[24] Stone, Charles J. The use of polynomial splines and their tensor products in multivariate function estimation, Ann. Statist., Volume 22 (1994) no. 1, pp. 118-184 | MR 1272079 | Zbl 0827.62038

[25] Tardieu, F. Virtual plants: modelling as a tool for the genomics of tolerance to water deficit, Trends in Plant Science, Volume 8 (2003) no. 1, pp. 9-14

[26] Zou, H.; Hastie, T. Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society, Series B, Volume 67 (2005), pp. 301-320 | MR 2137327 | Zbl 1069.62054