Bayesian selection of mixed covariates from a latent layer: application to hierarchical modeling of soil carbon dynamics
[Sélection bayésienne de covariables mixtes sur la couche latente d’un modèle hiérarchique : application à la dynamique de carbone dans le sol]
Journal de la société française de statistique, Tome 159 (2018) no. 2, pp. 128-155.

Le carbone du sol est important non seulement pour assurer la sécurité alimentaire en maintenant la fertilité des sols, mais aussi pour limiter le réchauffement climatique en augmentant la séquestration du carbone dans le sol. Il est urgent de comprendre la réaction du carbone du sol face au réchauffement climatique et au changement des pratiques agricoles. Des modèles bio-physiques ont été développés depuis quelques décennies pour étudier la matière organique du sol (SOM). Cependant, il existe encore une forte incertitude sur les mécanismes contrôlant la dynamique de la SOM, du niveau microbien aux échelles globales. Dans cet article, nous proposons une approche statistique bayésienne de sélection de variables pour mieux cerner la dynamique du carbone du sol en examinant la variation en profondeur du radiocarbone pour 159 profils sous différentes conditions de climat (température, précipitations, ...) et d’environnement (type de sol, type d’usage du sol, ...). La recherche stochastique de sélection de variables (SSVS) est appliquée au niveau des variables latentes d’un modèle bayésien hiérarchique. Ce modèle décrit la variation du radiocarbone en fonction de la profondeur et en tenant compte des covariables explicatives potentielles tels que les facteurs climatiques et environnementaux. Cette approche nous permet d’avoir un jugement probabiliste sur la contribution conjointe du type de sol, du climat et de l’usage du sol à la dynamique verticale du carbone dans le sol. Nous discutons également de la performance pratique et des limitations de SSVS en présence de covariables catégorielles et de la colinéarité entre certaines covariables quand elles interviennent au niveau d’une couche latente d’un modèle bayésien hiérarchique.

Soil carbon matters both for food security, because it sustains soil fertility, and for the potential mitigation of global warming through increased carbon sequestration in soils. There is an urgent need to understand how the soil carbon pool responds to climate change and to changes in agricultural practices. Biophysical models of soil organic matter (SOM) have been developed over the past few decades. However, considerable uncertainty remains about the mechanisms that control SOM dynamics, from the microbial level to global scales. In this paper, we propose a Bayesian variable selection approach to identify which forcing conditions influence soil carbon dynamics, by examining the depth distribution of radiocarbon content in 159 profiles under different conditions of climate (temperature, precipitation, etc.) and environment (soil type, land use, etc.). Stochastic Search Variable Selection (SSVS) is applied to the latent variables of a hierarchical Bayesian model. The model describes the variation of radiocarbon content as a function of depth, taking into account potential explanatory covariates such as climatic and environmental factors. SSVS provides a probabilistic judgment about the joint contribution of soil type, climate and land use to the vertical dynamics of soil carbon. We also discuss the practical performance and limitations of SSVS in the presence of categorical covariates and of collinearity between covariates when these enter a latent layer of the hierarchical model.
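
For readers unfamiliar with SSVS, the spike and slab prior named in the keywords can be sketched as follows. This is the generic formulation usually attributed to George and McCulloch (1993), written for a single regression coefficient of the latent layer; the symbols $\beta_j$, $\gamma_j$, $\tau_j$, $c_j$ and $p_j$ are notation assumed here for illustration, and the exact parametrization used in the article, in particular for categorical covariates, may differ.

```latex
% Generic SSVS prior for coefficient \beta_j of covariate j (assumed notation).
% \gamma_j = 1: covariate j is selected (wide "slab"); \gamma_j = 0: excluded (narrow "spike").
\begin{align*}
  \gamma_j &\sim \mathrm{Bernoulli}(p_j), \\
  \beta_j \mid \gamma_j &\sim (1-\gamma_j)\,\mathcal{N}\!\left(0,\ \tau_j^2\right)
      + \gamma_j\,\mathcal{N}\!\left(0,\ c_j^2\tau_j^2\right), \qquad c_j > 1 .
\end{align*}
% Gibbs update of the indicator given the current value of \beta_j,
% with \phi(\cdot\,;\,0,\,s^2) the density of \mathcal{N}(0, s^2):
\begin{equation*}
  P(\gamma_j = 1 \mid \beta_j)
  = \frac{p_j\,\phi\!\left(\beta_j;\,0,\,c_j^2\tau_j^2\right)}
         {p_j\,\phi\!\left(\beta_j;\,0,\,c_j^2\tau_j^2\right)
          + (1-p_j)\,\phi\!\left(\beta_j;\,0,\,\tau_j^2\right)} .
\end{equation*}
```

Under this sketch, the fraction of MCMC iterations in which $\gamma_j = 1$ estimates the posterior inclusion probability of covariate $j$, which is the kind of probabilistic judgment the abstract refers to.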

Keywords: Bayesian selection approach, SSVS, spike and slab prior, hierarchical Bayesian model, latent variables, organic carbon dynamics, radiocarbon
Mots-clés : méthode bayésienne de sélection de variables, recherche stochastique de sélection de covariables, modèle hiérarchique bayésien, variables latentes, dynamique du carbone organique, radiocarbone
@article{JSFS_2018__159_2_128_0,
     author = {Jreich, Rana and Hatt\'e, Christine and Balesdent, J\'er\^ome and Parent, \'Eric},
     title = {Bayesian selection of mixed covariates from a latent layer: application to hierarchical modeling of soil carbon dynamics},
     journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
     pages = {128--155},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {159},
     number = {2},
     year = {2018},
     mrnumber = {3855904},
     zbl = {1402.86019},
     language = {en},
     url = {http://www.numdam.org/item/JSFS_2018__159_2_128_0/}
}
Jreich, Rana; Hatté, Christine; Balesdent, Jérôme; Parent, Éric. Bayesian selection of mixed covariates from a latent layer: application to hierarchical modeling of soil carbon dynamics. Journal de la société française de statistique, Tome 159 (2018) no. 2, pp. 128-155. http://www.numdam.org/item/JSFS_2018__159_2_128_0/
