Closed-form Bayesian inference of graphical model structures by averaging over trees
Journal de la société française de statistique, Tome 160 (2019) no. 2, pp. 1-23.

Nous nous intéressons à l’inférence de la structure d’un modèle graphique non orienté dans une cadre bayésien. Pour éviter de recourir à des méthodes de Monte-Carlo coûteuses et aux problèmes de convergence associés, nous nous concentrons sur des méthodes exactes. Plus précisément, nous menons l’inférence au moyen de lois a posteriori explicites, évitant ainsi toute étape d’échantillonnage. Dans ce but, nous restreignons l’espace des graphes à des mélanges d’arbres recouvrants. Nous étudions sous quelles condition sur les lois a priori – à la fois sur les arbres et sur les paramètres – une inférence bayésienne exacte peut être menée. Dans ce cadre, nous proposons un algorithme exact et rapide permettant de calculer la probabilité a posteriori pour qu’une arête appartienne au graphe, en utilisant un résultat algébrique connu sous le nom de théorème Arbre-Matrice. Nous montrons que la restriction aux arbres n’empêche pas d’obtenir de bons résultats aussi bien sur des données simulées que sur des données issues de cytométrie de flux.

We consider the inference of the structure of an undirected graphical model in a Bayesian framework. To avoid convergence issues and highly demanding Monte Carlo sampling, we focus on exact inference. More specifically we aim at achieving the inference with closed-form posteriors, avoiding any sampling step. To this aim, we restrict the set of considered graphs to mixtures of spanning trees. We investigate under which conditions on the priors – on both tree structures and parameters – closed-form Bayesian inference can be achieved. Under these conditions, we derive a fast an exact algorithm to compute the posterior probability for an edge to belong to the tree model using an algebraic result called the Matrix-Tree theorem. We show that the assumption we have made does not prevent our approach to perform well on synthetic and flow cytometry data.

Mots clés : arbres recouvrants, hyper-Markov, modèles graphiques, théorème arbre-matrice
@article{JSFS_2019__160_2_1_0,
     author = {Schwaller, Lo{\"\i}c and Robin, St\'ephane and Stumpf, Michael},
     title = {Closed-form {Bayesian} inference of graphical model structures by averaging over trees},
     journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
     pages = {1--23},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {160},
     number = {2},
     year = {2019},
     zbl = {1432.62059},
     mrnumber = {3987787},
     language = {en},
     url = {http://www.numdam.org/item/JSFS_2019__160_2_1_0/}
}
TY  - JOUR
AU  - Schwaller, Loïc
AU  - Robin, Stéphane
AU  - Stumpf, Michael
TI  - Closed-form Bayesian inference of graphical model structures by averaging over trees
JO  - Journal de la société française de statistique
PY  - 2019
DA  - 2019///
SP  - 1
EP  - 23
VL  - 160
IS  - 2
PB  - Société française de statistique
UR  - http://www.numdam.org/item/JSFS_2019__160_2_1_0/
UR  - https://zbmath.org/?q=an%3A1432.62059
UR  - https://www.ams.org/mathscinet-getitem?mr=3987787
LA  - en
ID  - JSFS_2019__160_2_1_0
ER  - 
Schwaller, Loïc; Robin, Stéphane; Stumpf, Michael. Closed-form Bayesian inference of graphical model structures by averaging over trees. Journal de la société française de statistique, Tome 160 (2019) no. 2, pp. 1-23. http://www.numdam.org/item/JSFS_2019__160_2_1_0/

[Atay-Kayis and Massam, 2005] Atay-Kayis, A. and Massam, H. (2005). A Monte Carlo method to compute the marginal likelihood in non decomposable graphical Gaussian models. Biometrika, 92:317–335. | MR 2201362 | Zbl 1094.62028

[Burger and Van Nimwegen, 2010] Burger, L. and Van Nimwegen, E. (2010). Disentangling direct from indirect co-evolution of residues in protein alignments. PLoS Computational Biology, 6(1). | MR 2601389

[Byrne and Dawid, 2015] Byrne, S. and Dawid, A. P. (2015). Structural markov graph laws for Bayesian model uncertainty. Ann. Statist., 43(4):1647–1681. | MR 3357874

[Chaiken, 1982] Chaiken, S. (1982). A Combinatorial Proof of the All Minors Matrix Tree Theorem. SIAM Journal on Algebraic Discrete Methods, 3(3):319–329. | MR 666857 | Zbl 0495.05018

[Chow and Liu, 1968] Chow, C. and Liu, C. (1968). Approximating Discrete Probability Distributions with Dependence Trees. IEEE Transactions on Information Theory, IT-14(3):462–467. | Zbl 0165.22305

[Dawid and Lauritzen, 1993] Dawid, A. P. and Lauritzen, S. L. (1993). Hyper Markov Laws in the Statistical Analysis of Decomposable Graphical Models. The Annals of Statistics, 21(3):1272–1317. | MR 1241267 | Zbl 0815.62038

[Friedman and Koller, 2003] Friedman, N. and Koller, D. (2003). Being Bayesian about network structure. A Bayesian approach to structure discovery in Bayesian networks. Machine Learning, 50:95–125. | Zbl 1033.68104

[Geiger and Heckerman, 2002] Geiger, D. and Heckerman, D. (2002). Parameter priors for directed acyclic graphical models and the characterization of several probability distributions. The Annals of Statistics, 30(5):1412–1440. | MR 1936324 | Zbl 1016.62064

[Green and Thomas, 2013] Green, P. J. and Thomas, A. (2013). Sampling decomposable graphs using a markov chain on junction trees. Biometrika, 100(1):91–110. | MR 3034326 | Zbl 1284.62172

[Hammersley and Clifford, 1971] Hammersley, J. M. and Clifford, P. (1971). Markov field on finite graphs and lattices.

[Heckerman and Chickering, 1995] Heckerman, D. and Chickering, D. M. (1995). Learning Bayesian networks: The combination of knowledge and statistical data. In Machine Learning, pages 20–197. | Zbl 0831.68096

[Kirshner, 2007] Kirshner, S. (2007). Learning with Tree-Averaged Densities and Distributions. Advances in Neural Information Processing Systems 2008, 20:761–768.

[Kuipers et al., 2014] Kuipers, J., Moffa, G., and Heckerman, D. (2014). Addendum on the scoring of gaussian directed acyclic graphical models. Ann. Statist., 42(4):1689–1691. | MR 3262465 | Zbl 1297.62122

[Lauritzen, 1996] Lauritzen, S. L. (1996). Graphical Models. Oxford University Press. | MR 1419991

[Lin et al., 2009] Lin, Y., Zhu, S., Leet, D. D., and Taskar, B. (2009). Learning Sparse Markov Network Structure via Ensemble-of-Trees Models. In 12th International Conference on Artificial Intelligence and Statistics (AISTATS) 2009, volume 5, pages 360–367.

[Madigan et al., 1995] Madigan, D., York, J., and Allard, D. (1995). Bayesian graphical models for discrete data. International Statistical Review, 63(2):215–232. | Zbl 0834.62003

[Meilă, 1999] Meilă, M. (1999). Learning with Mixtures of Trees. PhD thesis, Massachusetts Institute of Technology. | MR 2716859

[Meilă and Jaakkola, 2006] Meilă, M. and Jaakkola, T. (2006). Tractable Bayesian learning of tree belief networks. Statistics and Computing, 16(1):77–92. | MR 2224191

[Meilă and Jordan, 2001] Meilă, M. and Jordan, M. I. (2001). Learning with Mixtures of Trees. The Journal of Machine Learning Research, 1:1–48. | MR 1882241 | Zbl 1052.68116

[Nelsen, 2006] Nelsen, R. B. (2006). An Introduction to Copulas. Springer Series in Statistics. | MR 2197664 | Zbl 1152.62030

[Niinimäki et al., 2016] Niinimäki, T., Parviainen, P., and Koivisto, M. (2016). Structure Discovery in Bayesian Networks by Sampling Partial Orders. Journal of Machine Learning Research, 17(57):1–47. | MR 3504617

[Parviainen and Koivisto, 2009] Parviainen, P. and Koivisto, M. (2009). Exact Structure Discovery in Bayesian Networks with Less Space. Uai, pages 436–443.

[Prim, 1957] Prim, R. C. (1957). Shortest Connection Networks And Some Generalizations. Bell System Technical Journal, 36(6):1389–1401.

[Roverato, 2002] Roverato, A. (2002). Hyper inverse wishart distribution for non-decomposable graphs and its application to Bayesian inference for gaussian graphical models. Scandinavian Journal of Statistics, 29(3):391–411. | MR 1925566 | Zbl 1036.62027

[Sachs et al., 2005] Sachs, K., Perez, O., Pe’er, D., Lauffenburger, D. A., and Nolan, G. P. (2005). Causal protein-signaling networks derived from multiparameter single-cell data. Science (New York, N.Y.), 308:523–529.

[Werhli et al., 2006] Werhli, A. V., Grzegorczyk, M., and Husmeier, D. (2006). Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical gaussian models and bayesian networks. Bioinformatics (Oxford, England), 22(20):2523–31.

[York and Madigan, 1992] York, J. C. and Madigan, D. (1992). Bayesian methods for estimating the size of a closed population. Technical Report 234, Department of Statistics, University of Washington, Seattle.