Closed-form Bayesian inference of graphical model structures by averaging over trees
Journal de la société française de statistique, Volume 160 (2019) no. 2, pp. 1-23.

We consider the inference of the structure of an undirected graphical model in a Bayesian framework. To avoid convergence issues and highly demanding Monte Carlo sampling, we focus on exact inference. More specifically we aim at achieving the inference with closed-form posteriors, avoiding any sampling step. To this aim, we restrict the set of considered graphs to mixtures of spanning trees. We investigate under which conditions on the priors – on both tree structures and parameters – closed-form Bayesian inference can be achieved. Under these conditions, we derive a fast an exact algorithm to compute the posterior probability for an edge to belong to the tree model using an algebraic result called the Matrix-Tree theorem. We show that the assumption we have made does not prevent our approach to perform well on synthetic and flow cytometry data.

Nous nous intéressons à l’inférence de la structure d’un modèle graphique non orienté dans une cadre bayésien. Pour éviter de recourir à des méthodes de Monte-Carlo coûteuses et aux problèmes de convergence associés, nous nous concentrons sur des méthodes exactes. Plus précisément, nous menons l’inférence au moyen de lois a posteriori explicites, évitant ainsi toute étape d’échantillonnage. Dans ce but, nous restreignons l’espace des graphes à des mélanges d’arbres recouvrants. Nous étudions sous quelles condition sur les lois a priori – à la fois sur les arbres et sur les paramètres – une inférence bayésienne exacte peut être menée. Dans ce cadre, nous proposons un algorithme exact et rapide permettant de calculer la probabilité a posteriori pour qu’une arête appartienne au graphe, en utilisant un résultat algébrique connu sous le nom de théorème Arbre-Matrice. Nous montrons que la restriction aux arbres n’empêche pas d’obtenir de bons résultats aussi bien sur des données simulées que sur des données issues de cytométrie de flux.

Keywords: graphical models, hyper Markov, matrix-tree theorem, spanning trees
Mots-clés : arbres recouvrants, hyper-Markov, modèles graphiques, théorème arbre-matrice
@article{JSFS_2019__160_2_1_0,
     author = {Schwaller, Lo{\"\i}c and Robin, St\'ephane and Stumpf, Michael},
     title = {Closed-form {Bayesian} inference of graphical model structures by averaging over trees},
     journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
     pages = {1--23},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {160},
     number = {2},
     year = {2019},
     mrnumber = {3987787},
     zbl = {1432.62059},
     language = {en},
     url = {http://www.numdam.org/item/JSFS_2019__160_2_1_0/}
}
TY  - JOUR
AU  - Schwaller, Loïc
AU  - Robin, Stéphane
AU  - Stumpf, Michael
TI  - Closed-form Bayesian inference of graphical model structures by averaging over trees
JO  - Journal de la société française de statistique
PY  - 2019
SP  - 1
EP  - 23
VL  - 160
IS  - 2
PB  - Société française de statistique
UR  - http://www.numdam.org/item/JSFS_2019__160_2_1_0/
LA  - en
ID  - JSFS_2019__160_2_1_0
ER  - 
%0 Journal Article
%A Schwaller, Loïc
%A Robin, Stéphane
%A Stumpf, Michael
%T Closed-form Bayesian inference of graphical model structures by averaging over trees
%J Journal de la société française de statistique
%D 2019
%P 1-23
%V 160
%N 2
%I Société française de statistique
%U http://www.numdam.org/item/JSFS_2019__160_2_1_0/
%G en
%F JSFS_2019__160_2_1_0
Schwaller, Loïc; Robin, Stéphane; Stumpf, Michael. Closed-form Bayesian inference of graphical model structures by averaging over trees. Journal de la société française de statistique, Volume 160 (2019) no. 2, pp. 1-23. http://www.numdam.org/item/JSFS_2019__160_2_1_0/

[Atay-Kayis and Massam, 2005] Atay-Kayis, A. and Massam, H. (2005). A Monte Carlo method to compute the marginal likelihood in non decomposable graphical Gaussian models. Biometrika, 92:317–335. | MR | Zbl

[Burger and Van Nimwegen, 2010] Burger, L. and Van Nimwegen, E. (2010). Disentangling direct from indirect co-evolution of residues in protein alignments. PLoS Computational Biology, 6(1). | MR

[Byrne and Dawid, 2015] Byrne, S. and Dawid, A. P. (2015). Structural markov graph laws for Bayesian model uncertainty. Ann. Statist., 43(4):1647–1681. | MR

[Chaiken, 1982] Chaiken, S. (1982). A Combinatorial Proof of the All Minors Matrix Tree Theorem. SIAM Journal on Algebraic Discrete Methods, 3(3):319–329. | MR | Zbl

[Chow and Liu, 1968] Chow, C. and Liu, C. (1968). Approximating Discrete Probability Distributions with Dependence Trees. IEEE Transactions on Information Theory, IT-14(3):462–467. | Zbl

[Dawid and Lauritzen, 1993] Dawid, A. P. and Lauritzen, S. L. (1993). Hyper Markov Laws in the Statistical Analysis of Decomposable Graphical Models. The Annals of Statistics, 21(3):1272–1317. | MR | Zbl

[Friedman and Koller, 2003] Friedman, N. and Koller, D. (2003). Being Bayesian about network structure. A Bayesian approach to structure discovery in Bayesian networks. Machine Learning, 50:95–125. | Zbl

[Geiger and Heckerman, 2002] Geiger, D. and Heckerman, D. (2002). Parameter priors for directed acyclic graphical models and the characterization of several probability distributions. The Annals of Statistics, 30(5):1412–1440. | MR | Zbl

[Green and Thomas, 2013] Green, P. J. and Thomas, A. (2013). Sampling decomposable graphs using a markov chain on junction trees. Biometrika, 100(1):91–110. | MR | Zbl

[Hammersley and Clifford, 1971] Hammersley, J. M. and Clifford, P. (1971). Markov field on finite graphs and lattices.

[Heckerman and Chickering, 1995] Heckerman, D. and Chickering, D. M. (1995). Learning Bayesian networks: The combination of knowledge and statistical data. In Machine Learning, pages 20–197. | Zbl

[Kirshner, 2007] Kirshner, S. (2007). Learning with Tree-Averaged Densities and Distributions. Advances in Neural Information Processing Systems 2008, 20:761–768.

[Kuipers et al., 2014] Kuipers, J., Moffa, G., and Heckerman, D. (2014). Addendum on the scoring of gaussian directed acyclic graphical models. Ann. Statist., 42(4):1689–1691. | MR | Zbl

[Lauritzen, 1996] Lauritzen, S. L. (1996). Graphical Models. Oxford University Press. | MR

[Lin et al., 2009] Lin, Y., Zhu, S., Leet, D. D., and Taskar, B. (2009). Learning Sparse Markov Network Structure via Ensemble-of-Trees Models. In 12th International Conference on Artificial Intelligence and Statistics (AISTATS) 2009, volume 5, pages 360–367.

[Madigan et al., 1995] Madigan, D., York, J., and Allard, D. (1995). Bayesian graphical models for discrete data. International Statistical Review, 63(2):215–232. | Zbl

[Meilă, 1999] Meilă, M. (1999). Learning with Mixtures of Trees. PhD thesis, Massachusetts Institute of Technology. | MR

[Meilă and Jaakkola, 2006] Meilă, M. and Jaakkola, T. (2006). Tractable Bayesian learning of tree belief networks. Statistics and Computing, 16(1):77–92. | MR

[Meilă and Jordan, 2001] Meilă, M. and Jordan, M. I. (2001). Learning with Mixtures of Trees. The Journal of Machine Learning Research, 1:1–48. | MR | Zbl

[Nelsen, 2006] Nelsen, R. B. (2006). An Introduction to Copulas. Springer Series in Statistics. | MR | Zbl

[Niinimäki et al., 2016] Niinimäki, T., Parviainen, P., and Koivisto, M. (2016). Structure Discovery in Bayesian Networks by Sampling Partial Orders. Journal of Machine Learning Research, 17(57):1–47. | MR

[Parviainen and Koivisto, 2009] Parviainen, P. and Koivisto, M. (2009). Exact Structure Discovery in Bayesian Networks with Less Space. Uai, pages 436–443.

[Prim, 1957] Prim, R. C. (1957). Shortest Connection Networks And Some Generalizations. Bell System Technical Journal, 36(6):1389–1401.

[Roverato, 2002] Roverato, A. (2002). Hyper inverse wishart distribution for non-decomposable graphs and its application to Bayesian inference for gaussian graphical models. Scandinavian Journal of Statistics, 29(3):391–411. | MR | Zbl

[Sachs et al., 2005] Sachs, K., Perez, O., Pe’er, D., Lauffenburger, D. A., and Nolan, G. P. (2005). Causal protein-signaling networks derived from multiparameter single-cell data. Science (New York, N.Y.), 308:523–529.

[Werhli et al., 2006] Werhli, A. V., Grzegorczyk, M., and Husmeier, D. (2006). Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical gaussian models and bayesian networks. Bioinformatics (Oxford, England), 22(20):2523–31.

[York and Madigan, 1992] York, J. C. and Madigan, D. (1992). Bayesian methods for estimating the size of a closed population. Technical Report 234, Department of Statistics, University of Washington, Seattle.