Special issue: mixture analysis
Exact integrated completed likelihood maximisation in a stochastic block transition model for dynamic networks
[Maximisation d’un critère exact de classification pour un modèle des blocs latents pour les réseaux dynamiques]
Journal de la société française de statistique, Tome 160 (2019) no. 1, pp. 35-56.

The latent block model is a widely used and very flexible statistical model. Extensions of this model to the analysis of dynamic networks cannot capture the persistence of edges across contiguous time points. The latent block model with transitions addresses this issue by modelling the propensity to create and to maintain edges over time. We present here a Bayesian extension of this model and a new methodology for clustering the nodes. The method relies on an optimisation procedure that maximises an exact classification criterion. The algorithm is very efficient, which makes the methodology suitable for the analysis of large network datasets. Moreover, the algorithm selects the optimal number of latent groups at no additional cost. The effectiveness of the method is demonstrated through applications to artificial and real datasets.

The latent stochastic block model is a flexible and widely used statistical model for the analysis of network data. Extensions of this model to a dynamic context often fail to capture the persistence of edges in contiguous network snapshots. The recently introduced stochastic block transition model addresses precisely this issue, by modelling the probabilities of creating a new edge and of maintaining an edge over time. Using a model-based clustering approach, this paper illustrates a methodology to fit stochastic block transition models under a Bayesian framework. The method relies on a greedy optimisation procedure to maximise the exact integrated completed likelihood. The computational efficiency of the algorithm used makes the methodology scalable and appropriate for the analysis of large network datasets. Crucially, the optimal number of latent groups is automatically selected at no additional computing cost. The efficacy of the method is demonstrated through applications to both artificial and real datasets.
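As a rough illustration of the ideas in the abstract, the following Python sketch evaluates a collapsed (parameters integrated out) complete-data log-likelihood for a simplified transition model and improves a partition greedily. It is not the paper's algorithm. It assumes directed binary snapshots without self-loops, group labels that do not change over time, independent Beta(a, b) priors on the initial, edge-creation and edge-persistence probabilities of each block pair, and a symmetric Dirichlet(alpha) prior on group proportions. The names `exact_icl` and `greedy_icl` are hypothetical, and the criterion is recomputed from scratch at every move, which is simple but far from efficient.

```python
import math
import numpy as np

_lgamma = np.vectorize(math.lgamma)  # elementwise log-gamma without SciPy


def _log_beta(x, y):
    """Elementwise log of the Beta function."""
    return _lgamma(x) + _lgamma(y) - _lgamma(x + y)


def exact_icl(A, z, K, a=1.0, b=1.0, alpha=1.0):
    """Collapsed log p(A, z) under the simplifying assumptions stated above.

    A: array of shape (T, N, N) with 0/1 entries; z: labels in {0, ..., K-1}.
    """
    A = np.asarray(A, dtype=float)
    z = np.asarray(z, dtype=int)
    T, N, _ = A.shape
    off = 1.0 - np.eye(N)          # mask that ignores self-loops
    Z = np.eye(K)[z]               # N x K one-hot membership matrix
    prev, curr = A[:-1], A[1:]
    # (successes, failures) per ordered pair for three Bernoulli families
    families = [
        (A[0], 1 - A[0]),                                           # first snapshot
        (((1 - prev) * curr).sum(0), ((1 - prev) * (1 - curr)).sum(0)),  # creation
        ((prev * curr).sum(0), (prev * (1 - curr)).sum(0)),         # persistence
    ]
    icl = 0.0
    for succ, fail in families:
        s = Z.T @ (succ * off) @ Z  # K x K success counts per block pair
        f = Z.T @ (fail * off) @ Z  # K x K failure counts per block pair
        icl += float(np.sum(_log_beta(a + s, b + f) - _log_beta(a, b)))
    # Dirichlet-multinomial term for the labels
    n = Z.sum(0)
    icl += math.lgamma(K * alpha) - K * math.lgamma(alpha)
    icl += float(_lgamma(alpha + n).sum()) - math.lgamma(K * alpha + N)
    return icl


def greedy_icl(A, z, K, n_sweeps=10, seed=0):
    """Move one node at a time to the group that most increases exact_icl."""
    rng = np.random.default_rng(seed)
    z = np.array(z, dtype=int)
    best = exact_icl(A, z, K)
    for _ in range(n_sweeps):
        moved = False
        for i in rng.permutation(len(z)):
            current = z[i]
            for g in range(K):
                if g == current:
                    continue
                z[i] = g
                cand = exact_icl(A, z, K)
                if cand > best + 1e-10:
                    best, current, moved = cand, g, True
            z[i] = current          # keep the best label found for node i
        if not moved:               # no improving move left: local optimum
            break
    return z, best
```

In this sketch K acts as an upper bound: groups may become empty during the greedy pass, so the number of occupied groups in the returned partition can be smaller than K. The result is a local optimum and depends on the starting partition.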

Keywords: stochastic block transition model, dynamic networks, integrated completed likelihood, greedy algorithm, clustering
@article{JSFS_2019__160_1_35_0,
     author = {Rastelli, Riccardo},
     title = {Exact integrated completed likelihood maximisation in a stochastic block transition model for dynamic networks},
     journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
     pages = {35--56},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {160},
     number = {1},
     year = {2019},
     zbl = {1423.90041},
     mrnumber = {3928539},
     language = {en},
     url = {http://www.numdam.org/item/JSFS_2019__160_1_35_0/}
}
Rastelli, Riccardo. Exact integrated completed likelihood maximisation in a stochastic block transition model for dynamic networks. Journal de la société française de statistique, Tome 160 (2019) no. 1, pp. 35-56. http://www.numdam.org/item/JSFS_2019__160_1_35_0/
