--- title: "Stochastic Block Models for Multiplex networks" subtitle: "War and Alliance case study" author: "team großBM" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true toc_depth: 4 bibliography: references.bib link-citations: yes vignette: > %\VignetteIndexEntry{Stochastic Block Models for Multiplex networks : application} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## Preliminaries This vignette illustrates the use of the `estimateMultiplexSBM` function and the methods accompanying the R6 classes `multiplexSBMfit` on the `war` data set. ### Requirements The packages required for the analysis are **sbm** plus some others for data manipulation and representation: ```{r setup, message=FALSE, warning=FALSE} library(sbm) library(igraph) library(aricode) ``` ### Data set The `war` data set comes in the `sbm` package: ```{r load-data-set} data("war") ``` This data set contains a list of two networks (`belligerent` and `alliance`) where the nodes are countries; an edge in the network `belligerent` means that the two countries have been at war at least once between years 1816 to 2007; an edge in network `alliance` means that the two countries have had a formal alliance between years 1816 and 2012. The network `belligerent` have less nodes since countries which have not been at war at all are not considered. These two networks were extracted from [https://correlatesofwar.org/](https://correlatesofwar.org/) (see @sarkees2010resort for war data, and @gibler2008international for formal alliance). Version 4.0 was used for war data and version 4.1 for formal alliance. # Data manipulation Since they don't have the same size, we choose to only consider nodes (countries) which were at war with at least one other country. This corresponds to the first 83 nodes in the Alliance network. ```{r manipulation} A <- as.matrix(get.adjacency(war$alliance)) A <- A[1:83, 1:83] B <- as.matrix(get.adjacency(war$belligerent)) ``` We can start with a plot of this multiplex network: ```{r} netA <- defineSBM(A, model = "bernoulli", dimLabels = "country") netB <- defineSBM(B, model = "bernoulli", dimLabels = "country") plotMyMultiplexMatrix(list(netA, netB)) ``` # Fitting a multiplex SBM model where the two layers are assumed to be independent We run the estimation of this multiplex model. By setting `dependent=FALSE`, we declare that we consider the two layers to be independent conditionally on the latent block variables. ```{r, echo = FALSE, results='hide'} MultiplexFitIndep <- readRDS("Multiplex_allianceNwar_case_study.rds") ``` ```{r, eval = FALSE} MultiplexFitIndep <- estimateMultiplexSBM(list(netA, netB), dependent = FALSE, estimOptions = list(verbosity = 0)) ``` We can retrieve the clustering ```{r} clust_country_indep <- MultiplexFitIndep$memberships[[1]] sort(clust_country_indep) ``` And we can plot the reorganized adjacency matrices or the corresponding expectations: ```{r} plot(MultiplexFitIndep) plot(MultiplexFitIndep, type = "expected") ``` # Fitting a multiplex SBM model where the two layers are assumed to be dependent Now we assume that the two layers are dependent conditionally to the latent block variables. We then set `dependent = TRUE` ```{r} MultiplexFitdep <- estimateMultiplexSBM(list(netA, netB), dependent = TRUE, estimOptions = list(verbosity = 0)) ``` We can retrieve the clustering and compare it to the one obtained in the independent case. ```{r} clust_country_dep <- MultiplexFitdep$memberships[[1]] sort(clust_country_indep) aricode::ARI(clust_country_indep, clust_country_dep) ``` On top of the clustering comparison, we can compare the ICL criterion to see which of the dependent or independent models is a best fit: ```{r} MultiplexFitdep$ICL MultiplexFitIndep$ICL ``` We can do the same plots for the reorganized matrices and the corresponding expectations. Note that the expectations correspond to the marginal expectation of each layer and it may be relevant to have a look at the conditional expectations. ```{r} plot(MultiplexFitdep) plot(MultiplexFitdep, type = "expected") ``` For instance, we may want to compare the marginal distribution for two countries in their given blocks to have being at war while they are allied at some point (before or after): ```{r} p11 <- MultiplexFitdep$connectParam$prob11 p01 <- MultiplexFitdep$connectParam$prob01 p10 <- MultiplexFitdep$connectParam$prob10 # conditional probabilities of being at war while having been or will # be allied round(p11/(p11 + p10), 2) # marginal probabilities of being at war round(p11 + p01, 2) ``` ## References