For this lab, we’ll use a few different packages for data and analysis. We’re going to use the ison_m182
dataset from last week, now directly from the package.
suppressPackageStartupMessages(library(migraph)) # note that you may need a special version for what follows...
data("ison_m182", package = "migraph")
The network is anonymous, but I think it would be nice to add some names, even if it’s just pretend. Luckily, I’ve added a function for this. This makes plotting the network just a wee bit more accessible:
<- to_named(ison_m182)
ison_m182 autographr(ison_m182)
There are actually three different types of tie here. Let’s separate them out into separate networks.
<- to_uniplex(ison_m182, "friend_tie"))
(m182_friend #> # A tbl_graph: 16 nodes and 62 edges
#> #
#> # A directed simple graph with 3 components
#> #
#> # Node Data: 16 × 1 (active)
#> name
#> <chr>
#> 1 Lucille
#> 2 Kelsey
#> 3 Melody
#> 4 Christopher
#> 5 Angelica
#> 6 Hadley
#> # … with 10 more rows
#> #
#> # Edge Data: 62 × 3
#> from to friend_tie
#> <int> <int> <dbl>
#> 1 2 1 1
#> 2 2 7 1
#> 3 2 8 1
#> # … with 59 more rows
<- autographr(m182_friend) + ggtitle("Friendship")
gfriend <- to_uniplex(ison_m182, "social_tie"))
(m182_social #> # A tbl_graph: 16 nodes and 129 edges
#> #
#> # A directed simple graph with 1 component
#> #
#> # Node Data: 16 × 1 (active)
#> name
#> <chr>
#> 1 Lucille
#> 2 Kelsey
#> 3 Melody
#> 4 Christopher
#> 5 Angelica
#> 6 Hadley
#> # … with 10 more rows
#> #
#> # Edge Data: 129 × 3
#> from to social_tie
#> <int> <int> <dbl>
#> 1 1 5 1.2
#> 2 1 8 0.15
#> 3 1 9 2.85
#> # … with 126 more rows
<- autographr(m182_social) + ggtitle("Social")
gsocial <- to_uniplex(ison_m182, "task_tie"))
(m182_task #> # A tbl_graph: 16 nodes and 88 edges
#> #
#> # A directed simple graph with 1 component
#> #
#> # Node Data: 16 × 1 (active)
#> name
#> <chr>
#> 1 Lucille
#> 2 Kelsey
#> 3 Melody
#> 4 Christopher
#> 5 Angelica
#> 6 Hadley
#> # … with 10 more rows
#> #
#> # Edge Data: 88 × 3
#> from to task_tie
#> <int> <int> <dbl>
#> 1 1 5 0.3
#> 2 1 9 0.3
#> 3 1 10 0.3
#> # … with 85 more rows
<- autographr(m182_task) + ggtitle("Task")
gtask grid.arrange(gfriend, gsocial, gtask, ncol = 3)
Where could innovation occur in these networks? Let’s take a look at which actors are least constrained by their position in the task network to begin with. {migraph}
makes this easy enough with the constraint()
function.
node_constraint(m182_task)
#> Lucille Kelsey Melody Christopher Angelica Hadley
#> 0.4158571 0.4572800 0.5627214 0.6109751 0.3712394 0.4059926
#> Jessica Henry Wallace Bentley Carolyn Marlene
#> 0.5432901 0.5173988 0.5110928 0.3814114 0.5627214 0.5244070
#> Chandler Caitlyn Monique Alberta
#> 0.4763565 0.4089266 0.4514752 0.2159461
We see that this function returns a vector of constraint scores that may range between 0 and 1. Let’s size the nodes according to this score, and identify the node with the minimum constraint score. Why minimum? and what can we learn from this plot about where innovation might occur within this network?
ggidentify(m182_task, node_constraint, min)
#> Using `stress` as default layout
We’ll use the “task” and “social” sub-graphs together as the basis for structural equivalence. Before we can see how similar nodes are, we’ll have to convert them to adjacency matrices.
dim(node_tie_census(ison_m182))
#> [1] 96 16
<- node_tie_census(ison_m182) structural_combo
The result is a weighted matrix. What would you do if you wanted it to be binary?
First of all, we need to cluster nodes by their structural equivalence. In summary, we’re going to hierarchically cluster the nodes based on the distances in dissimilarity in their out- and ingoing ties. Or you can just run the following line:
<- cluster_structural_equivalence(structural_combo))
(str_res #>
#> Call:
#> stats::hclust(d = distances)
#>
#> Cluster method : complete
#> Number of objects: 16
This object doesn’t seem to tell us much, but we can investigate it more using {migraph}
’s ggtree()
. This is a dendrogram of the hierarchical clustering object. Basically, as we move to the right, we’re allowing for more and more dissimilarity among those we cluster together. A fork or branching point indicates the level of dissimilarity at which those two or more nodes would be said to be equivalent.
ggtree(str_res)
ggtree(str_res, 2) # for example let's say there are just two main clusters
ggtree(str_res, 4) # or four? what are we seeing here?
Ok, so we can draw a line and this establishes how many clusters we have (or vice versa), but also which nodes belong to which cluster. But how many clusters should we pick?
To establish that, we need to iterate through all of our options, calculating for each how correlated this pattern is with the observed network. We then plot this and, using the “elbow method”, decide how many clusters.
ggidentify_clusters(str_res, structural_combo)
When there is one cluster for each vertex in the network, cell values will be identical to the observed correlation matrix, and when there is one cluster for the whole network, the values will all be equal to the average correlation across the observed matrix. So the correlations in each by-cluster matrix are correlated with the observed correlation matrix to see how well each by-cluster matrix fits the data.
Ok, so it looks here as if there is a clear bend in the elbow/knee at four clusters. This is reasonably parsimonious and well-fitting. More clusters than this only distinguishes nodes that are less dissimilar.
<- cutree(str_res, 4))
(str_clu #> Lucille Kelsey Melody Christopher Angelica Hadley
#> 1 2 3 1 3 3
#> Jessica Henry Wallace Bentley Carolyn Marlene
#> 2 2 1 1 3 1
#> Chandler Caitlyn Monique Alberta
#> 2 2 1 4
We can use cutree()
to cut the tree at our desired point and return the resulting vector of cluster assignments. This we can use for various things. Most immediately, we may wish to see these cluster assignments mapped onto our networks. All we need to do is add the variable to existing networks and plot them:
<- m182_task %>% as_tidygraph() %>% mutate(clu = str_clu)
m182_task autographr(m182_task, node_color = "clu") + ggtitle("Task")
<- m182_social %>% as_tidygraph() %>% mutate(clu = str_clu)
m182_social autographr(m182_social, node_color = "clu") + ggtitle("Social")
<- m182_friend %>% as_tidygraph() %>% mutate(clu = str_clu)
m182_friend autographr(m182_friend, node_color = "clu") + ggtitle("Friend")
Now we can use the 4-cluster solution to generate blockmodels. ‘sna’ is required for this, but has already been loaded by ‘NetCluster’. We’ll do this on the valued network, but binary is possible too.
<- blockmodel(m182_task, str_clu))
(task_blockmodel #>
#> Network Blockmodel:
#>
#> Block membership:
#>
#> Lucille Jessica Marlene Kelsey Chandler Caitlyn
#> 1 2 3 1 3 3
#> Henry Wallace Melody Christopher Monique Angelica
#> 2 2 1 1 3 1
#> Bentley Carolyn Hadley Alberta
#> 2 2 1 4
#>
#> Reduced form blockmodel:
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
#> Block 1 Block 2 Block 3 Block 4
#> Block 1 0.70000000 0.00 0.08333333 1
#> Block 2 0.03333333 0.85 0.10000000 1
#> Block 3 0.08333333 0.05 1.00000000 1
#> Block 4 1.00000000 1.00 1.00000000 NaN
plot(task_blockmodel)
<- blockmodel(m182_friend, str_clu))
(friend_blockmodel #>
#> Network Blockmodel:
#>
#> Block membership:
#>
#> Lucille Jessica Marlene Kelsey Chandler Caitlyn
#> 1 2 3 1 3 3
#> Henry Wallace Melody Christopher Monique Angelica
#> 2 2 1 1 3 1
#> Bentley Carolyn Hadley Alberta
#> 2 2 1 4
#>
#> Reduced form blockmodel:
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
#> Block 1 Block 2 Block 3 Block 4
#> Block 1 0.4666667 0.03333333 0.04166667 0
#> Block 2 0.1000000 0.90000000 0.10000000 0
#> Block 3 0.3333333 0.15000000 1.00000000 0
#> Block 4 0.0000000 0.00000000 0.00000000 NaN
plot(friend_blockmodel)
<- blockmodel(m182_social, str_clu))
(social_blockmodel #>
#> Network Blockmodel:
#>
#> Block membership:
#>
#> Lucille Jessica Marlene Kelsey Chandler Caitlyn
#> 1 2 3 1 3 3
#> Henry Wallace Melody Christopher Monique Angelica
#> 2 2 1 1 3 1
#> Bentley Carolyn Hadley Alberta
#> 2 2 1 4
#>
#> Reduced form blockmodel:
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
#> Block 1 Block 2 Block 3 Block 4
#> Block 1 0.7333333 0.2666667 0.375 0.8333333
#> Block 2 0.2333333 1.0000000 0.400 1.0000000
#> Block 3 0.3750000 0.4000000 1.000 0.2500000
#> Block 4 1.0000000 1.0000000 1.000 NaN
plot(social_blockmodel)
What do these plots show?
Finally, we can reduce the graph to just interactions between roles. Let’s start off by graphing the valued/weighted blockmodel.
<- reduce_graph(social_blockmodel, c("Freaks","Squares","Nerds","Geek")))
(social_reduced #> IGRAPH 87f1d65 DNW- 4 15 --
#> + attr: name (v/c), weight (e/n)
#> + edges from 87f1d65 (vertex names):
#> [1] Freaks ->Freaks Freaks ->Squares Freaks ->Nerds Freaks ->Geek
#> [5] Squares->Freaks Squares->Squares Squares->Nerds Squares->Geek
#> [9] Nerds ->Freaks Nerds ->Squares Nerds ->Nerds Nerds ->Geek
#> [13] Geek ->Freaks Geek ->Squares Geek ->Nerds
autographr(social_reduced)
<- reduce_graph(task_blockmodel, c("Freaks","Squares","Nerds","Geek")))
(task_reduced #> IGRAPH 9fb22a7 DNW- 4 14 --
#> + attr: name (v/c), weight (e/n)
#> + edges from 9fb22a7 (vertex names):
#> [1] Freaks ->Freaks Freaks ->Nerds Freaks ->Geek Squares->Freaks
#> [5] Squares->Squares Squares->Nerds Squares->Geek Nerds ->Freaks
#> [9] Nerds ->Squares Nerds ->Nerds Nerds ->Geek Geek ->Freaks
#> [13] Geek ->Squares Geek ->Nerds
autographr(task_reduced)
<- reduce_graph(friend_blockmodel, c("Freaks","Squares","Nerds","Geek")))
(friend_reduced #> IGRAPH f6e0297 DNW- 4 9 --
#> + attr: name (v/c), weight (e/n)
#> + edges from f6e0297 (vertex names):
#> [1] Freaks ->Freaks Freaks ->Squares Freaks ->Nerds Squares->Freaks
#> [5] Squares->Squares Squares->Nerds Nerds ->Freaks Nerds ->Squares
#> [9] Nerds ->Nerds
autographr(friend_reduced)
# ADVANCED: Note on deductive clustering:
# It's pretty straightforward to alter the code above to test hypotheses.
# Simply supply your own cluster vector, where the elements in the vector are in
# the same order as the vertices in the matrix, and the values represent the
# cluster to which each vertex belongs.
task_social_cors <- cor(task_social)
# For example, if you believed that actors 2, 7, and 8 formed one group,
# actor 16 former another group, and everyone else formed a third group,
# you could represent this as follows:
dedclust = c(1, 2, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 3)
# Then examine the fitness of this cluster configuration as follows:
dedclust_mat <- NetCluster::generate_cluster_cor_mat(task_social_cors, dedclust)
dedclust_mat
gcor(dedclust_mat, task_social_cors)
We’re going to use the same pair of networks as with structural equivalence. But this time we’re not going to get the correlation of ties, but rather the correlation of profiles/patterns of local configurations. How can we identify patterns of local configurations? We’ll measure these profiles in terms of triad counts.
Now, there is a function for calculating triad censuses:
graph_triad_census(m182_task))
(#> 003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C 210 300
#> 133 38 212 2 0 1 14 13 0 0 93 1 0 0 11 42
But as you can see, it just gives a aggregated tally for the whole network and not one differentiated by actor (which is what we need). Fortunately, {migraph}
offers a node-level triad census too.
# (By putting parentheses around this command, it'll assign AND print!)
<- node_triad_census(m182_task))
(task_triads #> 003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C 210 300
#> [1,] 36 9 45 0 0 0 0 1 0 0 3 0 0 0 4 7
#> [2,] 45 10 38 0 0 0 1 1 0 0 1 0 0 0 3 6
#> [3,] 55 0 44 0 0 0 0 0 0 0 0 0 0 0 0 6
#> [4,] 102 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [5,] 55 0 37 0 0 0 2 1 0 0 4 0 0 0 0 6
#> [6,] 65 19 10 0 0 0 5 0 0 0 3 1 0 0 2 0
#> [7,] 91 1 13 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [8,] 91 1 13 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [9,] 90 1 14 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [10,] 86 1 18 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [11,] 95 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [12,] 90 1 14 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [13,] 93 1 11 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [14,] 90 3 12 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [15,] 88 2 15 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [16,] 0 0 0 0 0 0 0 0 0 0 73 0 0 0 6 26
Can you recall what these MAD codes mean? MAN might be easier to remember, for NULL dyads is the last, but MAD is probably more appropriate. ?igraph::triad.census
can be used to check what each of the MAD codes means.
As with the structural equivalence, we can simply run our function and return an object that has hierarchically clustered our nodes, but this time it will be based on their (dis)similarity from each others patterns of ties.
<- cluster_regular_equivalence(m182_task)
reg_res ggtree(reg_res,4)
Ok, so it looks like these nodes are much more similar in terms of their patterns of ties than their actual ties.
Like before, we’ll loop through each possible cluster solution and see how well they match the observed matrix of triad type correlations.
ggidentify_clusters(reg_res, t(task_triads))
The cluster correlation plot seems a bit ambiguous here, at least visually. But the elbow method has highlighted 2 clusters as a pretty good solution.
ggtree(reg_res, 2)
<- cutree(reg_res, 2))
(reg_clu #> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
<- m182_task %>% as_tidygraph() %>% mutate(regclu = reg_clu)
m182_task autographr(m182_task, node_color = "regclu") + ggtitle("Task")
As before, we can use these clusters to blockmodel the task network.
<- blockmodel(m182_task, reg_clu))
(task_blockmodel #>
#> Network Blockmodel:
#>
#> Block membership:
#>
#> Lucille Kelsey Melody Christopher Angelica Hadley
#> 1 1 1 1 1 1
#> Jessica Henry Wallace Bentley Carolyn Marlene
#> 1 1 1 1 1 1
#> Chandler Caitlyn Monique Alberta
#> 1 1 1 2
#>
#> Reduced form blockmodel:
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
#> Block 1 Block 2
#> Block 1 0.2761905 1
#> Block 2 1.0000000 NaN
plot(task_blockmodel)
Finally, we can reduce the graph to just interactions between roles. Obviously this is not particularly informative with only two clusters though…
<- reduce_graph(task_blockmodel, c("Regulars","Geek")))
(task_reduced #> IGRAPH 776efa9 DNW- 2 3 --
#> + attr: name (v/c), weight (e/n)
#> + edges from 776efa9 (vertex names):
#> [1] Regulars->Regulars Regulars->Geek Geek ->Regulars
autographr(task_reduced)
Finally, we can try to get a sense of what our different clusters represent by generating a cluster-by-triad-type matrix. This is an m x n matrix, where m is the number of clusters and n is the 16 possible triad types. Each cell is the average number of the given triad type for each individual in the cluster:
cluster_triad_census(m182_task, reg_clu)
#> 003 012 102 021D 021U 021C 111D 111U 030T 030C 201
#> [1,] 78.13333 3.266667 19.8 0 0 0 0.5333333 0.2 0 0 0.7333333
#> [2,] 0.00000 0.000000 0.0 0 0 0 0.0000000 0.0 0 0 73.0000000
#> 120D 120U 120C 210 300
#> [1,] 0.06666667 0 0 0.6 1.666667
#> [2,] 0.00000000 0 0 6.0 26.000000
# ADVANCED: Note that we can also blockmodel our communities from last week.
# walktrap_blockmodel <- blockmodel(get.adjacency(m182_main, sparse = F),
# friend_wt$membership)
# plot(walktrap_blockmodel)
# walktrap_blockmodel
# # And graphs that from the reduced form blockmodels...
# walktrap_blockmodel_red <- graph.adjacency(walktrap_blockmodel$block.model, weighted = T)
# plot(walktrap_blockmodel_red, edge.width = E(walktrap_blockmodel_red)$weight,
# vertex.color = rainbow(2) )
# # Admittedly, not terribly interesting...
#
# edgebet_blockmodel <- blockmodel(get.adjacency(m182_main, sparse = F),
# friend_eb$membership)
# plot(edgebet_blockmodel) # blockmodel
# edgebet_blockmodel_red <- graph.adjacency(edgebet_blockmodel$block.model, weighted = T)
# plot(edgebet_blockmodel_red, edge.width=E(edgebet_blockmodel_red)$weight,
# vertex.color=rainbow(3) ) # reduced graph
# # Cool
#
# fastgreed_blockmodel <- blockmodel(get.adjacency(m182_main, sparse = F),
# friend_fg$membership)
# plot(fastgreed_blockmodel) # blockmodel
# fastgreed_blockmodel_red <- graph.adjacency(fastgreed_blockmodel$block.model, weighted = T)
# plot(fastgreed_blockmodel_red, edge.width=E(fastgreed_blockmodel_red)$weight,
# vertex.color=rainbow(3) ) # reduced graph