[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [igraph] Calculating Q-Spa modularity for spatially embedded network
From: |
Andrew Edelman |
Subject: |
Re: [igraph] Calculating Q-Spa modularity for spatially embedded networks |
Date: |
Fri, 15 Apr 2011 11:57:43 -0600 |
User-agent: |
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.15) Gecko/20110303 Lightning/1.0b2 Thunderbird/3.1.9 |
Hi Tom,
I'm also interested in spatially-embedded networks and am exploring
Q-spa. I've written some R code (with comments) below which implements
this form of modularity using the edge betweenness detection method. The
code was written for igraph 0.6. To build the Pspa matrix, you'll need
determine how you want to bin your distance groups and run the second
set of codes for each distance group to fill in the Pspa matrix. I think
I implemented the calculation of the Pspa matrix correctly. The results
match the criteria that the sum of Pspa equals the sum of the adjacency
matrix. Also, If you substitute the traditional Newman Girvan null model
matrix (ki*kj/2m) you get the same results as the igraph
edge.betweenness.community function. I haven't tried it with a weighted
network, but it should be able to handle weights with some minor
modifications. Let me know if you find any errors in my calculations or
code.
Interestingly, you can also use this method to incorporate other
variables besides distance into the modularity calculation (e.g., age
differences, etc.).
Andrew
#Community detection with Q spa modularity
library(fields) #package contains function to calculate distance matrix
distm <- rdist(coor,coor) #Calculate pair-wise distance matrix based on
coordinates (e.g., coor)
deg <- degree(g) #calculate degree for all nodes in graph
NNmatrix <- outer(deg,deg) #create matrix with all node combinations of
degree products
Pspa <- matrix(0,vcount(g),vcount(g)) #Create empty Probability matrix
gaj <- get.adjacency(g, sparse=F) #create adjacency matrix
#Build Pspa null model matrix
hist(distm) #Examine histogram of distances for selecting binned
distance groups
y <- which(distm==0, arr.ind=T)#Find index (row, column) of elements in
distance matrix that match binning criteria (change as needed)
SAij <- sum(gaj[y]) #Sum of all edge weights in distance group
SNij <- sum(NNmatrix[y]) #Sum all elements in degree product matrix for
distance group
fd <- SAij/SNij #Calculate weighted average of the probability for a
link to exist at specific distance
Pspa[y] <- fd*NNmatrix[y] #Calculate Pspa for distance group and enter
in Pspa matrix
#repeat above 5 steps for all distance groups (e.g., y <- which(distm>0
& distm<=100, arr.ind=T))
#Determine best edge betweenness community structure based on optimal Qspa
#Removes only first edge if ties (like igraph)
mod <- vector() #create empty vector for modularity scores
redge <- vector() #create empty vector for removed edges
edge-betweenness score
membs <- list() #create empty list for memberships
g1 <- g #Create duplicate graph for removing edges
#For loop sequentially removes highest edge-betweenness edges and
calculates Qspa and membership
#Returns a "stack imbalance" error, but seems to function correctly anyways
for(i in 1:ecount(g1)){
redge[i] <- max(edge.betweenness(g1))
eb <- edge.betweenness(g1)
g1 <- delete.edges(g1, which.max(eb))
membs[[i]] <- clusters(g1)$membership
memb <- clusters(g1)$membership
delta <- matrix(0,vcount(g),vcount(g))
for(r in 1:vcount(g)){
delta[r,] <- match(memb,memb[r],nomatch=0)
}
mod[i] <- sum((gaj - Pspa)*delta)/(2*ecount(g))
}
#Useful functions for exploring results
ebc.Qspa <- membs[[which.max(mod)]] #Community membership that maximizes
Qspa modularity
max(mod) #Optimal Qspa modularity value
V(g)$color <- ebc.Qspa #community assignment as a node color attribute
ebc <- edge.betweenness.community(g) #Determine optimal community
structure based on traditional Q
compare(ebc, ebc.Qspa, method = "nmi") #Compare similarity in community
structures between Q and Qspa
On 4/11/2011 4:22 AM, T.O. Richardson wrote:
Dear List,
I am relatively new to network analysis, and the networks I have are not
simple: weighted, dynamic, undirected, and spatially-embedded.
According to this 2010 paper<http://arxiv.org/abs/1012.3409>, the classic
Newman-Girvan modularity provides misleading community detection
classifications when applied to spatially-embedded networks, because focus
constraint and homophily are confounded in space.
So I was wondering (hoping) how easy it would be to implement the Q-spar
modularity measure (equations5-6 in that paper) in R generally (e.g. using
igraph or tnet).
I am no mathematician, but can code OK in R. From the look of equations, I
am guessing that practically it will mean creating another matrix
containing the pairwise distances between nodes, then.. well that's when it
gets fuzzy.
Anyway, all pointers appreciated,
Best
Tom Richardson
-----------------------------------------------
Dept. of Maths and Stats, UWE and Ant Lab, Bristol
Room B79.
SoBS, Woodland Rd. BS8 1UG.
+44 (0)117 928 8443
----------------------------------------------------
_______________________________________________
igraph-help mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/igraph-help