R: Co-occurrence network using presence/absence data (r, tidygraph)


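I am trying to make a co-occurrence network plot from presence/absence data for my bacterial species, but I'm not sure how to go about it. I'd like the end result to look something like this: each species is linked to another species if both are present in the same patient, and larger circles represent species with a higher frequency. I initially tried the widyr and tidygraph packages, but I'm not sure my data set is compatible with them, because it has patients as columns and individual species as rows. Ideally I would like to know which packages/code I can use with my data set, or how I can change my data set to work with these packages.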

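You can use the matrix cross product to obtain a co-occurrence matrix. Converting that adjacency matrix into a graph with the igraph package is then straightforward. Try this: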

library(igraph)

# Create fake data set
# rows = patients
# cols = species
set.seed(2222)
df <- matrix(sample(c(TRUE, FALSE), 50, replace = TRUE), 5)
colnames(df) <- letters[1:10]

# Generate co-occurrence matrix with crossproduct
co_mat <- t(df) %*% df

# Set diagonal values to 0
diag(co_mat) <- 0

# Assign dim names
dimnames(co_mat) <- list(colnames(df), colnames(df))

# Create graph from adjacency matrix
# ! edge weights are equal to frequency of co-occurrence
g <- graph_from_adjacency_matrix(co_mat, mode = "upper", weighted = TRUE)

# Assign nodes weight equal to species frequency
g <- set_vertex_attr(g, "v_weight", value = colSums(df))

plot(g, vertex.size = V(g)$v_weight * 5 + 5, edge.width = E(g)$weight * 5)

The result looks like this:
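To see why the cross product yields co-occurrence counts, and how it applies to a table laid out as in the question (species as rows, patients as columns), here is a minimal sketch. The species and patient names are invented for illustration. With species in rows, `tcrossprod(pa)` (that is, `pa %*% t(pa)`) plays the role that `t(df) %*% df` plays in the code above.

```r
# Toy presence/absence table: species as rows, patients as columns
# (all names are hypothetical, not taken from the asker's data)
pa <- matrix(
  c(1, 1, 0,
    1, 0, 0,
    0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("sp_A", "sp_B", "sp_C"), c("pt_1", "pt_2", "pt_3"))
)

# Species-by-species matrix: each cell counts the patients two species share
co <- tcrossprod(pa)

# Before zeroing it, the diagonal holds each species' own frequency
freq <- diag(co)
diag(co) <- 0

print(co)
print(freq)
```

Here sp_A and sp_B share one patient (pt_1), sp_A and sp_C share one patient (pt_2), and sp_B and sp_C share none. Equivalently, `t(pa)` gives the patients-by-species layout used in the example above.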

Like Istrel, I also recommend igraph. A second solution can be produced with ggplot:

library(ggnetwork)
library(ggplot2)
library(igraph)

# sample data (100 rows x 15 columns of 0/1 values):
set.seed(1)
mat <- matrix(rbinom(100 * 15, 1, 0.1), ncol = 15, nrow = 100)

# The rows of mat become the nodes. Not necessary for the example data,
# but if your species are in columns, transpose first:
# mat <- t(mat)

#### Optional: real data often contains "empty cases", which you might want to remove:
# remove 0-sum-columns
mat <- mat[,apply(mat, 2, function(x) !all(x==0))] 
# remove 0-sum-rows
mat <- mat[apply(mat, 1, function(x) !all(x==0)),]

# transform in term-term adjacency matrix
mat.t <- mat  %*% t(mat)

##### calculate graph 
g <- igraph::graph_from_adjacency_matrix(mat.t, mode = "undirected", weighted = TRUE, diag = FALSE)

# calculate coordinates (see https://igraph.org/r/doc/layout_.html for different layouts)
layout <- as.matrix(layout_with_lgl(g))

p<-ggplot(g, layout = layout, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges( color = "grey20", alpha = 0.2, size = 2) + # add e.g. curvature =  0.15 for curved edges
  geom_nodes(size = (centr_degree(g)$res + 3), color = "darkolivegreen4", alpha = 1) +
  geom_nodes(size = centr_degree(g)$res, color = "darkolivegreen2", alpha = 1) +
  geom_nodetext(aes(label = vertex.names), size= 5) +
  theme_blank()
p
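Note that the plot above sizes nodes by degree (the number of other nodes each one co-occurs with), not by how often an item is present. If the circles should reflect frequency, as the question asks, one possible approach is to store the row sums as a vertex attribute before plotting. This is a minimal sketch with a small invented matrix; the attribute name `Frequency` and the object names are arbitrary choices.

```r
library(igraph)

# Invented presence/absence matrix: rows (species) become the nodes
m <- matrix(
  c(1, 1, 1,
    1, 0, 0,
    0, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("sp_A", "sp_B", "sp_C"), NULL)
)

# Row-by-row co-occurrence counts, as in the answer above
adj <- m %*% t(m)
g_freq <- graph_from_adjacency_matrix(adj, mode = "undirected",
                                      weighted = TRUE, diag = FALSE)

# Frequency: number of columns (patients) in which each row (species) is present
V(g_freq)$Frequency <- rowSums(m)

# Degree: number of distinct co-occurring partners, for comparison
V(g_freq)$Degree <- degree(g_freq)

print(data.frame(name = V(g_freq)$name,
                 Frequency = V(g_freq)$Frequency,
                 Degree = V(g_freq)$Degree))
```

In this toy case sp_A is present three times but has only two partners, so the two measures differ. A size mapping such as `aes(size = Frequency)` could then be used in place of a degree-based size.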

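Thanks for your answer. I just want to clarify: is the fake data set you used the same as the one shown, or did you manipulate it in some way to turn rows into patients and columns into species?

The initial data is as shown in the figure. You can use t() to transpose it.

By the way, if the answer was useful to you, please don't forget to upvote it.

Thanks again. Just one more question: is there any way to show the data needed for a legend on the graph, for example a key for significance or for line width/circle size?

You can use the ggplot2 aes() syntax! I added an example to my answer.

So I tried doing this with my own data and got the following at the last part of the code, when printing p. Error: `data` must be a data frame, or other object coercible by `fortify()`, not an S3 object with class igraph. I then tried it with the code you used and got the same error. Can you help me?

I think the error occurs because you did not load the ggnetwork package (library(ggnetwork)). Install it if necessary (install.packages("ggnetwork")).

I have reinstalled all the packages (ggnetwork, ggplot2 and igraph), but I still get the same error message. I'm very sorry about this, but do you have any other suggestions?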

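To get a legend for node size, map the size inside aes(). This continues from g and layout as defined in the ggnetwork code above:

# calculate degree and store it as a vertex attribute:
V(g)$Degree <- centr_degree(g)$res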

p<-ggplot(g, layout = layout, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges( color = "grey20", alpha = 0.2, size = 2) + # add e.g. curvature = 0.15 for curved edges
  geom_nodes(aes(size =  Degree) , color="darkolivegreen2", alpha = 1) +
  scale_size_continuous(range = c(5, 16)) +
  geom_nodetext(aes(label = vertex.names), size= 5) +
  theme_blank()
p
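Since the question mentions tidygraph, here is a hedged sketch of the same idea using tidygraph and ggraph, where both edge width and node size are mapped inside `aes()` so that each gets its own legend. It assumes the ggraph and tidygraph packages are installed. The toy matrix, the attribute name `frequency`, the object names, and the legend titles are illustrative choices and not part of the original answers.

```r
library(igraph)
library(tidygraph)
library(ggraph)

# Invented presence/absence matrix: rows = patients, columns = species
pa <- matrix(
  c(1, 1, 0, 1,
    1, 0, 1, 0,
    1, 1, 0, 0,
    0, 0, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("pt_", 1:4), c("sp_A", "sp_B", "sp_C", "sp_D"))
)

# Species-by-species co-occurrence counts; keep the diagonal as frequency
co <- crossprod(pa)
freq <- diag(co)
diag(co) <- 0

g_tidy <- graph_from_adjacency_matrix(co, mode = "undirected", weighted = TRUE)
V(g_tidy)$frequency <- freq

# Convert to a tidygraph object and plot with ggraph
tg <- as_tbl_graph(g_tidy)

set.seed(1)  # the force-directed layout is random
p_tidy <- ggraph(tg, layout = "fr") +
  geom_edge_link(aes(edge_width = weight), colour = "grey50", alpha = 0.5) +
  geom_node_point(aes(size = frequency), colour = "darkolivegreen3") +
  geom_node_text(aes(label = name)) +
  scale_edge_width(range = c(0.5, 3), name = "Shared patients") +
  scale_size_continuous(range = c(4, 12), name = "Patients with species") +
  theme_void()

p_tidy
```

Mapping `weight` and `frequency` through `aes()` rather than passing fixed values is what makes ggplot2 draw the keys. The scale ranges only control how large the thinnest and thickest elements appear.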