Mean of sub-matrices within a matrix in base R or dplyr
Consider the following matrix:
tt <- structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 223.26217771938,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 233.317380407033, 228.230147000785,
NA, NA, NA, NA, NA, NA, NA, NA, 213.976634238414, 202.420354707722,
235.306183514161, NA, NA, NA, NA, NA, NA, NA, 234.959570990415,
209.098063118719, 218.561204242656, 222.512920973143, NA, NA,
NA, NA, NA, NA, 208.300264042079, 215.937490955137, 237.957979483774,
192.688868386319, 235.076583265965, NA, NA, NA, NA, NA, 206.523606398881,
223.937491278258, 223.926327170344, 214.32218737219, 226.512692801088,
201.218786399282, NA, NA, NA, NA, 224.281073655358, 213.943917885038,
238.593797069413, 203.435493461687, 229.752040252094, 219.155196151038,
218.091723822799, NA, NA, NA, 220.671701855947, 201.380237362061,
232.187424293393, 191.10206696946, 234.448288541418, 178.759615126012,
214.037379912949, 204.514058196497, NA, NA, 232.924880594581,
229.573517636508, 197.886331008486, 231.900840878165, 221.634834807167,
227.927620090238, 232.886238322491, 239.428486191598, 231.987068605127,
NA), .Dim = c(10L, 10L), .Dimnames = list(c("SA1", "SA1", "SA1",
"SA1", "SA2", "SA2", "SA2", "SA2", "SA2", "SA2"), c("SA1", "SA1",
"SA1", "SA1", "SA2", "SA2", "SA2", "SA2", "SA2", "SA2")))
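I want to compute the mean of the SA1 and SA2 sub-matrices. By sub-matrix I mean the cells whose row name and column name are both SA1, and likewise the cells whose row name and column name are both SA2. For SA1 this is equivalent to mean(tt[1:4, 1:4], na.rm = TRUE), but my real matrix is much larger than this example, so subsetting by fixed indices is not a solution; I need some kind of grouping by the distinct rownames and colnames. It would be great if someone could show me a solution using both base R and dplyr.

We can use sapply to loop over all the unique column names of the matrix, subset the rows and columns on each name, and take the mean of each sub-matrix: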
sapply(unique(colnames(tt)), function(x)
mean(tt[rownames(tt) == x, colnames(tt) == x], na.rm = TRUE))
# SA1 SA2
#222.8 221.0
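The same grouping can also be done without an explicit loop over names. The following is a minimal sketch, assuming a small toy matrix m (hypothetical data, not the tt from the question): every cell is labelled with its row name and column name, only the cells where the two labels match are kept, and tapply averages them per label.

```r
# Toy 4x4 matrix with repeated dimnames (hypothetical data, not tt)
m <- matrix(as.numeric(1:16), nrow = 4,
            dimnames = list(c("A", "A", "B", "B"), c("A", "A", "B", "B")))
m[2, 1] <- NA

# Label every cell with its row name and its column name
rn <- rownames(m)[row(m)]
cn <- colnames(m)[col(m)]

# Keep only cells whose row and column labels match, then average per label
same <- rn == cn
block_means <- tapply(m[same], rn[same], mean, na.rm = TRUE)
print(block_means)
```

Because the labels are built once with row() and col(), this version avoids repeated subsetting of the matrix, which may matter when there are many distinct names.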
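Another base R option is a for loop. It produces a vector named sub_list, which starts as the vector of unique column names and is then iterated over, with each name replaced by the corresponding mean (you could write the means to another vector, but why create two if one is enough?).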
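An option with the tidyverse: we can convert tt to long format, filter the rows where the row name and column name are the same, then group by Var1 and take the mean of the value column.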
library(dplyr)
library(reshape2)
melt(tt) %>%
filter(Var1 == Var2) %>%
group_by(Var1) %>%
summarise(value = mean(value, na.rm = TRUE))
# A tibble: 2 x 2
# Var1 value
# <fct> <dbl>
#1 SA1 223.
#2 SA2 221.
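If reshape2 is not available, the long format can be built by hand. This is a minimal sketch, assuming dplyr is installed and using a toy matrix m with hypothetical column names row_name, col_name and value (none of these come from the original answer).

```r
library(dplyr)

# Toy 4x4 matrix with repeated dimnames (hypothetical data, not tt)
m <- matrix(as.numeric(1:16), nrow = 4,
            dimnames = list(c("A", "A", "B", "B"), c("A", "A", "B", "B")))
m[2, 1] <- NA

# One row per matrix cell: its row name, its column name, and its value
long <- data.frame(
  row_name = rownames(m)[row(m)],
  col_name = colnames(m)[col(m)],
  value    = as.vector(m)
)

# Keep cells inside the same-name blocks and average each block
block_means <- long %>%
  filter(row_name == col_name) %>%
  group_by(row_name) %>%
  summarise(value = mean(value, na.rm = TRUE))
print(block_means)
```

The pipeline after the data frame is built mirrors the melt version, with row_name and col_name playing the roles of Var1 and Var2.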
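The for loop that builds sub_list:
sub_list <- unique(colnames(tt))
for (j in seq_along(sub_list)) {
  # subset rows AND columns by name so only the sub-matrix is averaged
  keep <- sub_list[j]
  sub_list[j] <- mean(tt[rownames(tt) == keep, colnames(tt) == keep], na.rm = TRUE)
}
# the means are stored as character strings; as.numeric(sub_list) converts them back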
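Assigning a number into a character vector coerces it to a string, so a loop that overwrites the names ends up holding the means as text. A minimal sketch of the same loop that writes into a separate named numeric vector instead, using a toy matrix m and the hypothetical names groups and means:

```r
# Toy 4x4 matrix with repeated dimnames (hypothetical data, not tt)
m <- matrix(as.numeric(1:16), nrow = 4,
            dimnames = list(c("A", "A", "B", "B"), c("A", "A", "B", "B")))
m[2, 1] <- NA

# Preallocate a numeric result vector named after the unique column names
groups <- unique(colnames(m))
means <- setNames(numeric(length(groups)), groups)

for (g in groups) {
  # Average only the block where both the row name and column name equal g
  means[g] <- mean(m[rownames(m) == g, colnames(m) == g], na.rm = TRUE)
}
print(means)
```

The cost is one extra vector, but the result stays numeric and keeps the group names attached.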