R: Find the frequency of each unique column in a matrix or data frame
I want to find the frequency of each column of a matrix, that is, how many times each distinct column occurs. For example, take the matrix x below:
x <- matrix(c(rep(1:4,3),rep(2:5,2)),4,5)
x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    2    2
[2,]    2    2    2    3    3
[3,]    3    3    3    4    4
[4,]    4    4    4    5    5
This answer is going to be a bit messy, because it involves lists that I could not avoid:
x <- matrix(c(rep(1:4,3),rep(2:5,2)),4,5)
#convert columns to elements in list
y <- apply(x, 2, list)
#Get unique columns
unique_y <- unique(unlist(y, recursive=FALSE))
#Get column frequencies
frequencies <- sapply(unique(y), function(f) sum(unlist(y, recursive=FALSE) %in% f))
#Bind unique columns with frequencies
rbind(simplify2array(unique_y), frequencies)
Here is a solution that avoids converting the matrix into a list of lists, but it is also a bit messy:
x.unique <- unique(x, MARGIN = 2)
freq <- apply(x.unique, MARGIN = 2,
function(b) sum(apply(x, MARGIN = 2, function(a) all(a == b)))
)
rbind(x.unique, freq)
     [,1] [,2]
        1    2
        2    3
        3    4
        4    5
freq    3    2
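The nested apply() above compares every unique column against every column of x. As a minimal sketch of the same idea wrapped in a reusable function, the inner apply() can be replaced by a vectorised comparison with colSums(). The function name count_unique_columns is hypothetical (it is not part of the original answer), and the sketch assumes the matrix contains no NA values:

```r
# Hypothetical helper, not from the original answer.
# Assumes m is a matrix without NA values.
count_unique_columns <- function(m) {
  # Distinct columns, in order of first appearance
  u <- unique(m, MARGIN = 2)
  # For each distinct column, count the columns of m equal to it.
  # m == u[, j] recycles u[, j] down every column of m, so a column
  # matches when all nrow(m) of its comparisons are TRUE.
  freq <- vapply(seq_len(ncol(u)), function(j) {
    sum(colSums(m == u[, j]) == nrow(m))
  }, integer(1))
  rbind(u, freq)
}

x <- matrix(c(rep(1:4, 3), rep(2:5, 2)), 4, 5)
res <- count_unique_columns(x)
res
```

For the example matrix x this should give the same table as above, with the counts in the last row, named freq.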
Using aggregate (if your input is a data.frame):
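y <- matrix(c(1:4, 2:5, 1:4, 1,3,4,5, 2:5), ncol=5)
y
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    2    1    1    2
# [2,]    2    3    2    3    3
# [3,]    3    4    3    4    4
# [4,]    4    5    4    5    5
z <- as.data.frame(t(y))
t(aggregate(z, by=z, length)[1:(ncol(z)+1)])
#      [,1] [,2] [,3]
# V1      1    1    2
# V2      2    3    3
# V3      3    4    4
# V4      4    5    5
# V1.1    2    1    2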
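In that output the counts end up in a row called V1.1, which is easy to misread. A possible variation, sketched here under the assumption that a named count column is acceptable, adds a helper column of ones and sums it with the formula interface of aggregate(). The column name n and the variable name counts are illustrative and do not come from the original answer:

```r
y <- matrix(c(1:4, 2:5, 1:4, 1, 3, 4, 5, 2:5), ncol = 5)

# One row per original column, so that columns become groupable records
z <- as.data.frame(t(y))

# Add a column of ones (hypothetical name n) and sum it within each
# distinct combination of V1..V4; the result has one row per unique column
counts <- aggregate(n ~ ., data = cbind(z, n = 1L), FUN = sum)
counts
```

Each row of counts holds one distinct column of y in V1 to V4 and its frequency in n; t(counts) gives the column-wise layout used above.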
Note: if the number of columns in the input matrix x is much greater than its nrow, i.e. ncol(x) >> nrow(x)
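What is your ultimate goal? In other words, how are you going to process this data further? If it is just tabulation, wouldn't paste() get you to the answer?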
x <- matrix(c(rep(1:4,3),rep(2:5,2)),4,5)
x1 <- data.frame(table(apply(x, 2, paste, collapse = ", ")))
#         Var1 Freq
# 1 1, 2, 3, 4    3
# 2 2, 3, 4, 5    2
Or, if you prefer the output converted back into columns:
t(cbind(read.csv(text = as.character(x1$Var1), header = FALSE), x1[-1]))
#      [,1] [,2]
# V1      1    2
# V2      2    3
# V3      3    4
# V4      4    5
# Freq    3    2
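The read.csv() step parses the pasted strings back into numbers. A sketch of an alternative that uses the pasted strings only as grouping keys and takes the unique columns directly from x, so nothing has to be parsed back, could look like this. The variable names keys, first, freq and result are illustrative, and this is a variation on the answer rather than the answerer's own code:

```r
x <- matrix(c(rep(1:4, 3), rep(2:5, 2)), 4, 5)

# One string key per column, as in the table() approach
keys <- apply(x, 2, paste, collapse = ", ")

# TRUE for the first occurrence of each distinct column
first <- !duplicated(keys)

# Map every column to the index of its distinct key, then count per index
freq <- tabulate(match(keys, keys[first]))

# Unique columns taken straight from x, with the counts as a last row
result <- rbind(x[, first, drop = FALSE], Freq = freq)
result
```

Because the unique columns are subset from x itself, they keep their original storage type and appear in order of first occurrence.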
Thanks. Since my data frame is large, your note is very helpful.
cbind(read.csv(text = as.character(x1$Var1), header = FALSE), x1[-1])
#   V1 V2 V3 V4 Freq
# 1  1  2  3  4    3
# 2  2  3  4  5    2