R: categorical variable to vectors for a matrix
I have a data frame with an id column and a categorical variable. I need to collapse each unique id into a single row and spread the applicable categories into separate vectors, so that I end up with a matrix for some regression analysis. Example:
id cat
1 a
2 b
1 b
3 c
4 a
2 a
4 c
3 c
output:
id cat.a cat.b cat.c
1 1 1 0
2 1 1 0
3 0 0 2
4 1 0 1
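I have looked at the build.x function in the useful package, but could not work out how to collapse the result to one row per id.
This looks like a plain data-reshaping task: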
library(reshape2)
dcast(df, id ~ cat)
# Using cat as value column: use value.var to override.
# Aggregation function missing: defaulting to length
# id a b c
# 1 1 1 1 0
# 2 2 1 1 0
# 3 3 0 0 2
# 4 4 1 0 1
That may be overkill for such a simple problem, though. As @Seth pointed out in the comments, you can use table:
with(df, table(id, cat))
# cat
# id a b c
# 1 1 1 0
# 2 1 1 0
# 3 0 0 2
# 4 1 0 1
(Using this data:)
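The data itself did not survive here, so the following is only a minimal sketch that rebuilds it from the example in the question. The name df matches the calls above; the cat. column prefix and the names tab and out are illustrative assumptions, not part of the original answer.

```r
# Rebuild the example data from the question (assumed to be the data used above).
df <- data.frame(
  id  = c(1, 2, 1, 3, 4, 2, 4, 3),
  cat = c("a", "b", "b", "c", "a", "a", "c", "c")
)

# Count how often each category occurs for each id.
tab <- with(df, table(id, cat))

# Turn the contingency table into a plain data frame with one row per id.
out <- as.data.frame.matrix(tab)

# Prefix the category columns so they read cat.a, cat.b, cat.c (assumed naming).
names(out) <- paste0("cat.", names(out))

# Move the ids from the row names into a regular column.
out <- cbind(id = as.numeric(rownames(out)), out)
rownames(out) <- NULL
out
```

If a numeric matrix is needed for the regression, as.matrix(out[, -1]) drops the id column and keeps only the counts.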
I think this does what you need without requiring any libraries, although it uses two nested loops, so it may be slow.
## setting up the data you gave as an example in your question
dat=matrix(c(1,2,1,3,4,2,4,3,'a','b','b','c','a','a','c','c'),ncol=2)
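## stringsAsFactors=TRUE is needed in R >= 4.0, where data.frame() no longer
## converts strings to factors by default; without it levels() below returns NULL
data=data.frame(dat, stringsAsFactors=TRUE)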
## determine the categories as defined by your data
cats <- levels(data$X2)
## create a blank matrix
out=matrix(0,nrow=length(levels(data$X1)),ncol=length(levels(data$X2)))
## what is the lowest value of your first column
i=min(as.numeric(data$X1))
## j will serve as a counter for the rows in the out matrix
j=1
while(i<=max(as.numeric(data$X1)))
{
## find the unique values associated with the first 'i'
idi <- which(as.numeric(data$X1)==i)
## set up a counter that corresponds to the columns of your out matrix
k=1
while(k<= length(cats)) {
## determine the values associated with the particular category
out[j,k] <- length(which(data[idi,2]==cats[k]))
k=k+1
}
i=i+1
j=j+1
}
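The two nested loops can usually be condensed into one vectorised call. This is a minimal sketch under the same example data, not the answerer's own method. The name out2 is hypothetical, chosen so the loop result in out is not overwritten.

```r
# Same example data as in the loop version, with factor columns so levels() works.
dat <- matrix(c(1, 2, 1, 3, 4, 2, 4, 3,
                'a', 'b', 'b', 'c', 'a', 'a', 'c', 'c'), ncol = 2)
data <- data.frame(dat, stringsAsFactors = TRUE)
cats <- levels(data$X2)

# For each category, count the matching rows within each id.
# tapply() groups by id; sapply() binds one column per category.
out2 <- sapply(cats, function(k) tapply(data$X2 == k, data$X1, sum))
out2
```

Unlike the loop, this keeps the ids as row names and the categories as column names, and it does not assume that the ids are consecutive integers.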
Is table(id, cat) close to what you want?
@Seth That is exactly what I wanted, cheers!
Good point! That is just a habit from my own coding practice.