R: melting and casting an awkward data frame


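I'm working with a data frame like the one below. I'd like it to end up looking like this:

province : district : party A votes : party A percent : party B votes : party B percent : party C votes : party C percent

For now the candidate name works well as a unique identifier, which avoids the need for an aggregation function, but I can drop it eventually.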

candidate<-c('bob jones', 'bobby jones', 'sara jones', 'sara norah', 'nora jones', 'other name', 'name other', 'thomas name', 'name judge', 'my mayor', 'peter peter', 'paul paul')
party<-rep(c('A', 'B', 'C'), 4)
district<-c(rep('District 1', 3), rep('District 2', 3), rep('District 3', 3), rep('Disctrict 4', 3))
province<-c(rep('Province 1', 3), rep('Province 2', 3), rep('Province 3', 3), rep('Province 4', 3))
votes<-round(rnorm(12, mean=5000, sd=1000),0)
percent<-round(rnorm(12, mean=37, sd=10),2)
df<-data.frame(party, district,province, votes, percent, candidate)
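However, when I use the same call on my real dataset, I no longer have a unique identifier, and the cast aggregates by length instead.


I hope you can help. Thanks.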

In data.table v1.9.5, dcast can cast multiple value.var columns at once. With that, you can do:

require(data.table) #v1.9.5+
ans = dcast(setDT(df), province + district ~ party, value.var = c("votes", "percent"))
#      province    district votes_A votes_B votes_C percent_A percent_B percent_C
# 1: Province 1  District 1    3072    3149    4262     34.29     18.45     19.20
# 2: Province 2  District 2    5918    3970    4201     36.56     46.22     43.16
# 3: Province 3  District 3    5593    5208    5260     26.58     31.20     39.00
# 4: Province 4 Disctrict 4    6138    4537    6293     43.97     43.62     32.48
If you want a data.frame back, you can call setDF(ans), which converts ans to a data.frame by reference.


You can install v1.9.5 by following the data.table installation instructions.
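For comparison, the melt-then-cast route named in the question's title can be sketched with reshape2 as follows. This is a minimal sketch, not the asker's original call (which is not shown above): the `small` data frame is a hypothetical cut-down version of the example data, and it assumes each province/district/party combination occurs exactly once.

```r
library(reshape2)

set.seed(1)
# Hypothetical cut-down version of the example data (two districts only)
small <- data.frame(
  party    = rep(c("A", "B", "C"), 2),
  district = rep(c("District 1", "District 2"), each = 3),
  province = rep(c("Province 1", "Province 2"), each = 3),
  votes    = round(rnorm(6, mean = 5000, sd = 1000), 0),
  percent  = round(rnorm(6, mean = 37, sd = 10), 2)
)

# Long format: one row per province/district/party/variable combination
long <- reshape2::melt(small,
                       id.vars = c("province", "district", "party"),
                       measure.vars = c("votes", "percent"))

# Wide format: one column per party/variable pair, joined with "_"
wide <- reshape2::dcast(long, province + district ~ party + variable,
                        value.var = "value")
```

The calls are namespaced with `reshape2::` because data.table also exports `melt` and `dcast`, and both packages may be loaded in the same session.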

Here is a base R solution:

set.seed(1)
candidate<-c('bob jones', 'bobby jones', 'sara jones', 'sara norah', 'nora jones', 'other name', 'name other', 'thomas name', 'name judge', 'my mayor', 'peter peter', 'paul paul')
party<-rep(c('A', 'B', 'C'), 4)
district<-c(rep('District 1', 3), rep('District 2', 3), rep('District 3', 3), rep('Disctrict 4', 3))
province<-c(rep('Province 1', 3), rep('Province 2', 3), rep('Province 3', 3), rep('Province 4', 3))
votes<-round(rnorm(12, mean=5000, sd=1000),0)
percent<-round(rnorm(12, mean=37, sd=10),2)
df<-data.frame(party, district,province, votes, percent, candidate)


# v.names lists the columns to spread; timevar supplies the suffixes (.A, .B, .C)
reshape(df, direction = 'wide', v.names = c('votes','percent'),
        idvar = c('province', 'district'),
        timevar = 'party', drop = 'candidate')

#       district   province votes.A percent.A votes.B percent.B votes.C percent.C
# 1   District 1 Province 1    4374     30.79    5184     14.85    4164     48.25
# 4   District 2 Province 2    6595     36.55    5330     36.84    4180     46.44
# 7   District 3 Province 3    5487     45.21    5738     42.94    5576     46.19
# 10 Disctrict 4 Province 4    4695     44.82    6512     37.75    5390     17.11
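
The question also mentions that the real dataset has no unique identifier. A minimal base R sketch of one way to handle that is below: check for repeated key combinations, then collapse them explicitly before reshaping. The `dup` data frame is hypothetical, and summing the votes is only an assumption about what the duplicates should become; another summary may suit the real data better.

```r
# Hypothetical data: two party-A candidates in the same district
dup <- data.frame(
  province = c("Province 1", "Province 1", "Province 1"),
  district = c("District 1", "District 1", "District 1"),
  party    = c("A", "A", "B"),
  votes    = c(100, 200, 300),
  stringsAsFactors = FALSE
)

keys <- c("province", "district", "party")

# TRUE when at least one key combination occurs more than once
has_dups <- anyDuplicated(dup[keys]) > 0

# Number the rows within each key combination so duplicates can be told apart
dup$rank <- ave(seq_len(nrow(dup)), dup$province, dup$district, dup$party,
                FUN = seq_along)

# Collapse duplicates explicitly (here: sum of votes), then go wide
totals <- aggregate(votes ~ province + district + party, data = dup, FUN = sum)
wide_totals <- reshape(totals, direction = "wide", v.names = "votes",
                       idvar = c("province", "district"), timevar = "party")
```

Choosing the summary function yourself keeps the result from depending on a cast's default aggregation, and the `rank` column offers an alternative if every duplicate row should keep its own column.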