Options for arranging a data.frame (created from a list) using mdply and mutate
I am using mdply and mutate to put a list into a data.frame, using the data and code given at the end of this post. However, I would like the result to be arranged like this:
# DFx
DAY MONTH YEAR pLength V1   V2 V3
  1     1    1    0.00  1   NA NA
  1     1    1    0.25 NA 2.00 NA
  1     1    1    1.00 NA   NA  1
  2     2    1    0.00  2   NA NA
  2     2    1    0.50 NA 2.50 NA
  2     2    1    1.00 NA   NA  3
  2     3    2    0.00  2   NA NA
  2     3    2    0.65 NA 2.35 NA
  2     3    2    1.00 NA   NA  3
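Is there a different formulation I could use in the code below to obtain DFx? I tried working from predictions, but without success. Alternatively, what options other than mutate can be used together with mdply to reach the final result I want?

Edit: my current workaround is to save predictions as a csv, open it in Excel, split the column with text-to-columns so that only one column named variable remains, holding the variable names (i.e. V1, V2, V3), bring it back into R, and finally dcast it: dcast(pred, DATE + pLength ~ variable)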
If you only want to convert predictions into DFx, couldn't you do this?
DFx <- predictions
DFx <- cbind(DFx,
V1=ifelse(substr(DFx$X1,7,8)=="V1",DFx$pred,NA),
V2=ifelse(substr(DFx$X1,7,8)=="V2",DFx$pred,NA),
V3=ifelse(substr(DFx$X1,7,8)=="V3",DFx$pred,NA))
DFx <- DFx[,-6] # delete "pred" column
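DFx

This solution works. However, my real dataset has about 20 variables, and their names differ in length. At that point I think my clumsy Excel route would be faster.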
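For variable names of different lengths, one possible way around the fixed substr(DFx$X1,7,8) positions is to take whatever follows the last dot in X1 and then loop over the names that occur. The following is only a sketch under assumptions: the small predictions data.frame is a made-up stand-in with the column layout implied by the answer above (X1, DAY, MONTH, YEAR, pLength, pred), "LongName" is an invented variable name, and it is assumed that no variable name contains a dot.

```r
# Hypothetical stand-in for `predictions`; the real one comes from mdply().
# Assumed columns: X1 (model name such as "1.1.1.V1"), DAY, MONTH, YEAR,
# pLength, pred.
predictions <- data.frame(
  X1      = c("1.1.1.V1", "1.1.1.V2", "1.1.1.LongName", "2.2.1.V1"),
  DAY     = c(1, 1, 1, 2),
  MONTH   = c(1, 1, 1, 2),
  YEAR    = c(1, 1, 1, 1),
  pLength = c(0, 0.25, 1, 0),
  pred    = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Keep only the text after the last dot, whatever its length.
# Caveat: this breaks if a variable name itself contains a dot.
variable <- sub("^.*\\.", "", predictions$X1)

# One column per variable: pred where the row belongs to it, NA elsewhere.
DFx <- predictions[c("DAY", "MONTH", "YEAR", "pLength")]
for (v in unique(variable)) {
  DFx[[v]] <- ifelse(variable == v, predictions$pred, NA)
}
```

If reshape2 is installed, storing the extracted names as a variable column and calling dcast with DAY + MONTH + YEAR + pLength ~ variable and value.var = "pred" should give a similar layout without the loop, and without the Excel round trip.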
# Data and code from the question
library(plyr)  # provides dlply, mdply and mutate

df1 <- structure(list(DAY = c(1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L,
2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L), MONTH = c(1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), YEAR = c(1L,
1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L,
1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L), pLength = c(0L,
0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L,
1L), variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("V1", "V2", "V3"
), class = "factor"), value = c(1L, 2L, 2L, 3L, 1L, 1L, 2L, 3L,
3L, 2L, 2L, 2L, 3L, 1L, 1L, 1L, 3L, 3L)), .Names = c("DAY", "MONTH",
"YEAR", "pLength", "variable", "value"), row.names = c(NA, -18L
), class = "data.frame")
df2 <- structure(list(DAY = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), MONTH = c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), YEAR = c(1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L), pLength = c(0, 0.25, 1, 0, 0.5, 1, 0, 0.65,
1), X1 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), X2 = c(0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L), X3 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), X4 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("DAY",
"MONTH", "YEAR", "pLength", "X1", "X2", "X3", "X4"), class = "data.frame", row.names = c(NA,
-9L))
# choose columns from df2 that will be used to receive the predicted values
recvars <- c("DAY", "MONTH", "YEAR", "pLength")
rec <- df2[recvars]
recList <- dlply(rec, c("DAY", "MONTH", "YEAR", "pLength"))
# create list of models that predict the value by pLength
models <- dlply(df1, c("DAY", "MONTH", "YEAR", "variable"), function(df)
  lm(value ~ pLength, data = df))
# get predicted values
predictions <- mdply(cbind(mod = models, df = recList), function(mod, df) {
  mutate(df, pred = predict(mod, newdata = df))
})