R: reshaping data into a specific form
My data are shown below. In reality I ran several experiments; this is a simplified data set:
DF <- structure(list(theoric = c("E", "E", "F", "F", "F"),
                     observed = c("E", "E", "F", "F", "E"),
                     experiment = c("RO(2)", "RO(2)", "RO(2)", "RO(2)", "RO(2)")),
                .Names = c("theoric", "observed", "experiment"),
                row.names = 2:6, class = "data.frame")
Right now, my data look like this:
theoric observed experiment
2 E E RO(2)
3 E E RO(2)
4 F F RO(2)
5 F F RO(2)
6 F E RO(2)
I would like to reshape them as follows:
2 3 4 5 6
RO(2) theoric E E F F F
RO(2) observed E E F F E
What is the simplest way to do this? I really don't know how. I tried:
meltR <- melt(DF, id="experiment")
Edit: output with my full data set:
col2 col1.2 col1.3 col1.4 col1.5 col1.6 col1.24 col1.25 col1.26
1 RO theoric E E F F F <NA> <NA> <NA>
6 MO theoric <NA> <NA> <NA> <NA> <NA> E F F
12 EL theoric <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
16 RO observed E E F F E <NA> <NA> <NA>
21 MO observed <NA> <NA> <NA> <NA> <NA> F F F
27 EL observed <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
col1.27 col1.28 col1.29 col1.21 col1.22 col1.23 col1.13
1 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
6 F F F <NA> <NA> <NA> <NA>
12 <NA> <NA> <NA> E E E E
16 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
21 F F F <NA> <NA> <NA> <NA>
27 <NA> <NA> <NA> E E E F
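Based on the expected output, we may need to create a column that holds the row.names. Create a new data set ("df2") by unlisting the first two columns and replicating the "experiment" column and the row names. Then use reshape from base R to convert the "long" format to "wide":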
df2 <- data.frame(col1 = unlist(DF[1:2], use.names = FALSE),
                  col2 = paste(rep(DF$experiment, 2),
                               rep(colnames(DF)[1:2], each = nrow(DF))),
                  col3 = rep(row.names(DF), 2))
reshape(df2, idvar = "col2", direction = "wide", timevar = "col3")
# col2 col1.2 col1.3 col1.4 col1.5 col1.6
#1 RO(2) theoric E E F F F
#6 RO(2) observed E E F F E
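The wide result above carries the "col1." prefix that reshape adds to each value column, while the desired output in the question is headed by the bare row names (2 to 6). A minimal sketch of one way to tidy this up, assuming the simplified DF from the question; the object name wide and the choice to reset the row names are illustrative assumptions, not part of the original answer:

```r
# Simplified data set from the question
DF <- structure(list(theoric = c("E", "E", "F", "F", "F"),
                     observed = c("E", "E", "F", "F", "E"),
                     experiment = rep("RO(2)", 5)),
                row.names = 2:6, class = "data.frame")

# Long format: one row per (variable, original row name) pair
df2 <- data.frame(col1 = unlist(DF[1:2], use.names = FALSE),
                  col2 = paste(rep(DF$experiment, 2),
                               rep(colnames(DF)[1:2], each = nrow(DF))),
                  col3 = rep(row.names(DF), 2),
                  stringsAsFactors = FALSE)

# Long to wide, as in the answer; "wide" is a name chosen for this sketch
wide <- reshape(df2, idvar = "col2", direction = "wide", timevar = "col3")

# Strip the "col1." prefix so the columns are named after the original rows
names(wide) <- sub("^col1\\.", "", names(wide))
rownames(wide) <- NULL
print(wide)
```

The first column keeps the name col2; rename it if a different label reads better in your output.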
Update
With the new data set:
library(data.table)#v1.9.7+
dcast(melt(setDT(DF), id.var = "experiment"), paste(experiment,
variable)~rowid(experiment, variable), value.var="value", fill="")
# experiment 1 2 3 4 5 6
#1: EL observed E E E F
#2: EL theoric E E E E
#3: MO observed F F F F F F
#4: MO theoric E F F F F F
#5: RO observed E E F F E
#6: RO theoric E E F F F
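The key step in the dcast call is the counter that restarts within each experiment/variable pair, so columns are numbered 1, 2, 3, ... instead of being keyed on row names that differ between experiments. A rough base R analogue of that idea is sketched below. It uses a small hypothetical data set (two experiments of unequal length), and the names long, id, rid and wide are made up for illustration; it is not the data.table implementation.

```r
# Hypothetical data: two experiments with different numbers of rows
DF <- data.frame(theoric    = c("E", "E", "F", "E", "F"),
                 observed   = c("E", "F", "F", "E", "E"),
                 experiment = c("RO", "RO", "RO", "MO", "MO"),
                 stringsAsFactors = FALSE)

# Stack the two value columns into long format
long <- data.frame(experiment = rep(DF$experiment, 2),
                   variable   = rep(c("theoric", "observed"), each = nrow(DF)),
                   value      = c(DF$theoric, DF$observed),
                   stringsAsFactors = FALSE)

# Within-group counter: restarts at 1 for every experiment/variable pair
long$id  <- paste(long$experiment, long$variable)
long$rid <- ave(seq_along(long$id), long$id, FUN = seq_along)

# Long to wide, keyed on the counter rather than on the original row names
wide <- reshape(long[c("id", "rid", "value")], idvar = "id",
                timevar = "rid", direction = "wide")
wide[is.na(wide)] <- ""   # plays the role of fill = "" in the dcast() call
print(wide)
```

Because the counter restarts per group, shorter experiments are padded only at the end of the row rather than across columns belonging to other experiments.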
You can also do the following:
require(tidyverse)
DF %>%
gather(type, val, theoric, observed) %>%
unite(experiment, experiment, type, sep=" ") %>%
group_by(experiment) %>%
mutate(experiment_number = 1:n()) %>%
spread(experiment_number, val, fill="")
This gives you:
experiment `1` `2` `3` `4` `5` `6`
* <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 EL observed E E E F
2 EL theoric E E E E
3 MO observed F F F F F F
4 MO theoric E F F F F F
5 RO observed E E F F E
6 RO theoric E E F F F
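Thank you very much for this complete answer. It works for the DF I posted, but it does not work when I use my full data set. Why? Please see the edit. Thanks.
@ranell I tested the second solution, with dcast/melt. It does give output; the only problem is that not all combinations are present in the data, so those elements are filled with NA. For example, the "MO" experiment only has row numbers 24:28, and the other experiments have the same issue.
rowid seems to be a data.table function. It returns: Error in eval(expr, envir, enclos): could not find function "rowid"
@ranell It is in the devel version, i.e. 1.9.7. If you are using 1.9.6, then: dcast(melt(setDT(DF), id.var = "experiment")[, rid := 1:.N, .(experiment, variable)], paste(experiment, variable) ~ rid, value.var = "value", fill = "")
Thank you very much for your patience, it works now :)
If you look at the input and output for the new data set, it is clear that some numbers occur only in some experiments. So I am not sure what you expect in your edit; EL does not show up, so I don't think that output is correct.
You are right, sorry. Do you think it could work like that? Some values are still not right; can you check my update?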