R: Reshape data into a specific form


My data looks like the following. In reality I ran several experiments; this is a simplified dataset:

DF=structure(list(theoric = c("E", "E", "F", "F", "F"), observed = c("E", 
"E", "F", "F", "E"), experiment = c("RO(2)", "RO(2)", "RO(2)", "RO(2)", 
"RO(2)")), .Names = c("theoric", "observed", "experiment"), row.names = 2:6, class = "data.frame")
Right now my data has the following form:

  theoric observed  experiment
2       E        E RO(2)
3       E        E RO(2)
4       F        F RO(2)
5       F        F RO(2)
6       F        E RO(2)
I would like to reshape it as follows:

                  2 3 4 5 6
RO(2) theoric     E E F F F
RO(2) observed    E E F F E
What is the simplest way to do this? I really don't know how. I tried

library(reshape2)  # provides melt()
meltR <- melt(DF, id = "experiment")
Output:

    col2 col1.2 col1.3 col1.4 col1.5 col1.6 col1.24 col1.25 col1.26
1   RO theoric      E      E      F      F      F    <NA>    <NA>    <NA>
6   MO theoric   <NA>   <NA>   <NA>   <NA>   <NA>       E       F       F
12  EL theoric   <NA>   <NA>   <NA>   <NA>   <NA>    <NA>    <NA>    <NA>
16 RO observed      E      E      F      F      E    <NA>    <NA>    <NA>
21 MO observed   <NA>   <NA>   <NA>   <NA>   <NA>       F       F       F
27 EL observed   <NA>   <NA>   <NA>   <NA>   <NA>    <NA>    <NA>    <NA>
   col1.27 col1.28 col1.29 col1.21 col1.22 col1.23 col1.13
1     <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>
6        F       F       F    <NA>    <NA>    <NA>    <NA>
12    <NA>    <NA>    <NA>       E       E       E       E
16    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>
21       F       F       F    <NA>    <NA>    <NA>    <NA>
27    <NA>    <NA>    <NA>       E       E       E       F

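Based on the expected output, we may need a column that holds the row.names. Create a new dataset ("df2") by unlisting the first two columns and replicating the "experiment" column and the row names. Then use reshape from base R to convert the "long" format to "wide":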

# col1 = stacked values, col2 = row label, col3 = original row name
df2 <- data.frame(
  col1 = unlist(DF[1:2], use.names = FALSE),
  col2 = paste(rep(DF$experiment, 2), rep(colnames(DF)[1:2], each = nrow(DF))),
  col3 = rep(row.names(DF), 2)
)

reshape(df2, idvar = "col2", direction="wide", timevar = "col3")
#             col2 col1.2 col1.3 col1.4 col1.5 col1.6
#1  RO(2) theoric      E      E      F      F      F
#6 RO(2) observed      E      E      F      F      E
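
As an aside, when the data holds a single experiment, as in the posted DF, the desired layout is simply the transpose of the first two columns. The following is a minimal base R sketch under that assumption. It is not part of the original answer and it does not generalize to several experiments with different row numbers.

```r
# Rebuild the posted example data (row names 2..6, a single experiment)
DF <- data.frame(
  theoric    = c("E", "E", "F", "F", "F"),
  observed   = c("E", "E", "F", "F", "E"),
  experiment = rep("RO(2)", 5),
  row.names  = as.character(2:6),
  stringsAsFactors = FALSE
)

# Transpose the two value columns: the variables become rows and
# the original row names become column names
m <- t(DF[c("theoric", "observed")])

# Prefix each row label with the (single) experiment name
rownames(m) <- paste(DF$experiment[1], rownames(m))

print(m, quote = FALSE)
```

The result is a character matrix rather than a data frame, which is usually fine for display but may need as.data.frame() for further processing.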
Update: using the new dataset

library(data.table)#v1.9.7+
dcast(melt(setDT(DF), id.var = "experiment"), paste(experiment, 
    variable)~rowid(experiment, variable), value.var="value", fill="")
#    experiment 1 2 3 4 5 6
#1: EL observed E E E F    
#2:  EL theoric E E E E    
#3: MO observed F F F F F F
#4:  MO theoric E F F F F F
#5: RO observed E E F F E  
#6:  RO theoric E E F F F  
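
The key idea in the update is to number the observations within each experiment/variable group instead of using the original row names, so that every group starts at column 1. For readers without data.table, roughly the same idea can be sketched in base R with ave() and reshape(). The toy data frame DF2 and the column names id, value and rid below are hypothetical and chosen only for illustration.

```r
# Hypothetical toy data: two experiments of different sizes
DF2 <- data.frame(
  theoric    = c("E", "E", "F", "E", "F"),
  observed   = c("E", "F", "F", "E", "E"),
  experiment = c("RO", "RO", "RO", "MO", "MO"),
  stringsAsFactors = FALSE
)

# Stack the two value columns and build a label "<experiment> <variable>"
long <- data.frame(
  id    = paste(rep(DF2$experiment, 2),
                rep(c("theoric", "observed"), each = nrow(DF2))),
  value = c(DF2$theoric, DF2$observed),
  stringsAsFactors = FALSE
)

# Within-group counter: 1, 2, 3, ... restarting for every label
long$rid <- ave(seq_along(long$id), long$id, FUN = seq_along)

# Long to wide: one row per label, one column per counter value
wide <- reshape(long, idvar = "id", timevar = "rid", direction = "wide")

# Groups shorter than the longest one get NA; show them as empty strings
wide[is.na(wide)] <- ""
wide
```

Because the counter restarts in every group, shorter experiments are padded on the right instead of leaving gaps at row numbers that only occur in other experiments.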

You can also do the following:

library(tidyverse)
DF %>% 
  gather(type, val, theoric, observed) %>% 
  unite(experiment, experiment, type, sep=" ") %>% 
  group_by(experiment) %>% 
  mutate(experiment_number = 1:n()) %>% 
  spread(experiment_number, val, fill="")
This gives you:

   experiment   `1`   `2`   `3`   `4`   `5`   `6`
*       <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 EL observed     E     E     E     F            
2  EL theoric     E     E     E     E            
3 MO observed     F     F     F     F     F     F
4  MO theoric     E     F     F     F     F     F
5 RO observed     E     E     F     F     E      
6  RO theoric     E     E     F     F     F      
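
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A sketch of the same pipeline with the newer functions follows, assuming recent tidyr and dplyr are installed. It uses the small DF from the question, so the result has five value columns rather than six.

```r
library(dplyr)
library(tidyr)

# The simplified data from the question
DF <- data.frame(
  theoric    = c("E", "E", "F", "F", "F"),
  observed   = c("E", "E", "F", "F", "E"),
  experiment = rep("RO(2)", 5),
  stringsAsFactors = FALSE
)

wide <- DF %>%
  # stack theoric/observed into a (type, val) pair of columns
  pivot_longer(c(theoric, observed), names_to = "type", values_to = "val") %>%
  # build the row label "<experiment> <type>"
  unite(experiment, experiment, type, sep = " ") %>%
  # number the observations within each label
  group_by(experiment) %>%
  mutate(experiment_number = row_number()) %>%
  ungroup() %>%
  # one column per observation number, empty string where a group is shorter
  pivot_wider(names_from = experiment_number, values_from = val,
              values_fill = "")

wide
```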

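Thank you very much for this complete answer. It works for the DF I posted, but it does not work when I use my full dataset. Why? Please see the edit. Thanks.

@ranell I tested the second solution, the one using dcast/melt. It does produce output; the only problem is that not every combination exists in the data, so those cells are filled with NA. For example, the "MO" experiment only has row numbers 24:28, and the other experiments have the same issue.

rowid seems to be a data.table function. It gives me: Error in eval(expr, envir, enclos) : could not find function "rowid"

@ranell It is in the devel version, i.e. 1.9.7. If you are using 1.9.6, then:
dcast(melt(setDT(DF), id.var = "experiment")[, rid := 1:.N, .(experiment, variable)], paste(experiment, variable) ~ rid, value.var = "value", fill = "")

Thank you very much for your patience, it works fine now :)

If you look at the input and output for the new dataset, it is clear that some numbers occur only in some experiments, so I'm not sure what you expected in your edit.

EL was not shown, so I thought the output was incorrect. You are right, sorry. Do you think it is fine like that? Some values are still off; can you check my update?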