R: add and populate new columns based on matches between two existing columns in two data frames
I have two data frames (df1 and df2), examples of which are shown below:
df1 <- data.frame(StationID = c(1,1,1,2,2,3,3,3,3,3),
                  Cameras = c("Cam1","Cam2","Cam2","Cam1","Cam1","Cam2","Cam1","Cam2","Cam1","Cam1"),
                  Start = c("2013-04-23","2013-04-23","2013-04-23","2013-04-23","2013-04-23","2013-04-23","2013-04-23","2013-04-23","2013-04-23","2013-04-23"),
                  End = c("2013-04-25","2013-04-25","2013-04-25","2013-04-25","2013-04-25","2013-04-25","2013-04-25","2013-04-25","2013-04-25","2013-04-25"))
df2 <- data.frame(StationID = c(1,1,2,2,3,3),
                  Cameras = c("Cam1","Cam2","Cam1","Cam2","Cam1","Cam2"))
Maybe this helps:
library(dplyr)
library(tidyr)
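# Join, collapse each group's values into one string, then split into columns.
# Note: summarise_each()/funs() are deprecated in current dplyr.
full_join(df1, df2, by = c("StationID", "Cameras")) %>%
  group_by(StationID, Cameras) %>%
  summarise_each(funs(toString)) %>%
  separate(col = Start, into = paste("Start", 1:3, sep = ""), sep = ", ", extra = "merge") %>%
  separate(col = End, into = paste("End", 1:3, sep = ""), sep = ", ", extra = "merge")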
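For readers on a recent tidyr, the same idea can be sketched without the string round trip. This is not the answer's original code: it assumes tidyr 1.0.0 or later (for pivot_wider), and the small df1/df2 below are cut-down toy inputs in the shape of the question's data. The helper column rid is a name chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Toy inputs (assumed, shaped like the question's df1 and df2)
df1 <- data.frame(
  StationID = c(1, 1, 2),
  Cameras   = c("Cam1", "Cam1", "Cam1"),
  Start     = c("2013-04-23", "2013-04-23", "2013-04-23"),
  End       = c("2013-04-25", "2013-04-25", "2013-04-25")
)
df2 <- data.frame(StationID = c(1, 2, 2),
                  Cameras   = c("Cam1", "Cam1", "Cam2"))

wide <- df2 %>%
  # Keep every StationID/Cameras pair listed in df2, matched or not
  left_join(df1, by = c("StationID", "Cameras")) %>%
  group_by(StationID, Cameras) %>%
  # Running counter within each pair; it becomes the column suffix
  mutate(rid = row_number()) %>%
  ungroup() %>%
  # One Start_<rid> and End_<rid> column per counter value
  pivot_wider(names_from = rid, values_from = c(Start, End))

print(wide)
```

The unmatched pair (station 2, Cam2) is kept as a row of NA values because the join starts from df2.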
We join the two datasets on "StationID" and "Cameras" and use dcast from data.table, which can reshape multiple value.var columns into "wide" format.
library(data.table)#1.9.7+
dcast(setDT(df1)[df2, on = c("StationID", "Cameras")],
StationID + Cameras ~rowid(StationID, Cameras), value.var = c("Start", "End"))
#   StationID Cameras    Start_1    Start_2    Start_3      End_1      End_2      End_3
#1:         1    Cam1 2013-04-23         NA         NA 2013-04-25         NA         NA
#2:         1    Cam2 2013-04-23 2013-04-23         NA 2013-04-25 2013-04-25         NA
#3:         2    Cam1 2013-04-23 2013-04-23         NA 2013-04-25 2013-04-25         NA
#4:         2    Cam2         NA         NA         NA         NA         NA         NA
#5:         3    Cam1 2013-04-23 2013-04-23 2013-04-23 2013-04-25 2013-04-25 2013-04-25
#6:         3    Cam2 2013-04-23 2013-04-23         NA 2013-04-25 2013-04-25         NA
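The column suffixes in the output above come from rowid, which numbers the rows within each combination of its arguments. A minimal sketch of that behaviour on made-up vectors (station and camera are illustrative names, not from the question), with a base R equivalent for comparison:

```r
library(data.table)

# Toy keys (assumed): three distinct station/camera combinations
station <- c(1, 1, 2, 2, 2)
camera  <- c("Cam1", "Cam1", "Cam1", "Cam2", "Cam2")

# Running counter that restarts for every (station, camera) combination
rid <- rowid(station, camera)
print(rid)
# 1 2 1 1 2

# Base R equivalent, usable when rowid is not available
base_rid <- ave(seq_along(station), station, camera, FUN = seq_along)
print(base_rid)
```

In the dcast call, this counter is what appears on the right-hand side of the formula, so the first record of each pair lands in Start_1/End_1, the second in Start_2/End_2, and so on.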
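Note: rowid is from data.table 1.9.7, which is the development version. If we have version 1.9.6 or later but rowid is not available, create the row index manually with
dN <- setDT(df1)[df2, on = c("StationID", "Cameras")
                 ][, rid := 1:.N, .(StationID, Cameras)]
dcast(dN, StationID + Cameras ~ rid, value.var = c("Start", "End"))
Thanks for the suggestion. This seems to be exactly what I was looking for. However, when I run the code I get the following error: Error in eval(expr, envir, enclos): could not find function "rowid". What am I doing wrong?
I managed to get it working. I had the CRAN version of data.table rather than the development version (v1.9.7), which contains the rowid function. Anyone who needs rowid has to install that version. Thanks again for all the help!
Thanks for the suggestion. This seems to work to some extent; Station 2, Camera 2 is missing. I assume this is because it is all NA, but is there a way to keep all the records in df3, including those that consist entirely of NA?
Thanks @Ross: added. Sorry, I hadn't noticed that.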
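The manual row-index route can be tried end to end on a small input. The sketch below is an illustration under assumptions, not the answer's exact code: dt1 and dt2 are cut-down toy tables in the shape of df1 and df2, and seq_len(.N) is used in place of 1:.N, a common defensive habit in R.

```r
library(data.table)

# Toy inputs (assumed, shaped like the question's df1 and df2)
dt1 <- data.table(
  StationID = c(1, 1, 2),
  Cameras   = c("Cam1", "Cam1", "Cam1"),
  Start     = c("2013-04-23", "2013-04-23", "2013-04-23"),
  End       = c("2013-04-25", "2013-04-25", "2013-04-25")
)
dt2 <- data.table(StationID = c(1, 2, 2),
                  Cameras   = c("Cam1", "Cam1", "Cam2"))

# For each row of dt2, pull the matching rows of dt1 (NA when none match)
joined <- dt1[dt2, on = c("StationID", "Cameras")]

# Manual replacement for rowid(): a counter within each pair
joined[, rid := seq_len(.N), by = .(StationID, Cameras)]

# Spread Start and End into Start_<rid> / End_<rid> columns
wide <- dcast(joined, StationID + Cameras ~ rid,
              value.var = c("Start", "End"))
print(wide)
```

Because the join is driven by dt2, the pair with no records (station 2, Cam2) still appears in the result, filled with NA.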