R: keep rows up to a specific timestamp, even if the last timestamp does not exist
I have a data frame that provides specific timestamps:
dframe1 <- structure(list(id = c(1L, 1L, 1L, 2L, 2L), name = c("Google",
"Yahoo", "Amazon", "Amazon", "Google"), date = c("2008-11-01",
"2008-11-01", "2008-11-04", "2008-11-01", "2008-11-02")), class = "data.frame", row.names = c(NA,
-5L))
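With this code, a result is kept only when it finds the timestamps exactly two days before and two days after. How can I change it so that it keeps everything up to two days before and after, even if the timestamps exactly two days before and after do not exist but all the days before them do?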
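Next time, please point to your code and include the relevant bits. For example, you need to convert the date columns first (as mentioned earlier). Then, using the function provided by @IaroslavDomin, you need to change the filter. What I do here is slightly different from his: I use dframe2 directly.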
library(dplyr)

X = left_join(dframe1, dframe2, by = "id") %>%
  mutate(date_diff = as.numeric(date.y - date.x)) %>%
  # change the filter here: abs(date_diff) > 0 drops rows on the same day
  # abs(date_diff) < 2 keeps differences strictly below 2 days;
  # use <= 2 to also keep rows exactly 2 days away
  filter(abs(date_diff) > 0 & abs(date_diff) < 2) %>%
  mutate(label = ifelse(date_diff < 0, "before", "after")) %>%
  select(id, name, label, text_sth)
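The subtraction date.y - date.x above only works once both date columns are Date objects, which is why the conversion has to come first. A minimal sketch of that point in base R, using two date strings from dframe1 (the variable names date_chr, date_val and chr_fails are made up for illustration):

```r
# Character dates, as stored in dframe1$date before conversion
date_chr <- c("2008-11-01", "2008-11-04")

# Subtracting character strings raises an error, so we only record that it fails
chr_fails <- inherits(try(date_chr[2] - date_chr[1], silent = TRUE), "try-error")

# After as.Date(), subtraction yields a difftime measured in days
date_val <- as.Date(date_chr)
date_diff <- as.numeric(date_val[2] - date_val[1])

print(chr_fails)  # TRUE
print(date_diff)  # 3
```

Wrapping the difference in as.numeric() gives a plain number of days, which is what the filter on abs(date_diff) compares against.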
If the final table is what we are after:
# A tibble: 10 x 4
# Groups:   id, name [5]
      id name   label  test
   <int> <chr>  <chr>  <chr>
 1     1 Amazon after  text here
 2     1 Amazon before other
 3     1 Google after  another one test text_sth another text
 4     1 Google before another text other
 5     1 Yahoo  after  another one test text_sth another text
 6     1 Yahoo  before another text other
 7     2 Amazon after  text here another text
 8     2 Amazon before etc
 9     2 Google after  text here
10     2 Google before test text_sth
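Thanks. I checked it, but the result also keeps values from the input data frame.
What do you mean? I edited the post to show the result. Which part differs from what you expect?
Actually, with this option it again only goes up to the specific date; it does not take into account, for example, the case where I have fewer than 4 in my data.
OK, so there will be data within the two days, and you need everything before and after? Your question is not very clear.
Your desired output for 2 days: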
left_join(dframe1, df2, by = "id") %>%
mutate(date_diff = as.numeric(date.y - date.x)) %>%
filter(abs(date_diff) == 2) %>%
mutate(label = ifelse(date_diff == -2, "before", "after")) %>%
select(id, name, label, text_sth)
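The snippet above keeps only rows that are exactly two days away, so an id with no timestamp at exactly that distance drops out. Here is a minimal sketch of the difference between that exact match and a window of up to two days. The events and texts data frames are hypothetical stand-ins invented for illustration, not the question's dframe1 and dframe2, and the <= 2 bound is an assumption about what "up to two days" should mean:

```r
library(dplyr)

# Hypothetical events: one row per id
events <- data.frame(
  id = c(1L, 2L),
  date = as.Date(c("2008-11-03", "2008-11-03"))
)

# Hypothetical timestamped texts; id 2 has no row exactly 2 days away
texts <- data.frame(
  id = c(1L, 1L, 1L, 2L, 2L),
  date = as.Date(c("2008-11-01", "2008-11-04", "2008-11-05",
                   "2008-11-02", "2008-11-04")),
  text_sth = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

# Both inputs have a date column, so the join names them date.x and date.y
joined <- left_join(events, texts, by = "id") %>%
  mutate(date_diff = as.numeric(date.y - date.x))

# Exact match: only rows exactly 2 days before or after the event
exact <- joined %>% filter(abs(date_diff) == 2)

# Window: every row on a different day that is at most 2 days away
window <- joined %>%
  filter(abs(date_diff) > 0 & abs(date_diff) <= 2) %>%
  mutate(label = ifelse(date_diff < 0, "before", "after"))

print(exact)
print(window)
```

With this made-up data, exact contains rows for id 1 only, while window keeps all five rows, including the two for id 2.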
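library(dplyr)

# convert the character dates to Date so they can be subtracted
dframe1$date = as.Date(dframe1$date)
dframe2$date = as.Date(dframe2$date)

X = left_join(dframe1, dframe2, by = "id") %>%
  mutate(date_diff = as.numeric(date.y - date.x)) %>%
  # change the filter here: abs(date_diff) > 0 drops rows on the same day
  # abs(date_diff) < 2 keeps differences strictly below 2 days;
  # use <= 2 to also keep rows exactly 2 days away
  filter(abs(date_diff) > 0 & abs(date_diff) < 2) %>%
  mutate(label = ifelse(date_diff < 0, "before", "after")) %>%
  select(id, name, label, text_sth)

# collapse the distinct texts of each id/name/label group into one string
X = X %>% group_by(id, name, label) %>%
  summarize(test = paste(unique(text_sth), collapse = " "))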
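The collapsing step (group_by followed by summarize with paste) is what turns several matching rows into one line of text per id, name and label. A minimal sketch of that step on its own, using a small hypothetical data frame called rows that is invented for illustration and is not the question's data:

```r
library(dplyr)

# Hypothetical labelled rows; the first group repeats one of its texts
rows <- data.frame(
  id = c(1L, 1L, 1L, 1L, 2L),
  name = c("Google", "Google", "Google", "Google", "Amazon"),
  label = c("after", "after", "after", "before", "after"),
  text_sth = c("text here", "another text", "text here", "other", "etc"),
  stringsAsFactors = FALSE
)

# unique() drops repeated texts within a group,
# paste(collapse = " ") joins the remaining ones into a single string
collapsed <- rows %>%
  group_by(id, name, label) %>%
  summarize(test = paste(unique(text_sth), collapse = " ")) %>%
  ungroup()

print(collapsed)
```

Here the three "after" rows for Google collapse to the single string "text here another text", because the repeated "text here" is removed before pasting.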