R: keep rows up to a specific timestamp, even if the last timestamp does not exist


I have a data frame that provides specific timestamps:

dframe1 <- structure(list(id = c(1L, 1L, 1L, 2L, 2L), name = c("Google", 
"Yahoo", "Amazon", "Amazon", "Google"), date = c("2008-11-01", 
"2008-11-01", "2008-11-04", "2008-11-01", "2008-11-02")), class = "data.frame", row.names = c(NA, 
-5L))
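With my current code, which filters on abs(date_diff) == 2, a result is kept only when a timestamp exactly two days before or after is found. How can I change it so that it keeps everything up to two days before and after, even when the timestamp exactly two days before or after does not exist but the days in between do?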


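Next time, please point to your own code and include the relevant bits. For example, you first need to convert the date columns with as.Date (as mentioned earlier).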
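As a minimal sketch of why the conversion matters (the vector `raw` is made up for illustration and is not part of the original post): subtracting the dates while they are still character strings fails, whereas `Date` values subtract to a difference in days.

raw <- c("2008-11-01", "2008-11-04")   # hypothetical sample values

# Character strings cannot be subtracted; capture the error instead of stopping
res <- tryCatch(raw[2] - raw[1], error = function(e) e)
is_error <- inherits(res, "error")

# After conversion, the difference is a difftime measured in days
dates <- as.Date(raw)
gap <- as.numeric(dates[2] - dates[1])
print(gap)
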

So, starting from the code provided by @IaroslavDomin, you need to change the filter. What I do here is slightly different from his: I use dframe2 directly.

library(dplyr)

X <- left_join(dframe1, dframe2, by = "id") %>% 
  mutate(date_diff = as.numeric(date.y - date.x)) %>%
  # change the filter here: > 0 drops rows that fall on the same date,
  # < 2 keeps differences of less than 2 days (use <= 2 to also keep exactly 2 days)
  filter(abs(date_diff) > 0 & abs(date_diff) < 2) %>% 
  mutate(label = ifelse(date_diff < 0, "before", "after")) %>% 
  select(id, name, label, text_sth)
If we take that last table one step further and collapse the text per id, name and label:

# A tibble: 10 x 4
# Groups:   id, name [5]
      id name   label  test                                  
   <int> <chr>  <chr>  <chr>                                 
 1     1 Amazon after  text here                             
 2     1 Amazon before other                                 
 3     1 Google after  another one test text_sth another text
 4     1 Google before another text other                    
 5     1 Yahoo  after  another one test text_sth another text
 6     1 Yahoo  before another text other                    
 7     2 Amazon after  text here another text                
 8     2 Amazon before etc                                   
 9     2 Google after  text here                             
10     2 Google before test text_sth  
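To make the effect of the filter concrete, here is a small sketch that compares three conditions on the same join. The original dframe2 is not shown in the post, so the `dframe2` below is a made-up stand-in with the columns the code expects (`id`, `date`, `text_sth`); the names `exact`, `strict` and `window` are likewise only for illustration.

library(dplyr)

dframe1 <- data.frame(
  id   = c(1L, 1L, 1L, 2L, 2L),
  name = c("Google", "Yahoo", "Amazon", "Amazon", "Google"),
  date = as.Date(c("2008-11-01", "2008-11-01", "2008-11-04",
                   "2008-11-01", "2008-11-02"))
)

# Hypothetical stand-in for dframe2 (assumption: not the data from the post)
dframe2 <- data.frame(
  id       = c(1L, 1L, 1L, 2L),
  date     = as.Date(c("2008-10-31", "2008-11-02", "2008-11-03", "2008-11-04")),
  text_sth = c("a", "b", "c", "d")
)

# Every dframe1 row is paired with every dframe2 row of the same id;
# recent dplyr versions may warn about the many-to-many match, which is intended here
joined <- left_join(dframe1, dframe2, by = "id") %>%
  mutate(date_diff = as.numeric(date.y - date.x))

exact  <- filter(joined, abs(date_diff) == 2)                        # only exactly 2 days away
strict <- filter(joined, abs(date_diff) > 0 & abs(date_diff) < 2)    # strictly less than 2 days
window <- filter(joined, abs(date_diff) > 0 & abs(date_diff) <= 2)   # anything up to 2 days

print(c(exact = nrow(exact), strict = nrow(strict), window = nrow(window)))

With whole-day dates, the strict version keeps only rows one day away, so if the goal is "everything up to two days before and after", the inclusive `<= 2` condition is probably the one to use.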

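Thanks. I checked it, but the result also keeps the values from the input data frame.
What do you mean? I edited the post to show the result. Which part differs from what you expect?
Actually, if you use this, it again only looks at the specific date before; it does not take into account the case where, for example, I have data for less than 4.
OK, so there will be data within the two days, and you need everything before and after? Your question is not very clear. Your desired output is for 2 days.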
# Code from the question: keeps only rows exactly 2 days before or after
left_join(dframe1, df2, by = "id") %>% 
  mutate(date_diff = as.numeric(date.y - date.x)) %>%
  filter(abs(date_diff) == 2) %>% 
  mutate(label = ifelse(date_diff == -2, "before", "after")) %>% 
  select(id, name, label, text_sth)
# Full code from the answer: convert the dates, join, filter, then collapse the text
library(dplyr)

dframe1$date <- as.Date(dframe1$date)
dframe2$date <- as.Date(dframe2$date)

X <- left_join(dframe1, dframe2, by = "id") %>% 
  mutate(date_diff = as.numeric(date.y - date.x)) %>%
  # > 0 drops rows that fall on the same date,
  # < 2 keeps differences of less than 2 days (use <= 2 to also keep exactly 2 days)
  filter(abs(date_diff) > 0 & abs(date_diff) < 2) %>% 
  mutate(label = ifelse(date_diff < 0, "before", "after")) %>% 
  select(id, name, label, text_sth)

# one row per id/name/label, with the distinct texts pasted together
X <- X %>% 
  group_by(id, name, label) %>%
  summarize(test = paste(unique(text_sth), collapse = " "))
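The collapsing step (group_by followed by summarize with paste(unique(...), collapse = " ")) can be tried in isolation. The sketch below uses a made-up `X` shaped like the table after select(); the values are hypothetical, and `.groups = "drop"` is an addition that assumes dplyr 1.0.0 or later.

library(dplyr)

# Hypothetical filtered rows (assumption: not the data from the post)
X <- data.frame(
  id       = c(1L, 1L, 1L, 1L),
  name     = c("Google", "Google", "Google", "Google"),
  label    = c("after", "after", "after", "before"),
  text_sth = c("b", "c", "b", "a")
)

collapsed <- X %>%
  group_by(id, name, label) %>%
  # unique() removes the repeated "b" before the texts are pasted together
  summarize(test = paste(unique(text_sth), collapse = " "), .groups = "drop")

print(collapsed)

Each id/name/label combination ends up as a single row, which is why the result table in the answer has one "before" and one "after" line per name.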