Extract all rows before a specific comment for each ID in R

I am trying to extract all rows that come before a specific comment, for each ID, into a new data frame.

EID <- c(1,1,1,1,1,1,2,2,2,3,3,3,3,3)
comments <- c("apple", "grape", "banana", "rabbit ", "pine", "mango", "banana", "rabbit ", "pine", "apple", "grape", "banana", "rabbit ", "pine")
df <- data.frame(EID, comments)

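Here is a dplyr solution that grabs every row before rabbit within each group. It filters using grepl and cumsum. Also note that I use grepl with fixed = TRUE rather than ==, because in your example rabbit has an extra trailing space.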

library(dplyr)

df %>%
  group_by(EID) %>%
  filter(cumsum(grepl("rabbit", comments, fixed = TRUE)) == 0)
# A tibble: 7 x 2
# Groups:   EID [3]
    EID comments
  <dbl> <chr>   
1     1 apple   
2     1 grape   
3     1 banana  
4     2 banana  
5     3 apple   
6     3 grape   
7     3 banana
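
To make the filter condition easier to follow, here is a minimal base R sketch of what grepl and cumsum produce for a single group (EID 1 from the question). The variable names is_rabbit, running and keep are only illustrative and do not appear in the answer.

```r
# Comments for EID 1 from the question; note the trailing space in "rabbit "
x <- c("apple", "grape", "banana", "rabbit ", "pine", "mango")

# Substring match, so the trailing space does not matter
is_rabbit <- grepl("rabbit", x, fixed = TRUE)

# Running count of matches: 0 until the first rabbit, 1 or more from then on
running <- cumsum(is_rabbit)

# Rows before the first rabbit are exactly those where the count is still 0
keep <- running == 0
print(x[keep])

# An exact comparison misses the padded value entirely
print(x == "rabbit")
```

Inside group_by(EID), dplyr evaluates the same expression once per group, which is why the count restarts at 0 for every ID.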

You can also try creating a flag for rabbit, filling it, and then filtering. Here is the code using tidyverse functions:

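library(tidyverse)
# Flag the rabbit row, fill the flag upward within each group,
# then keep the flagged rows that are not the rabbit row itself
df %>% group_by(EID) %>% 
  mutate(Flag = ifelse(comments == 'rabbit ', 1, NA)) %>%
  fill(Flag, .direction = 'up') %>%
  filter(Flag == 1 & comments != 'rabbit ') %>% dplyr::select(-c(Flag))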

Output:

# A tibble: 7 x 2
# Groups:   EID [3]
    EID comments
  <dbl> <chr>   
1     1 apple   
2     1 grape   
3     1 banana  
4     2 banana  
5     3 apple   
6     3 grape   
7     3 banana
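
Both answers have to work around the trailing space in "rabbit ". As a rough sketch, assuming it is acceptable to normalize the values first, base R's trimws() allows an exact comparison, and ave() can apply the running-count idea per EID without loading any packages. The names clean, seen_rabbit and before_rabbit are illustrative only.

```r
EID <- c(1,1,1,1,1,1,2,2,2,3,3,3,3,3)
comments <- c("apple", "grape", "banana", "rabbit ", "pine", "mango",
              "banana", "rabbit ", "pine",
              "apple", "grape", "banana", "rabbit ", "pine")
df <- data.frame(EID, comments)

# Strip leading and trailing whitespace so "rabbit " compares equal to "rabbit"
clean <- trimws(df$comments)

# Running count of rabbit rows, restarted for each EID
seen_rabbit <- ave(as.integer(clean == "rabbit"), df$EID, FUN = cumsum)

# Keep only the rows that come before the first rabbit in their group
before_rabbit <- df[seen_rabbit == 0, ]
print(before_rabbit)
```

Unlike the flag-and-fill version, this sketch keeps every row of a group that contains no rabbit at all, because the running count never leaves 0 for that group.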

I am getting the following error, can you tell me what is wrong? Error: Problem with filter() input ..1. x Input ..1 must be of size 2 or 1, not size 14.

@priya There may be a package conflict; try dplyr::filter.

Error: Problem with filter() input ..1. x Input ..1 must be of size 2 or 1, not size 14. i Input ..1 is ==....

@priya Could you share the structure of your data using str(yourdataframe)? I suspect this is related to the data. Or check the order in which plyr and dplyr were loaded.

@priya It looks like comments has different values! Try unique(yourdf$comments) and look at the values!

My guess: in this case it expects an index of size 2 (the size of the group that throws the error) or 1 (which it can recycle), but it got an index of length 14. That is strange. Agree with @Duck, seeing str could help.
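
The checks suggested in the comments can be sketched as follows, run here against the example data from the question because the asker's real data was not shared. The name padded is illustrative, and the last step (looking for whitespace-padded values) is an extra assumption about what might differ in the real data, not something the commenters confirmed.

```r
EID <- c(1,1,1,1,1,1,2,2,2,3,3,3,3,3)
comments <- c("apple", "grape", "banana", "rabbit ", "pine", "mango",
              "banana", "rabbit ", "pine",
              "apple", "grape", "banana", "rabbit ", "pine")
df <- data.frame(EID, comments)

# Column types and the number of rows
str(df)

# Distinct values actually stored in the column
print(unique(df$comments))

# Values that carry leading or trailing whitespace
vals <- as.character(df$comments)
padded <- unique(vals[vals != trimws(vals)])
print(padded)
```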