R: split a string but keep certain substrings intact (r, dplyr, tidyr, strsplit)

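I want to split a character column in a data frame on certain delimiter characters (namely spaces, commas, and semicolons). However, I want to exclude certain phrases from the split (in my example, I want to exclude "my test").

I managed to do the ordinary string split, but I don't know how to exclude certain phrases.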

library(tidyverse)

test <- data.frame(string = c("this is a,test;but I want to exclude my test",
                              "this is another;of my tests",
                              "this is my 3rd test"),
                   stringsAsFactors = FALSE)

test %>%
  mutate(new_string = str_split(string, pattern = " |,|;")) %>%
  unnest_wider(new_string)

This gives:

# A tibble: 3 x 12
  string                                       ...1  ...2  ...3    ...4  ...5  ...6  ...7  ...8  ...9    ...10 ...11
  <chr>                                        <chr> <chr> <chr>   <chr> <chr> <chr> <chr> <chr> <chr>   <chr> <chr>
1 this is a,test;but I want to exclude my test this  is    a       test  but   I     want  to    exclude my    test 
2 this is another;of my tests                  this  is    another of    my    tests NA    NA    NA      NA    NA   
3 this is my 3rd test                          this  is    my      3rd   test  NA    NA    NA    NA      NA    NA

However, my desired output is this, with "my test" excluded from the split:

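# A tibble: 3 x 11
  string                                       ...1  ...2  ...3    ...4  ...5     ...6  ...7  ...8  ...9    ...10
  <chr>                                        <chr> <chr> <chr>   <chr> <chr>    <chr> <chr> <chr> <chr>   <chr>
1 this is a,test;but I want to exclude my test this  is    a       test  but      I     want  to    exclude my test
2 this is another;of my tests                  this  is    another of    my tests NA    NA    NA    NA      NA
3 this is my 3rd test                          this  is    my      3rd   test     NA    NA    NA    NA      NA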

Any ideas? (Side question: do you know how I can name the columns in unnest_wider?)

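A simple way is to replace the space inside the phrase with an underscore (_) before splitting, then turn it back into a space afterwards: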

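test %>%
  # Protect the phrase by joining its words with an underscore, then split.
  mutate(string = gsub("my test", "my_test", string),
    new_string = str_split(string, pattern = "[ ,;]")) %>%
  unnest_wider(new_string) %>%
  # Restore the space in every column, including the original string.
  mutate_all(~ gsub("my_test", "my test", .x))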
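As an alternative to the placeholder round trip, the tokens can be extracted directly instead of split. The sketch below is not from the answer: `split_keep_phrase` is a hypothetical helper written in base R, and it assumes the protected phrase contains no regex metacharacters and that the delimiters are safe to place inside a character class. It matches either the phrase plus any trailing non-delimiter characters (so "my tests" stays together, as it does with the underscore approach) or a plain run of non-delimiter characters.

```r
# Hypothetical helper (an illustration, not part of the original answer).
# Returns a list with one character vector of tokens per input string.
split_keep_phrase <- function(x, phrase = "my test", delims = " ,;") {
  # First alternative: the protected phrase plus any characters up to the next delimiter.
  # Second alternative: any run of non-delimiter characters.
  # Assumes `phrase` has no regex metacharacters.
  pattern <- paste0(phrase, "[^", delims, "]*|[^", delims, "]+")
  regmatches(x, gregexpr(pattern, x))
}

strings <- c("this is a,test;but I want to exclude my test",
             "this is another;of my tests",
             "this is my 3rd test")

tokens <- split_keep_phrase(strings)
print(tokens)
```

Because this extracts matches rather than splitting, consecutive delimiters do not produce empty tokens, which differs slightly from str_split. The resulting list could be used as the list column inside mutate() in place of the str_split() call.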

To give the columns more meaningful names, you can use the additional options of pivot_wider.

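One way the pivot_wider route might look is sketched below. This is an illustration under assumptions rather than the answer's own code: the helper columns `row_id`, `token`, and `position` and the `word_` prefix are names chosen here. The idea is to unnest the tokens into long form, number them within each original row, and let pivot_wider build the column names through `names_prefix`.

```r
library(dplyr)
library(tidyr)
library(stringr)

test <- data.frame(string = c("this is a,test;but I want to exclude my test",
                              "this is another;of my tests",
                              "this is my 3rd test"),
                   stringsAsFactors = FALSE)

result <- test %>%
  # row_id keeps rows apart even if two input strings are identical.
  mutate(row_id = row_number(),
         token = str_split(gsub("my test", "my_test", string), pattern = "[ ,;]")) %>%
  unnest(token) %>%                                   # one row per token
  mutate(token = gsub("my_test", "my test", token)) %>%  # restore the space
  group_by(row_id) %>%
  mutate(position = row_number()) %>%                 # token index within each string
  ungroup() %>%
  pivot_wider(names_from = position, values_from = token,
              names_prefix = "word_") %>%
  select(-row_id)

print(result)
```

This should yield the columns string and word_1 through word_10, with NA where a string has fewer tokens. If staying with unnest_wider is preferred, its `names_sep` argument (for example `names_sep = "_"`) also produces prefixed column names.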