R: strsplit on two criteria, keeping the delimiter


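I'm trying to split strings on different criteria. I want to split some of them at "traction" and others at "ramasse". I looked up the regular expression syntax used by grepl but didn't really understand it.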

The data frame named export has a column ref whose string values end in either "traction" or "ramasse".

I want to split each string value in the ref column in two:

                ref           R&T
[1] "62133130_074_"    "traction"
[2] "62156438_074_"     "ramasse"
[3]  "62153874_070_"    "ramasse"
[4] "62138861_074_"    "traction"
What I tried (none of these worked):

strsplit(export$ref, c("traction", "ramasse"))

strsplit(export$ref, "\\\\"

Here is a different approach:

strsplit(ref, "_(?=[^_]+$)", perl = TRUE)

[[1]]
[1] "62133130_074" "traction"    

[[2]]
[1] "62156438_074" "ramasse"     

[[3]]
[1] "62153874_070" "ramasse"     

[[4]]
[1] "62138861_074" "traction"

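This splits each element of the column/vector at an underscore ("_") that is followed by one or more non-underscore characters running to the end of the string. In other words, it splits at the last underscore only. The lookahead (?=...) checks what follows without consuming it, so only the underscore itself is removed.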
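Note that the question's expected output keeps the trailing underscore on the first piece ("62133130_074_"), while the split above drops it. A minimal sketch of one way to keep it, using sub() instead of strsplit() (a swapped-in technique, not the approach above). The column names prefix and type are hypothetical, chosen only for illustration:

```r
# Sample data from the question, wrapped in a data frame called export.
ref <- c(
    "62133130_074_traction",
    "62156438_074_ramasse",
    "62153874_070_ramasse",
    "62138861_074_traction")
export <- data.frame(ref = ref, stringsAsFactors = FALSE)

# Drop the trailing run of non-underscore characters:
# what remains ends with the last underscore, so the delimiter is kept.
export$prefix <- sub("[^_]+$", "", export$ref)   # hypothetical column name

# Drop everything up to and including the last underscore:
# what remains is "traction" or "ramasse".
export$type <- sub("^.*_", "", export$ref)       # hypothetical column name

print(export)
```

Because both calls are vectorised over export$ref, the results can be assigned straight to new columns without unpacking a list.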

Here is another option using stringr::str_split:

library(stringr)
str_split(ref, pattern = "_(?=[A-Za-z]+)", simplify = TRUE)
#    [,1]           [,2]
#[1,] "62133130_074" "traction"
#[2,] "62156438_074" "ramasse"
#[3,] "62153874_070" "ramasse"
#[4,] "62138861_074" "traction"
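The two patterns give the same result on the sample data, but they are not equivalent in general. The sketch below compares them on a hypothetical value (not from the question's data) whose middle segment starts with a letter; it assumes the stringr package is installed:

```r
library(stringr)

# Hypothetical value, for illustration only: the middle segment begins with a letter.
x <- "62133130_A74_traction"

# Splits at every underscore that is followed by a letter, giving three pieces.
by_letter <- str_split(x, "_(?=[A-Za-z])")[[1]]

# Splits only at the last underscore, giving two pieces.
by_last <- str_split(x, "_(?=[^_]+$)")[[1]]

print(by_letter)
print(by_last)
```

If the identifiers before the final suffix could ever contain letters, the "last underscore" pattern is likely the safer choice.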

Sample data
ref <- c(
    "62133130_074_traction",
    "62156438_074_ramasse",
    "62153874_070_ramasse",
    "62138861_074_traction")