R: strsplit based on two conditions, keeping the delimiter
I am trying to split strings based on different criteria: some should be split at "traction" and others at "ramasse". I looked up the grep syntax rules but did not really understand them.
A data frame named export has a column ref whose string values end with "traction" or "ramasse". I want to split each string value in the ref column in two:
ref R&T
[1] "62133130_074_" "traction"
[2] "62156438_074_" "ramasse"
[3] "62153874_070_" "ramasse"
[4] "62138861_074_" "traction"
What I have tried (none of these worked):
strsplit(export$ref, c("traction", "ramasse"))
strsplit(export$ref, "\\\\"(
Here is a different approach:
strsplit(ref, "_(?=[^_]+$)", perl = TRUE)
[[1]]
[1] "62133130_074" "traction"
[[2]]
[1] "62156438_074" "ramasse"
[[3]]
[1] "62153874_070" "ramasse"
[[4]]
[1] "62138861_074" "traction"
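This splits each element of the column/vector at an underscore ("_") that is followed, up to the end of the string, only by characters that are not underscores. In other words, it splits at the last underscore. Here is another option using stringr::str_split: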
library(stringr)
str_split(ref, pattern = "_(?=[A-Za-z]+)", simplify = TRUE)
# [,1] [,2]
#[1,] "62133130_074" "traction"
#[2,] "62156438_074" "ramasse"
#[3,] "62153874_070" "ramasse"
#[4,] "62138861_074" "traction"
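Both options above drop the underscore they split on, whereas the desired output in the question keeps it at the end of the first piece. A minimal base R sketch of one way to keep it, assuming every value ends in either "traction" or "ramasse" (the names suffix_pattern, prefix and suffix are illustrative, not from the original post):

```r
# Sample vector from the question
ref <- c("62133130_074_traction", "62156438_074_ramasse",
         "62153874_070_ramasse", "62138861_074_traction")

# One pattern covering both conditions, anchored at the end of the string
suffix_pattern <- "(traction|ramasse)$"

# First piece: remove the suffix, which leaves the trailing underscore intact
prefix <- sub(suffix_pattern, "", ref)

# Second piece: extract the matched suffix itself
# (regmatches() silently drops elements without a match, hence the assumption above)
suffix <- regmatches(ref, regexpr(suffix_pattern, ref))

cbind(prefix, suffix)
```

This avoids strsplit altogether, so the delimiter question does not arise: the alternation handles the two conditions in a single pattern.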
Sample data
ref <- c(
"62133130_074_traction",
"62156438_074_ramasse",
"62153874_070_ramasse",
"62138861_074_traction")
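Since the question starts from a data frame rather than a bare vector, the list returned by strsplit can be bound into columns. A rough sketch, assuming a data frame export built from the sample data; the new column names id and type are hypothetical:

```r
ref <- c("62133130_074_traction", "62156438_074_ramasse",
         "62153874_070_ramasse", "62138861_074_traction")

# Stand-in for the data frame described in the question
export <- data.frame(ref = ref, stringsAsFactors = FALSE)

# Split each value at its last underscore (the underscore itself is dropped)
parts <- strsplit(export$ref, "_(?=[^_]+$)", perl = TRUE)

# Every element has exactly two pieces here, so rbind yields a 4 x 2 character matrix
parts_matrix <- do.call(rbind, parts)

# Hypothetical column names for the two pieces
export$id <- parts_matrix[, 1]
export$type <- parts_matrix[, 2]
export
```

If some values contained no underscore, the list elements would differ in length and do.call(rbind, ...) would recycle values with a warning, so it is worth checking lengths(parts) first on real data.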