R: How to split values only after the second space
I have a column with different names:
X <- c("Ashley, Tremond WILLIAMS, Carla", "Claire, Daron", "Luw, Douglas CANSLER, Stephan")
But strsplit splits at every space, so I get:
strsplit(X, "\\,\\s|\\,|\\s")
[[1]]
[1] "Ashley" "Tremond" "WILLIAMS" "Carla"
[[2]]
[1] "Claire" "Daron"
[[3]]
[1] "Luw" "Douglas" "CANSLER" "Stephan"
How can I split only after the second space, so that I get:
[[1]]
[1] "Ashley, Tremond" "WILLIAMS, Carla"
[[2]]
[1] "Claire, Daron"
[[3]]
[1] "Luw, Douglas" "CANSLER, Stephan"
Thanks in advance for your help.
Sure, @ytk's comment works, but if you want to avoid regular expressions, you can be sneaky about it:
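library(tidyr)  # provides separate() and unite(), and re-exports %>%
df <- data.frame(X = X)  # assumption: df holds the vector X from the question as a column
df2 <- df %>%
  separate(col = X, into = c("person1a", "person1b", "person2a", "person2b"), sep = " ") %>%
  unite(col = "person1", person1a, person1b, sep = " ") %>%
  unite(col = "person2", person2a, person2b, sep = " ")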
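P.S. I used df
strsplit(X, "[^,]" gives the desired output. It splits the string wherever a space is not preceded by a comma.
You need to unlist it to keep a vector: unlist(strsplit(X, split = "[a-z][a-z]"))
@RyanMorton, if you skip the unlist call, the result keeps the grouping of names from the original input and matches the expected result.
The expected result was edited into the post after my initial reply, but yes, strsplit() returns a list.
@ykt and Ryan, thank you very much for your help, it works.
Thanks, but I wrote exactly the same code and it doesn't work for me, and I don't really understand it. What does %>% mean?
@NataliaP It is pipe syntax; have a look at the magrittr package.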
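Returning to the regex idea from the comments (split where a space is not preceded by a comma): the exact pattern did not survive in the thread, so the following is only a sketch of one way to express that rule, assuming a PCRE negative lookbehind is acceptable. The pattern "(?<!,) " is my assumption, not necessarily what the commenter wrote.

```r
X <- c("Ashley, Tremond WILLIAMS, Carla",
       "Claire, Daron",
       "Luw, Douglas CANSLER, Stephan")

# "(?<!,) " matches a single space that is NOT immediately preceded by a comma.
# Lookbehind assertions require perl = TRUE in strsplit().
parts <- strsplit(X, "(?<!,) ", perl = TRUE)

# parts is a list with one character vector per input string.
# Flatten it only if the grouping by input row is not needed.
flat <- unlist(parts)

print(parts)
```

Keeping the list preserves which names came from which row; unlist() trades that grouping for a single character vector.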
df2 <- df %>%
separate(col = X, into=c("person1a","person1b","person2a","person2b"),sep= " ") %>%
unite(col = "person1", person1a, person1b, sep=" ") %>%
unite(col = "person2", person2a, person2b, sep=" ")
> df2
          person1          person2
1 Ashley, Tremond  WILLIAMS, Carla
2   Claire, Daron            NA NA
3    Luw, Douglas CANSLER, Stephan
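Row 2 shows "NA NA" because separate() pads the missing pieces with NA when a row has only one name. If that is unwanted, a regex-free alternative in base R is sketched below. It assumes every name consists of exactly two space-separated tokens ("Last, First"), and pair_up is a hypothetical helper name introduced here for illustration.

```r
X <- c("Ashley, Tremond WILLIAMS, Carla",
       "Claire, Daron",
       "Luw, Douglas CANSLER, Stephan")

# Hypothetical helper: split one string on literal spaces (fixed = TRUE, so no regex),
# then glue the tokens back together two at a time.
pair_up <- function(s) {
  tokens <- strsplit(s, " ", fixed = TRUE)[[1]]
  grp <- ceiling(seq_along(tokens) / 2)  # group ids 1, 1, 2, 2, ...
  unname(vapply(split(tokens, grp), paste, character(1), collapse = " "))
}

# One character vector per input string, with no NA padding for short rows.
paired <- lapply(X, pair_up)
print(paired)
```

This returns a list like strsplit() does, so rows with a single name simply yield a length-one vector.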