Separate on a delimiter between specific characters (dplyr) (after a space and before an uppercase letter)


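I want to split on a minus sign, but only where it comes after a space and before an uppercase letter.

My regex [\s]-[A-Z] includes the space and the uppercase letter in the match, so both are removed by the split. I only want to split on the minus sign at that specific position, without dropping the space and the letter that follows it.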

library(dplyr)
library(tidyr)  # separate() is provided by tidyr, not dplyr

data.frame(x = c("Hans-Peter Wurst -My Gosh", "What is -wrong here -Do not worry")) %>% 
  separate(x, into = c("one", "two"), sep = "[\\s]-[A-Z]")

Result:

#                   one         two
# 1    Hans-Peter Wurst      y Gosh
# 2 What is -wrong here o not worry

The expected output would be:

#                   one          two
# 1    Hans-Peter Wurst      My Gosh
# 2 What is -wrong here Do not worry

You can wrap the space and uppercase-letter patterns in lookbehind/lookahead assertions:

sep = "(?<!\\S)-(?=[A-Z])"


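Zero-width assertions do not consume text: whatever they test is not part of the overall match value. They only check whether their pattern matches at the current position and return true or false, so the letter is kept in the output.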
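To make the difference concrete, here is a minimal base R sketch (not part of the original answer) that compares what each pattern actually matches. It assumes perl = TRUE so that lookbehind is available; the variable names consumed, asserted and parts are illustrative only.

```r
x <- c("Hans-Peter Wurst -My Gosh",
       "What is -wrong here -Do not worry")

# Consuming pattern: the whitespace and the capital letter belong to the match,
# so a split on this pattern removes them together with the hyphen.
consumed <- regmatches(x, regexpr("[\\s]-[A-Z]", x, perl = TRUE))

# Lookaround pattern: only the hyphen belongs to the match.
asserted <- regmatches(x, regexpr("(?<!\\S)-(?=[A-Z])", x, perl = TRUE))

# Splitting on the lookaround pattern removes the hyphen alone. The space
# before it stays at the end of the first piece, so it is trimmed here.
parts <- lapply(strsplit(x, "(?<!\\S)-(?=[A-Z])", perl = TRUE), trimws)

print(consumed)  # " -M" " -D"
print(asserted)  # "-" "-"
print(parts)
```

Note that the space before the hyphen is preserved by the split, which is why the sketch calls trimws() on the pieces; drop that call if the trailing space should stay.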

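Details

(?<!\S) - a negative lookbehind that matches a position at the start of the string or right after whitespace
- - a hyphen
(?=[A-Z]) - a positive lookahead that requires an uppercase ASCII letter immediately to the right of the current position

We can use extract and capture the wanted characters as groups, keeping the unwanted characters outside the parentheses: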

library(tidyverse)
data.frame(x = c("Hans-Peter Wurst -My Gosh", 
               "What is -wrong here -Do not worry")) %>%
     extract(x, into = c("one", "two"), "(.*) -([^-]+)$")
#                 one          two
#1    Hans-Peter Wurst      My Gosh
#2 What is -wrong here Do not worry
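As a rough illustration of what the capture groups pick out, the same pattern can be applied with base R's regexec() and regmatches(). This is a sketch only, not the answerer's code; the names pattern, m and result are made up for the example. Note that this pattern splits at the last " -" in the string and does not itself check for an uppercase letter; that happens to be enough for the sample data.

```r
x <- c("Hans-Peter Wurst -My Gosh",
       "What is -wrong here -Do not worry")

# Group 1: everything before the last " -"; group 2: the hyphen-free tail.
# The space and the hyphen sit outside the parentheses, so they are dropped.
pattern <- "(.*) -([^-]+)$"

# Each list element holds the full match followed by the two captured groups.
m <- regmatches(x, regexec(pattern, x))

result <- data.frame(
  one = vapply(m, function(g) g[2], character(1)),
  two = vapply(m, function(g) g[3], character(1)),
  stringsAsFactors = FALSE
)

print(result)
```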

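If OP's wording "without dropping the space and the letter that follows" is to be believed, a lookbehind for the space may be needed as well.

@Gregor Fair, I replaced \s+ with (?<!\S), which matches a position at the start of the string or right after whitespace.

OK, I reopened it, because these discussions are giving me a headache.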
