R: How do I split a string at words that look like Aa*?

r regex

I have a df like this:

df<-structure(list(col3 = c("Text or A ny V alue", "Text or A ny V alue", 
"Text or A ny V alue", "Categorical select multiple", "Categorical select one (nominal) 3", 
"Categorical select one (nominal) 13", "Categorical select one (nominal) 71", 
"CHMUNIT Text or A ny V alue", "Categorical select one (nominal) 71", 
"Text or A ny V alue", "Categorical select one (nominal) 3", 
"Categorical select one (nominal) 3", "Categorical select one (nominal) 3", 
"Text or A ny V alue", "Categorical yes/no (dichotomous) 3", 
"Text or A ny V alue", "Categorical select one (nominal) 3", 
"Categorical select one (nominal) 71", "DSMETA DT Date", "DSMETA ST Text or A ny V alue", 
"Categorical yes/no (dichotomous) 3", "DSPA THDT Date", "Categorical yes/no (dichotomous) 3", 
"Text or A ny V alue", "Text or A ny V alue", "Text or A ny V alue", 
"Categorical yes/no (dichotomous) 3", "Categorical yes/no (dichotomous) 3", 
"Categorical select one (nominal) 71", "V DCO V O S Text or A ny V alue", 
"V DCO V O S Text or A ny V alue", "V DCO V O S Text or A ny V alue", 
"Categorical select multiple 44", "Categorical select one (nominal) 3"
)), row.names = c(NA, -34L), class = "data.frame")

We can use sub together with grepl here:

df$New_Var ...

A simpler solution is:
library(stringr)
df$NewVar <- str_extract(df$col3, "^[A-Z\\s]{2,}(?![a-z])")
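
For readers who would rather avoid the stringr dependency, the same pattern can be applied with base R. This is only a sketch: the vector x is a hand-picked sample of the values in df$col3, and the names m, hit and prefix are illustrative, not part of the original answer. The pattern takes a run of at least two uppercase letters or whitespace characters from the start of the string, and the (?![a-z]) lookahead forces it to stop before a capital letter that begins a mixed-case word such as "Text". The match therefore keeps the space before that word, so the sketch trims it with trimws.

```r
# Sample of the values in df$col3 (taken from the question's data)
x <- c("Text or A ny V alue",
       "CHMUNIT Text or A ny V alue",
       "DSMETA DT Date",
       "V DCO V O S Text or A ny V alue",
       "Categorical select one (nominal) 3")

# Same pattern as the str_extract call; perl = TRUE is required
# for the (?![a-z]) negative lookahead
m <- regexpr("^[A-Z\\s]{2,}(?![a-z])", x, perl = TRUE)

# regexpr() returns -1 where nothing matched; those rows stay NA
hit <- as.vector(m) != -1
prefix <- rep(NA_character_, length(x))
prefix[hit] <- regmatches(x, m)

# The raw match ends with the space before the next word, so trim it
prefix <- trimws(prefix)
prefix
```

Whether the trailing space should be trimmed depends on how the new column is used afterwards; drop the trimws line to reproduce the untrimmed values that str_extract returns.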

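Good idea.

@Stataq \1 (or sometimes $1 in other programming languages) refers to the first capture group defined in the regex. In this case, it is the sequence of all-uppercase words at the start of each match. By the way, the pattern for your follow-up question might be: ^.*([A-Z]+\\b(?: [A-Z]+\\b)*)$ ... which would capture all the uppercase words that appear at the end of a column.

Thank you very much. Isn't ^ the start of the string? Does this require the whole of col2 to be uppercase from beginning to end?

If you want to use a capture group, then you need to match the entire column value from the start, even if you really only want to capture the end.

Note that your regex pattern would also match things like CASH$...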
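
The capture-group idea discussed in these comments can be sketched as follows. The code of the sub/grepl answer is not preserved here, so the pattern pat and the names x and new_var are assumptions made for illustration, not the answerer's exact code. The point is that sub replaces the whole match with \1, so the pattern must consume the entire string, and grepl is needed because sub returns non-matching strings unchanged.

```r
# Sample values in the style of df$col3
x <- c("CHMUNIT Text or A ny V alue",
       "DSMETA DT Date",
       "Text or A ny V alue",
       "Categorical select multiple")

# Hypothetical pattern: group 1 captures one or more all-uppercase words
# at the start; ".*$" consumes the rest so sub() replaces the whole string
pat <- "^([A-Z]+\\b(?: [A-Z]+\\b)*).*$"

# sub() leaves non-matching strings untouched, so grepl() decides
# which rows get the captured prefix and which become NA
new_var <- ifelse(grepl(pat, x, perl = TRUE),
                  sub(pat, "\\1", x, perl = TRUE),
                  NA_character_)
new_var
```

Because each uppercase word must end at a word boundary (\\b), the "T" of "Text" is not picked up: it is followed by a lowercase letter, so no boundary exists there.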
Printing df after the str_extract call gives:

df
                                  col3       NewVar
1                  Text or A ny V alue         <NA>
2                  Text or A ny V alue         <NA>
3                  Text or A ny V alue         <NA>
4          Categorical select multiple         <NA>
5   Categorical select one (nominal) 3         <NA>
6  Categorical select one (nominal) 13         <NA>
7  Categorical select one (nominal) 71         <NA>
8          CHMUNIT Text or A ny V alue     CHMUNIT 
9  Categorical select one (nominal) 71         <NA>
10                 Text or A ny V alue         <NA>
11  Categorical select one (nominal) 3         <NA>
12  Categorical select one (nominal) 3         <NA>
13  Categorical select one (nominal) 3         <NA>
14                 Text or A ny V alue         <NA>
15  Categorical yes/no (dichotomous) 3         <NA>
16                 Text or A ny V alue         <NA>
17  Categorical select one (nominal) 3         <NA>
18 Categorical select one (nominal) 71         <NA>
19                      DSMETA DT Date   DSMETA DT 
20       DSMETA ST Text or A ny V alue   DSMETA ST 
21  Categorical yes/no (dichotomous) 3         <NA>
22                      DSPA THDT Date   DSPA THDT 
23  Categorical yes/no (dichotomous) 3         <NA>
24                 Text or A ny V alue         <NA>
25                 Text or A ny V alue         <NA>
26                 Text or A ny V alue         <NA>
27  Categorical yes/no (dichotomous) 3         <NA>
28  Categorical yes/no (dichotomous) 3         <NA>
29 Categorical select one (nominal) 71         <NA>
30     V DCO V O S Text or A ny V alue V DCO V O S 
31     V DCO V O S Text or A ny V alue V DCO V O S 
32     V DCO V O S Text or A ny V alue V DCO V O S 
33      Categorical select multiple 44         <NA>
34  Categorical select one (nominal) 3         <NA>