R: add an underscore before each capital letter that is followed by a lowercase letter
I am trying to add an underscore before each capital letter that is followed by a lowercase letter. Here is an example:
cases <- c("XrefAcctnoAcctID", "NewXref1AcctID", "NewXref2AcctID", "ClientNo")
The vector prints as:
[1] "XrefAcctnoAcctID" "NewXref1AcctID"
[3] "NewXref2AcctID" "ClientNo"
I want this:
"xref_acctno_acct_id"
"new_xref1_acct_id"
"new_xref2_acct_id"
"client_no"
I can get this far:
> tolower(gsub("([a-z])([A-Z])", "\\1_\\2", cases))
[1] "xref_acctno_acct_id" "new_xref1acct_id"
[3] "new_xref2acct_id" "client_no"
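But "new_xref1acct_id" and "new_xref2acct_id" are not what I want: the underscore after the digit is missing.

We can use regex lookarounds to match the position where a lowercase letter or a digit is followed by an uppercase letter, and replace that position with _: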
tolower(gsub("(?<=[a-z0-9])(?=[A-Z])", "_", cases, perl = TRUE))
#[1] "xref_acctno_acct_id" "new_xref1_acct_id" "new_xref2_acct_id"
#[4] "client_no"
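
For reuse, the lookaround substitution above could be wrapped in a small helper. This is only a sketch: the name to_snake is hypothetical (it is not from any package), and it assumes ASCII letters and digits.

```r
# Hypothetical helper; the name to_snake is illustrative only.
# Inserts "_" at every position that is preceded by a lowercase letter
# or digit and followed by an uppercase letter, then lowercases the result.
to_snake <- function(x) {
  tolower(gsub("(?<=[a-z0-9])(?=[A-Z])", "_", x, perl = TRUE))
}

cases <- c("XrefAcctnoAcctID", "NewXref1AcctID", "NewXref2AcctID", "ClientNo")
result <- to_snake(cases)
result
```

Because both lookarounds are zero-width, nothing is consumed by the match, so the replacement needs no backreferences.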
tolower(gsub("([a-z0-9])([A-Z])", "\\1_\\2", cases))
#[1] "xref_acctno_acct_id" "new_xref1_acct_id" "new_xref2_acct_id"
#[4] "client_no"
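Just change [a-z] in your regex to [a-z0-9] so that it matches a lowercase letter or a digit before the uppercase letter. Alternatively, change the pattern to (.)([A-Z]) to match any character before an uppercase letter.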
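
To see how the three character classes differ, here is a small comparison. The input Item10Code is made up for illustration and is not from the question; it was chosen because the digit before the capital is 0.

```r
x <- "Item10Code"  # hypothetical input, not from the question

# Lowercase letters only: the boundary between "0" and "C" is not matched.
only_letters <- tolower(gsub("([a-z])([A-Z])", "\\1_\\2", x))
only_letters   # "item10code"

# Lowercase letter or any digit 0-9: an underscore is inserted after the 0.
with_digits <- tolower(gsub("([a-z0-9])([A-Z])", "\\1_\\2", x))
with_digits    # "item10_code"

# Any character before a capital: same result here, but this pattern
# would also fire after punctuation or whitespace.
any_char <- tolower(gsub("(.)([A-Z])", "\\1_\\2", x))
any_char       # "item10_code"
```

The (.) variant is the most permissive of the three, so it is probably best kept for inputs known to contain only letters and digits.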