Regular expression to replace certain text in R
I am working with a data.csv file and need to process values that follow a certain pattern. Currently, the class column in my data.csv looks like this:
org.apache.camel.bam.TimeExpression.evaluate(TimeExpression.java
org.apache.camel.bam.rules.TemporalRule.processExchange(TemporalRule.java
org.apache.camel.bam.rules.ActivityRules.processExchange(ActivityRules.java
org.apache.camel.bam.rules.ProcessRules.processExchange(ProcessRules.java
org.apache.camel.bam.processor.JpaBamProcessor.processEntity(JpaBamProcessor.java
org.apache.camel.bam.processor.JpaBamProcessor.processEntity(JpaBamProcessor.java
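Now I need to drop the method name and the parenthesized part so that only the class name followed by ".java" remains. In this case, my desired output is: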
org.apache.camel.bam.TimeExpression.java
org.apache.camel.bam.rules.TemporalRule.java
org.apache.camel.bam.rules.ActivityRules.java
org.apache.camel.bam.rules.ProcessRules.java
org.apache.camel.bam.processor.JpaBamProcessor.java
org.apache.camel.bam.processor.JpaBamProcessor.java
Currently, I am trying the following code:
dscls<-gsub("\\.[^.]+($", "java", data$class)
We can use sub to match a word (\\w+) followed by (, then another word (\\w+) and a dot (\\.), and replace the match with an empty string ("").
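The pattern described above can be sketched as follows. This is a minimal illustration built from the verbal description only; the vector name x and the two sample strings (taken from the question) are assumptions for the example.

```r
# Hypothetical sample vector; `x` is an assumed name used only for illustration.
x <- c(
  "org.apache.camel.bam.TimeExpression.evaluate(TimeExpression.java",
  "org.apache.camel.bam.rules.TemporalRule.processExchange(TemporalRule.java"
)

# \\w+  the method name right before the parenthesis (e.g. "evaluate")
# \\(   the literal opening parenthesis
# \\w+  the class name after the parenthesis (e.g. "TimeExpression")
# \\.   the literal dot before "java"
# Removing the match keeps the dot that preceded the method name,
# which then sits directly in front of the trailing "java".
result <- sub("\\w+\\(\\w+\\.", "", x)
print(result)
```

Because only one match per string is expected, sub (first match only) is enough here; gsub would give the same result on this data.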
Another option, where df$x holds the data you shared:
gsub("\\w+\\(.*", "java", df$x)
[1] "org.apache.camel.bam.TimeExpression.java" "org.apache.camel.bam.rules.TemporalRule.java"
[3] "org.apache.camel.bam.rules.ActivityRules.java" "org.apache.camel.bam.rules.ProcessRules.java"
[5] "org.apache.camel.bam.processor.JpaBamProcessor.java" "org.apache.camel.bam.processor.JpaBamProcessor.java"
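To apply the same replacement to the column from the question, something like the sketch below may work. It assumes the column is called class (as in the question's data$class) and builds a small data frame inline instead of reading data.csv, so the file name and contents are stand-ins.

```r
# Stand-in for: data <- read.csv("data.csv", stringsAsFactors = FALSE)
# The column name `class` follows the question; the rows are sample values.
data <- data.frame(
  class = c(
    "org.apache.camel.bam.rules.ProcessRules.processExchange(ProcessRules.java",
    "org.apache.camel.bam.processor.JpaBamProcessor.processEntity(JpaBamProcessor.java"
  ),
  stringsAsFactors = FALSE
)

# Replace the method name, the parenthesis and everything after it with "java".
# The dot before the method name is untouched, so the result ends in ".java".
data$class <- sub("\\w+\\(.*", "java", data$class)
print(data$class)
```

Since .* runs to the end of each string, there is at most one match per element, so sub and gsub behave the same here.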
Since your strings already end in .java (at least in the example), you could also try the following:
strs <- c('org.apache.camel.bam.TimeExpression.evaluate(TimeExpression.java',
          'org.apache.camel.bam.rules.TemporalRule.processExchange(TemporalRule.java',
          'org.apache.camel.bam.rules.ActivityRules.processExchange(ActivityRules.java',
          'org.apache.camel.bam.rules.ProcessRules.processExchange(ProcessRules.java',
          'org.apache.camel.bam.processor.JpaBamProcessor.processEntity(JpaBamProcessor.java',
          'org.apache.camel.bam.processor.JpaBamProcessor.processEntity(JpaBamProcessor.java')
gsub('\\.\\w+\\(\\w+(\\.java)', '\\1', strs)
#[1] "org.apache.camel.bam.TimeExpression.java"
#[2] "org.apache.camel.bam.rules.TemporalRule.java"
#[3] "org.apache.camel.bam.rules.ActivityRules.java"
#[4] "org.apache.camel.bam.rules.ProcessRules.java"
#[5] "org.apache.camel.bam.processor.JpaBamProcessor.java"
#[6] "org.apache.camel.bam.processor.JpaBamProcessor.java"
The code from the question produces output like "org.apache.camel.bam.processor.JpaBamProcessor..java". How can the extra "." that appears before java be removed?
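If strings with a doubled dot have already been produced, one possible cleanup is to collapse the run of dots before the final java into a single dot. This is only a sketch; the vector name doubled and its contents are assumptions for illustration.

```r
# Hypothetical input: one string with the doubled dot, one already correct.
doubled <- c(
  "org.apache.camel.bam.processor.JpaBamProcessor..java",
  "org.apache.camel.bam.TimeExpression.java"
)

# \\.{2,}  two or more literal dots
# java$    the word "java" at the very end of the string
# Strings with a single dot before "java" do not match and stay unchanged.
fixed <- sub("\\.{2,}java$", ".java", doubled)
print(fixed)
```

Avoiding the extra dot in the first place, by leaving the existing dot out of either the pattern or the replacement, is usually simpler than repairing it afterwards.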