Extract strings and fill other columns in R
I have a data frame in R that looks like this:
df<-data.frame(matrix(NA, nrow = 4, ncol = 4))
df[,1]<-c("472=20140112224524497,5752=122524,223=ZHRR6,69=0,"
,"472=20140112224606569,223=BNCG6,315=CC26R,69=22,"
,"50=986,472=20140112224607924,223=ZHCG6,69=98,"
,"66=2315,472=20140112224502367,379=2016,223=CMCG9,69=274,")
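df
We can use str_extract_all from stringr. The pattern uses a lookbehind, (?<=\\d\\=), to find each position that follows a digit and an = sign, and [^,]+ then captures everything up to the next comma. The per-row matches are combined with rbind and assigned to the remaining columns, which assumes every row yields the same number of matches.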
library(stringr)
df[-1] <- do.call(rbind, str_extract_all(df$X1, "(?<=\\d\\=)[^,]+"))
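To see what the lookbehind pattern returns before binding anything, here is a minimal sketch in base R. It swaps in regmatches and gregexpr with perl = TRUE for str_extract_all, and drops the redundant escape on =. The variable names x, pattern, matches and n_matches are illustrative only.

```r
# Sample strings copied from the question.
x <- c("472=20140112224524497,5752=122524,223=ZHRR6,69=0,",
       "472=20140112224606569,223=BNCG6,315=CC26R,69=22,",
       "50=986,472=20140112224607924,223=ZHCG6,69=98,",
       "66=2315,472=20140112224502367,379=2016,223=CMCG9,69=274,")

# Lookbehind: the match must start right after "<digit>=";
# [^,]+ then takes everything up to the next comma.
pattern <- "(?<=\\d=)[^,]+"

# perl = TRUE is needed because lookbehind is a PCRE feature.
matches <- regmatches(x, gregexpr(pattern, x, perl = TRUE))

matches[[1]]
# "20140112224524497" "122524" "ZHRR6" "0"

# Number of values found in each row.
n_matches <- lengths(matches)
n_matches
# 4 4 4 5
```

Because the rows of this sample do not all yield the same number of values, it may be worth checking the counts like this before calling rbind.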
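For each row, try strsplit with a comma as the separator. strsplit is vectorised over its input and returns one list element per string, so an explicit apply over the rows is not required.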
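The strsplit idea could be sketched as follows. This is only an illustration under a stated assumption: that the goal is to pull the values of a few specific keys into their own columns. The keys in wanted (472, 223 and 69) and the helper to_named are hypothetical choices, not something given in the question.

```r
# Sample strings copied from the question.
x <- c("472=20140112224524497,5752=122524,223=ZHRR6,69=0,",
       "472=20140112224606569,223=BNCG6,315=CC26R,69=22,",
       "50=986,472=20140112224607924,223=ZHCG6,69=98,",
       "66=2315,472=20140112224502367,379=2016,223=CMCG9,69=274,")

# One list element per string; the trailing comma adds no empty piece.
pairs <- strsplit(x, ",", fixed = TRUE)

# Hypothetical helper: turn "key=value" pieces into a named vector.
to_named <- function(p) {
  kv   <- strsplit(p, "=", fixed = TRUE)
  keys <- vapply(kv, `[`, character(1), 1)
  vals <- vapply(kv, `[`, character(1), 2)
  setNames(vals, keys)
}

# Assumed keys of interest (not stated in the question).
wanted <- c("472", "223", "69")

# Look up the wanted keys in every row; a missing key would give NA.
out <- t(vapply(pairs, function(p) to_named(p)[wanted],
                character(length(wanted))))

out[1, ]
# values for keys 472, 223, 69: "20140112224524497" "ZHRR6" "0"
```

Looking values up by key, rather than by position, keeps the columns aligned even when rows carry different numbers of key=value pairs.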
Alternatively, with tstrsplit from data.table:
library(data.table)
setDT(df)[, (2:4) := tstrsplit(X1, "\\d+=|,")[c(FALSE, TRUE)]]
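To illustrate why the result is indexed with c(FALSE, TRUE), here is a small base R sketch that applies the same split pattern to a single string with strsplit. The names x, pieces and values are illustrative only.

```r
# First sample string from the question.
x <- "472=20140112224524497,5752=122524,223=ZHRR6,69=0,"

# Split on either a "digits=" key or a comma. Each key leaves an
# empty string in front of its value, so keys and values alternate.
pieces <- strsplit(x, "\\d+=|,")[[1]]
pieces
# "" "20140112224524497" "" "122524" "" "ZHRR6" "" "0"

# The recycled logical index keeps every second element: the values.
values <- pieces[c(FALSE, TRUE)]
values
# "20140112224524497" "122524" "ZHRR6" "0"
```

The number of values kept per row has to match the number of columns on the left-hand side of :=, so it may help to inspect the split output like this first.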