How do I split strings of varying length, made up of digits and letters, into separate columns in R?
I have a column named "WFBS" that contains more than a million rows of strings of varying length, like this:
WFBS <- c("M010203", "S01020304", "N104509")
This may be a useful starting point:
library(tidyr)

df <- data.frame(WFBS = c("M010203", "S01020304", "N104509"),
                 stringsAsFactors = FALSE)

# Split at fixed character positions 3, 5 and 7
df %>% separate(col = WFBS,
                into = c("WFBS1", "WFBS2", "WFBS3", "WFBS4"),
                sep = c(3, 5, 7))
#   WFBS1 WFBS2 WFBS3 WFBS4
# 1   M01    02    03
# 2   S01    02    03    04
# 3   N10    45    09
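This leaves empty strings in the unused positions rather than NAs, so you will have to convert them yourself.

Another option, in base R: use sub to insert a delimiter between the pieces, then read the result with read.csv to build a 4-column data.frame.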
read.csv(text = sub("^(...)(..)(..)(.*)", "\\1,\\2,\\3,\\4", WFBS),
         header = FALSE, colClasses = rep("character", 4), na.strings = "",
         col.names = paste0("WFBS", 1:4), stringsAsFactors = FALSE)
#   WFBS1 WFBS2 WFBS3 WFBS4
# 1   M01    02    03  <NA>
# 2   S01    02    03    04
# 3   N10    45    09  <NA>
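If you prefer the tidyr route, the empty strings it leaves behind can be turned into NA afterwards. Below is a minimal base R sketch of that conversion; `res` is a hypothetical data frame built by hand to stand in for the separate() result, so the snippet runs without tidyr.

```r
# Hypothetical stand-in for the separate() output: the unused
# fourth position holds an empty string, not NA
res <- data.frame(WFBS1 = c("M01", "S01", "N10"),
                  WFBS2 = c("02", "02", "45"),
                  WFBS3 = c("03", "03", "09"),
                  WFBS4 = c("", "04", ""),
                  stringsAsFactors = FALSE)

# res == "" is a logical matrix marking every empty cell;
# assigning through it replaces only those cells
res[res == ""] <- NA_character_

res
```

Using NA_character_ instead of a bare NA keeps the intent explicit that the columns stay character.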