R: splitting a column in this format (R, Split, Strsplit)


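I need to split this column into two columns:

5/5/2020Tom Tesla

The expected result is

Col1      Col2
5/5/2020  Tom Tesla

I have tried strsplit, but I need help because Col1 is not fixed-width: the length of the date field varies depending on whether the month and day have one or two digits.

Any suggestions?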

We can use separate with a regex lookaround to split at the boundary between a digit and a letter:

library(tidyr)
separate(df1, 'col1', into = c('date', 'other'), sep="(?<=[0-9])(?=[A-Za-z])")
#     date             other
#1  1/1/2000            yogurt
#2  1/1/2000      toilet paper
#3  2/1/2000              soda
#4 11/1/2000            bagels
#5 12/1/2000            fruits
#6 13/1/2000 laundry detergent
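
To see what the lookaround pattern actually matches, here is a minimal base-R sketch. The input vector is an assumption reconstructed from the output above (the original definition of df1 is not shown), and locating the boundary with regexpr() only illustrates the pattern; it is not a claim about how separate() works internally.

```r
# Assumed input, reconstructed from the output table above.
x <- c("1/1/2000yogurt", "1/1/2000toilet paper", "2/1/2000soda",
       "11/1/2000bagels", "12/1/2000fruits", "13/1/2000laundry detergent")
df1 <- data.frame(col1 = x, stringsAsFactors = FALSE)

# Zero-width match: preceded by a digit, followed by a letter.
# perl = TRUE is required because lookarounds are a PCRE feature.
pos <- as.integer(regexpr("(?<=[0-9])(?=[A-Za-z])", df1$col1, perl = TRUE))

# pos is the index of the first character after the boundary
# (it would be -1 for a string with no digit-letter boundary).
res <- data.frame(
  date  = substr(df1$col1, 1, pos - 1),
  other = substr(df1$col1, pos, nchar(df1$col1)),
  stringsAsFactors = FALSE
)
print(res)
```

Passing this df1 to the separate() call above should produce the same two columns.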

Data

df1 (a data frame with the column col1, as used in the call above)
Here are two approaches.

Using extract from tidyr:

tidyr::extract(df, col1, c('col1', 'col2'), regex = '(.*\\d)(.*)')

Or using dplyr and stringr:

library(dplyr)
library(stringr)

df %>%
  mutate(col2 = str_extract(col1, '\\d+/\\d+/\\d+'), 
         col3 = str_remove(col1, col2))
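
The two patterns differ when the text part itself contains a digit: the greedy (.*\\d) runs up to the last digit in the string, while \\d+/\\d+/\\d+ stops at the end of the date. The base-R sketch below compares them. The sample strings are hypothetical, and sub() with capture groups stands in for extract() and str_extract() so that no packages are needed.

```r
# Hypothetical inputs; the second has a digit inside the text part.
x <- c("5/5/2020Tom Tesla", "12/11/2020Area 51 tours")

# Greedy pattern from the extract() call: group 1 ends at the LAST digit.
greedy_date <- sub("^(.*\\d)(.*)$", "\\1", x, perl = TRUE)
greedy_rest <- sub("^(.*\\d)(.*)$", "\\2", x, perl = TRUE)

# Explicit date pattern: group 1 is only the leading digits/digits/digits.
strict_date <- sub("^(\\d+/\\d+/\\d+)(.*)$", "\\1", x, perl = TRUE)
strict_rest <- sub("^(\\d+/\\d+/\\d+)(.*)$", "\\2", x, perl = TRUE)

print(data.frame(greedy_date, greedy_rest, strict_date, strict_rest))
```

If the text column can contain digits, the explicit date pattern is probably the safer of the two.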
