R: How to separate the time and date from a column
I have a dataset. It has one column that contains both a date and a time. Can we split it apart?
df
ColA ColB
A 2020-01-17T03:30:37-05:00
B 2020-01-17T03:30:38-05:00
C 2020-01-17T03:30:39-05:00
Expected output
df
ColA ColB ColC ColD ColE
A 2020-01-17T03:30:37-05:00 2020-01-17 03:30:37 05:00
B 2020-01-17T03:30:38-05:00 2020-01-17 03:30:38 05:00
C 2020-01-17T03:30:39-05:00 2020-01-17 03:30:39 05:00
Assuming the ColB column is plain text, you can simply take substrings:
df$ColC <- substr(df$ColB, 1, 10)
df$ColD <- substr(df$ColB, 12, 19)
df$ColE <- substr(df$ColB, 21, 25)
df
ColA ColB ColC ColD ColE
1 A 2020-01-17T03:30:37-05:00 2020-01-17 03:30:37 05:00
2 B 2020-01-17T03:30:38-05:00 2020-01-17 03:30:38 05:00
3 C 2020-01-17T03:30:39-05:00 2020-01-17 03:30:39 05:00
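The fixed positions above only hold when every value has the same width and ColB is handled as text. Here is a minimal sketch of a more defensive variant, assuming the same sample values. The width check and the conversion of ColC to Date class are additions for illustration, not part of the answer above:

```r
# Recreate the sample data with ColB stored as plain character strings
df <- data.frame(
  ColA = c("A", "B", "C"),
  ColB = c("2020-01-17T03:30:37-05:00",
           "2020-01-17T03:30:38-05:00",
           "2020-01-17T03:30:39-05:00"),
  stringsAsFactors = FALSE
)

# as.character() makes the intent explicit in case ColB is a factor
x <- as.character(df$ColB)

# Fixed positions are only valid if every value is exactly 25 characters wide
stopifnot(all(nchar(x) == 25))

df$ColC <- as.Date(substr(x, 1, 10))  # date part, converted to Date class
df$ColD <- substr(x, 12, 19)          # time of day, kept as text
df$ColE <- substr(x, 21, 25)          # offset without its sign; start at 20 to keep "-"
```

If the values can vary in width (for example fractional seconds or a trailing "Z"), the position-based approach is probably not the right tool and a pattern-based split is safer.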
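We can use tidyr::extract with an appropriate regular expression to split the data into 3 columns. The pattern captures the text before the T, the text between the T and the last hyphen, and the text after that hyphen: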
tidyr::extract(df, ColB, c("ColC", "ColD", "ColE"), regex = "(.*)T(.*)-(.*)",
remove = FALSE)
# ColA ColB ColC ColD ColE
#1 A 2020-01-17T03:30:37-05:00 2020-01-17 03:30:37 05:00
#2 B 2020-01-17T03:30:38-05:00 2020-01-17 03:30:38 05:00
#3 C 2020-01-17T03:30:39-05:00 2020-01-17 03:30:39 05:00
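The regex above relies on a hyphen in front of the offset, so a value with a positive offset such as +01:00 would not match. Below is a sketch of a stricter pattern that accepts either sign and keeps it in ColE, written with base R's utils::strcapture so that it runs without tidyr. The second sample row with a positive offset is made up for illustration, and keeping the sign differs from the expected output in the question:

```r
# Sample data; the "+01:00" row is a hypothetical value added for illustration
df <- data.frame(
  ColA = c("A", "B"),
  ColB = c("2020-01-17T03:30:37-05:00",
           "2020-01-17T03:30:38+01:00"),
  stringsAsFactors = FALSE
)

# Three capture groups: date, time of day, signed offset
pattern <- paste0(
  "^([0-9]{4}-[0-9]{2}-[0-9]{2})",  # ColC: YYYY-MM-DD
  "T([0-9]{2}:[0-9]{2}:[0-9]{2})",  # ColD: HH:MM:SS
  "([+-][0-9]{2}:[0-9]{2})$"        # ColE: offset with its sign
)

# strcapture() returns one column per capture group, named after `proto`;
# rows that do not match the pattern come back as NA
parts <- strcapture(
  pattern,
  as.character(df$ColB),
  proto = data.frame(ColC = character(), ColD = character(),
                     ColE = character(), stringsAsFactors = FALSE)
)

df <- cbind(df, parts)
```

The same pattern should also work as the regex argument of tidyr::extract, since it defines exactly three groups.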
Data
df <- structure(list(ColA = structure(1:3, .Label = c("A", "B", "C"
), class = "factor"), ColB = structure(1:3, .Label = c("2020-01-17T03:30:37-05:00",
"2020-01-17T03:30:38-05:00", "2020-01-17T03:30:39-05:00"), class = "factor")),
class = "data.frame", row.names = c(NA, -3L))