Best way to handle invalid dates in R


I need to convert the dates column of the data frame df in the sample code below to Date objects with
df$dates %>% as.Date

df <- structure(list(dates = structure(c(19L, 18L, 17L, 16L, 14L, 13L, 
12L, 11L, 9L, 8L, 7L, 6L, 21L, 20L, 15L, 10L, 5L, 4L, 3L, 2L, 
1L), .Label = c("2014-12-31", "2015-06-30", "2015-12-31", "2016-06-30", 
"2016-12-31", "2017-01-31", "2017-03-31", "2017-06-30", "2017-09-31", 
"2017-12-31", "2018-01-31", "2018-03-31", "2018-06-30", "2018-09-31", 
"2018-12-31", "2019-01-31", "2019-03-31", "2019-06-30", "2019-09-31", 
"2019-12-31", "2020-06-30"), class = "factor")), class = "data.frame", row.names = c(NA, 
-21L))
df$dates %>% as.Date
gives the error
Error in charToDate(x) : character string is not in a standard unambiguous format


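Some of the dates are invalid (for example "2017-09-31") and cannot be converted to Date objects. How can I best bump those to the first day of the following month?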

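dplyr::coalesce returns the first non-NA value at each position. So if you know that some dates fail to parse because they fall one day past the end of the month, you can selectively replace just those with the first day of the next month: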

library(lubridate); library(dplyr)

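# NA (with a warning) wherever the date does not exist
okay_dates <- ymd(df$dates)
# first day of the month after each date's year-month
next_mo <- ymd(paste(substr(df$dates, 1, 7), "01")) %>% ceiling_date("month")
# keep the parsed date where there is one, otherwise use the fallback
coalesce(okay_dates, next_mo)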

 [1] "2019-10-01" "2019-06-30" "2019-03-31" "2019-01-31" "2018-10-01" "2018-06-30" "2018-03-31"
 [8] "2018-01-31" "2017-10-01" "2017-06-30" "2017-03-31" "2017-01-31" "2020-06-30" "2019-12-31"
[15] "2018-12-31" "2017-12-31" "2016-12-31" "2016-06-30" "2015-12-31" "2015-06-30" "2014-12-31"
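To check which entries were actually replaced, the NA positions of the first parse can be kept as a flag. The following is only a sketch on a three-element illustrative subset of the data; the names dates, fixed and check are made up for this example, and quiet = TRUE is used just to silence the "failed to parse" warning.

```r
library(lubridate)
library(dplyr)

# Illustrative subset of the question's data (assumption: same string format)
dates <- c("2019-09-31", "2019-06-30", "2018-09-31")

# NA wherever the calendar date does not exist
okay_dates <- ymd(dates, quiet = TRUE)

# Fallback: first day of the month following each date's year-month
next_mo <- ceiling_date(ymd(paste(substr(dates, 1, 7), "01")), "month")

# Take the parsed date where available, otherwise the fallback
fixed <- coalesce(okay_dates, next_mo)

# Side-by-side view, with a flag marking the rows that were replaced
check <- data.frame(original = dates, fixed = fixed, replaced = is.na(okay_dates))
print(check)
```

Only the rows flagged in replaced differ from the original strings, which makes it easy to confirm that no valid date was touched.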

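OK, the first step is to specify a format string in as.Date. That returns NA for the invalid dates and so identifies them. You can then use ifelse and some text processing to correct them.

@Roland Are you sure? Try as.Date("2021-02-31").

Set the format: as.Date("2021-02-31", format = "%Y-%m-%d"). It produces an NA.

Very nice, thanks.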
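The idea from the comments (parse with an explicit format, then repair only the NA entries) could be sketched in base R roughly as follows. The function name fix_invalid_dates is hypothetical. The sketch assumes every string has the form YYYY-MM-DD and that an entry which fails to parse should move to the first day of the next month.

```r
# Hypothetical helper, base R only.
# Assumes input strings look like "YYYY-MM-DD".
fix_invalid_dates <- function(x) {
  x <- as.character(x)                        # factor -> character
  parsed <- as.Date(x, format = "%Y-%m-%d")   # NA for impossible dates
  bad <- is.na(parsed) & !is.na(x)            # entries that failed to parse

  # First day of the month named in each bad string
  first <- as.Date(paste0(substr(x[bad], 1, 7), "-01"), format = "%Y-%m-%d")

  # Day 1 plus 31 days always lands in the following month;
  # formatting with a literal "01" snaps it back to that month's first day
  parsed[bad] <- as.Date(format(first + 31, "%Y-%m-01"), format = "%Y-%m-%d")
  parsed
}

fix_invalid_dates(c("2017-09-31", "2018-06-30", "2021-02-31"))
```

Applied to the question's data this would be df$dates <- fix_invalid_dates(df$dates). Strings that are malformed in some other way stay NA, so they remain visible rather than being silently shifted.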