How to subtract date columns in R when the date format is not recognized?


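I tried the following beforehand, without success:

Expected result: the time interval between the symptom date and the ICU admission date, in days, for example

(2020-06-07) - (2020-06-27) = 20 days

So the output would look like [1] 20 and so on.

Any pointers would be greatly appreciated.

Here is the dput: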

dput(t1)
structure(list(data_int_uti = structure(c(18420, 18512, 18300, 18489,
18489, 18492, 18499, 18503, 18443, 18389, 18458, 18488, 18270, 18514,
18605, 18299, 18512, 18300, 18489, 18301, 18420, 18420, 18472, 18490,
18495, 18496, 18268, 18359, 18299, 18270, 18420, 18513, 18543, 18488,
18501, 18504, 18459, 18466, 18519, 18268, 18359, 18450, 18450, 18542,
18468, 18329, 18505, 18299, 18420, 18487, 18498, 18268, 18461, 18520,
18268, 18450, 18460, 18470, 18488, 18544, 18519, 18360, 18268, 18268,
18603, 18470, 18360, 18490, 18299, 18450, 18463, 18493, 18330, 18391
), class = "Date"), data_inicio_sint = structure(c(18440, NA, 18472,
NA, 18474, 18493, 18496, 18492, 18442, 18438, 18457, 18391, 18271,
18505, 18504, 18441, 18513, 18466, 18605, 18496, 18360, 18438, 18463,
18490, 18605, 18497, 18443, 18439, 18352, 18497, 18434, 18472, 18472,
18475, 18493, 18488, 18443, 18465, 18515, 18444, 18300, 18421, 18436,
18440, 18460, 18472, 18505, 18433, 18329, 18301, 18498, 18269, 18421,
18545, 18436, 18390, 18543, 18467, 18452, 18503, 18545, 18301, 18436,
18435, 18482, 18464, 18471, 18391, 18432, 18451, 18457, 18544, 18271,
18605), class = "Date")), row.names = c(NA, -74L), class = c("tbl_df",
"tbl", "data.frame"))


One solution is to use dplyr and convert the columns to dates:

library(dplyr)
# example data
t1 <- data.frame(date_admission = c("2020-08-07","2020-07-31","2020-02-08","2020-08-15","2020-08-17","2020-08-24","2020-08-27","2020-10-09","2020-01-07"),
             date_symptoms = c( "2020-06-27", NA           ,"2020-07-29", NA          , "2020-07-31", "2020-08-19", "2020-08-22", "2020-08-18", "2020-06-29"))

# calculation (convert all columns to Date and subtract according to your example)
t1 %>% 
   dplyr::mutate_all(~ as.Date(.)) %>% 
   dplyr::mutate(DIF = date_admission - date_symptoms)
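If the columns arrive as text in a layout that as.Date() does not parse by default, spelling out the layout with the format argument may help. The sketch below assumes, purely for illustration, a hypothetical data frame raw whose dates are stored as day/month/year strings; the real layout of your file may differ.

```r
# Hypothetical raw columns stored as day/month/year text (an assumed layout)
raw <- data.frame(
  date_admission = c("07/08/2020", "31/07/2020", "08/02/2020"),
  date_symptoms  = c("27/06/2020", NA, "29/07/2020"),
  stringsAsFactors = FALSE
)

# Tell as.Date() how the text is laid out instead of relying on the default
raw$date_admission <- as.Date(raw$date_admission, format = "%d/%m/%Y")
raw$date_symptoms  <- as.Date(raw$date_symptoms,  format = "%d/%m/%Y")

# Subtracting two Date columns gives a difftime in days;
# as.numeric() turns it into plain numbers
raw$DIF <- as.numeric(raw$date_admission - raw$date_symptoms)
raw$DIF
# [1]   41   NA -172
```

Rows with a missing date simply come out as NA, so no special handling is needed before subtracting.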

diff is the wrong function for calculating the difference between dates. You can subtract the dates directly:

t1$date_admission - t1$date_symptoms
#Time differences in days
# [1]  -20   NA -172   NA   15   -1    3   11    1  -49    1   97   -1    9  101
#[16] -142   -1 -166 -116 -195   60  -18    9    0 -110   -1 -175  -80  -53 -227
#[31]  -14   41   71   13    8   16   16    1    4 -176   59   29   14  102    8
#[46] -143    0 -134   91  186    0   -1   40  -25 -168   60  -83    3   36   41
#[61]  -26   59 -168 -167  121    6 -111   99 -133   -1    6  -51   59 -214
You were probably trying to use difftime:

difftime(t1$date_admission, t1$date_symptoms, units = "days")
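As a small sketch of what difftime returns, the example below uses two made-up vectors, admission and symptoms (illustrative names, not the columns from the question), and shows how the units argument and as.numeric() interact.

```r
# Illustrative vectors, not the data from the question
admission <- as.Date(c("2020-08-07", "2020-08-17"))
symptoms  <- as.Date(c("2020-06-27", "2020-07-31"))

# difftime() lets you choose the unit explicitly
in_days  <- difftime(admission, symptoms, units = "days")
in_weeks <- difftime(admission, symptoms, units = "weeks")

# Drop the difftime class to get ordinary numbers
days_num <- as.numeric(in_days)
days_num
# [1] 41 17

# The same interval can be re-expressed in another unit on conversion
weeks_num <- as.numeric(in_days, units = "weeks")
```

Here weeks_num holds the same values as as.numeric(in_weeks), so the unit can be fixed either when calling difftime or when converting to numeric.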

The diff function subtracts consecutive values. For example:

diff(c(5, 9, 4, 5))
#[1]  4 -5  1

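which is computed as (9 - 5 = 4), (4 - 9 = -5), (5 - 4 = 1). In your case, you first subtracted the dates and then applied diff to the result, which gives the differences between consecutive numbers rather than the interval for each row.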
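To make the contrast concrete, here is a minimal sketch with three made-up rows (the vectors admission and symptoms are illustrative, not the posted data). It compares the row-wise gap with what diff produces when applied on top of it.

```r
# Illustrative vectors, not the data from the question
admission <- as.Date(c("2020-08-07", "2020-08-17", "2020-08-24"))
symptoms  <- as.Date(c("2020-06-27", "2020-07-31", "2020-08-19"))

# Row-wise gap between the two columns: one value per row
gap <- as.numeric(admission - symptoms)
gap
# [1] 41 17  5

# diff() on that result compares each row with the previous one instead,
# so it returns one value fewer and answers a different question
gap_changes <- diff(gap)
gap_changes
# [1] -24 -12
```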

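That's it. Thank you very much.

Hey. I tried using difftime and t1$date_admission - t1$date_symptoms, but neither works. They return exactly the same values as in the post (-20, NA, -172, and so on).

@dairelix Can you update the post with the expected output? The dput you shared differs from the data you showed: t1$date_admission[1] is "2020-06-07" in your data, whereas you showed it as (2020-08-07). Can you check your data again?

t1$data_int_uti[1] is still "2020-06-07". Nothing in the data has changed.

@dairelix For me, t1$data_inicio_sint - t1$data_int_uti gives 20 NA 172 NA -15 .... Isn't that what you want? difftime gives the same numbers.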
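The sign issue discussed in these comments can be reproduced with the first three values of each column from the posted dput. This is only a sketch: the vectors are rebuilt by hand from the day counts rather than read from the original data frame.

```r
# First three entries of each posted column, rebuilt from their day counts
# (structure(..., class = "Date") mirrors how dput() stores dates)
data_int_uti     <- structure(c(18420, 18512, 18300), class = "Date")
data_inicio_sint <- structure(c(18440, NA, 18472), class = "Date")

format(data_int_uti[1])      # "2020-06-07"
format(data_inicio_sint[1])  # "2020-06-27"

# The operand order decides the sign of the result
adm_minus_sym <- as.numeric(data_int_uti - data_inicio_sint)
sym_minus_adm <- as.numeric(data_inicio_sint - data_int_uti)
adm_minus_sym
# [1]  -20   NA -172
sym_minus_adm
# [1]  20  NA 172
```

If only the size of the gap matters, abs() on either result would give the same non-negative values.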