How to subtract date columns in R when the date format is not recognized?
Tags: r, excel, dataframe, date, type-conversion

I tried to do this beforehand, but without success.

Expected result: the time interval between the symptom date and the admission date, in days, e.g. (2020-06-07) - (2020-06-27) = 20 days, so the output looks like

[1] 20

and so on. Any light would be much appreciated.

Here is the dput:

dput(t1)
structure(list(data_int_uti = structure(c(18420, 18512, 18300,
18489, 18489, 18492, 18499, 18503, 18443, 18389, 18458, 18488,
18270, 18514, 18605, 18299, 18512, 18300, 18489, 18301, 18420,
18420, 18472, 18490, 18495, 18496, 18268, 18359, 18299, 18270,
18420, 18513, 18543, 18488, 18501, 18504, 18459, 18466, 18519,
18268, 18359, 18450, 18450, 18542, 18468, 18329, 18505, 18299,
18420, 18487, 18498, 18268, 18461, 18520, 18268, 18450, 18460,
18470, 18488, 18544, 18519, 18360, 18268, 18268, 18603, 18470,
18360, 18490, 18299, 18450, 18463, 18493, 18330, 18391), class = "Date"),
data_inicio_sint = structure(c(18440, NA, 18472, NA, 18474,
18493, 18496, 18492, 18442, 18438, 18457, 18391, 18271, 18505,
18504, 18441, 18513, 18466, 18605, 18496, 18360, 18438, 18463,
18490, 18605, 18497, 18443, 18439, 18352, 18497, 18434, 18472,
18472, 18475, 18493, 18488, 18443, 18465, 18515, 18444, 18300,
18421, 18436, 18440, 18460, 18472, 18505, 18433, 18329, 18301,
18498, 18269, 18421, 18545, 18436, 18390, 18543, 18467, 18452,
18503, 18545, 18301, 18436, 18435, 18482, 18464, 18471, 18391,
18432, 18451, 18457, 18544, 18271, 18605), class = "Date")),
row.names = c(NA, -74L), class = c("tbl_df", "tbl", "data.frame"))
One solution is to use dplyr and convert the columns to dates:
library(dplyr)
# example data
t1 <- data.frame(date_admission = c("2020-08-07","2020-07-31","2020-02-08","2020-08-15","2020-08-17","2020-08-24","2020-08-27","2020-10-09","2020-01-07"),
date_symptoms = c( "2020-06-27", NA ,"2020-07-29", NA , "2020-07-31", "2020-08-19", "2020-08-22", "2020-08-18", "2020-06-29"))
# calculation (convert all columns to Date and subtract, following your example)
t1 %>%
dplyr::mutate_all(~ as.Date(.)) %>%
dplyr::mutate(DIF = date_admission - date_symptoms)
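If the columns arrive as text in a layout that as.Date does not parse by default (for example day/month/year strings from an Excel export), an explicit format string may be needed. A minimal base R sketch, assuming hypothetical day/month/year input; the data frame `raw` and its values are made up for illustration and are not the question's data:

```r
# Hypothetical raw data: dates stored as day/month/year text (an assumption)
raw <- data.frame(
  date_admission = c("07/06/2020", "07/09/2020"),
  date_symptoms  = c("27/06/2020", NA),
  stringsAsFactors = FALSE
)

# Parse with an explicit format; missing values stay NA
raw$date_admission <- as.Date(raw$date_admission, format = "%d/%m/%Y")
raw$date_symptoms  <- as.Date(raw$date_symptoms,  format = "%d/%m/%Y")

# Interval in days as plain numbers (symptoms minus admission)
raw$DIF <- as.numeric(raw$date_symptoms - raw$date_admission)
print(raw)
```

Once both columns have class Date, the subtraction itself works the same way as in the dplyr version.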
diff is the wrong function for calculating the difference between dates. You can subtract the dates directly:
t1$date_admission - t1$date_symptoms
#Time differences in days
# [1] -20 NA -172 NA 15 -1 3 11 1 -49 1 97 -1 9 101
#[16] -142 -1 -166 -116 -195 60 -18 9 0 -110 -1 -175 -80 -53 -227
#[31] -14 41 71 13 8 16 16 1 4 -176 59 29 14 102 8
#[46] -143 0 -134 91 186 0 -1 40 -25 -168 60 -83 3 36 41
#[61] -26 59 -168 -167 121 6 -111 99 -133 -1 6 -51 59 -214
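The sign of the result depends on the order of the operands. The negative values above come from subtracting the later symptom date from the earlier admission date; swapping the operands gives the positive 20 that the question expects for the first row. A small sketch using the first three rows of the dput, with the dates written out as strings for readability (the variable names are illustrative):

```r
# First three rows of the posted data, written as ISO date strings
admission <- as.Date(c("2020-06-07", "2020-09-07", "2020-02-08"))
symptoms  <- as.Date(c("2020-06-27", NA, "2020-07-29"))

# Symptoms minus admission: a difftime object measured in days
delta <- symptoms - admission

# Drop the difftime class to get a plain numeric vector of days
days <- as.numeric(delta, units = "days")
print(days)
```

Wrapping the result in abs() would be another option if only the length of the interval matters and not its direction.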
You may have been trying to use difftime:
difftime(t1$date_admission, t1$date_symptoms, units = "days")
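The units argument of difftime controls how the interval is reported, which plain subtraction does not let you choose. A short sketch with a single pair of dates taken from the question's example (variable names are illustrative):

```r
admission <- as.Date("2020-06-07")
symptoms  <- as.Date("2020-06-27")

# Same interval, reported in two different units
d_days  <- difftime(symptoms, admission, units = "days")
d_weeks <- difftime(symptoms, admission, units = "weeks")
print(d_days)
print(d_weeks)

# Convert to a bare number when the unit label is not wanted
n_days <- as.numeric(d_days, units = "days")
```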
The diff function subtracts consecutive values. For example, see:
diff(c(5, 9, 4, 5))
#[1] 4 -5 1
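which is calculated as (9-5=4), (4-9=-5) and (5-4=1). In your case, you first subtract the dates and then apply diff to the result, which gives the differences between consecutive numbers.

That's it. Thank you very much.

Hey. I tried using difftime and t1$date_admission - t1$date_symptoms, but neither works. They return exactly the same values as in the post (-20, NA, -172 and so on).

@dairelix Can you update the post with your expected output? The dput you shared is different from the data you showed. t1$date_admission[1] is "2020-06-07" in your data, whereas you have shown it as (2020-08-07). Can you check your data again?

t1$data_int_uti[1] is still "2020-06-07". Nothing has changed in the data.

@dairelix For me, t1$data_inicio_sint - t1$data_int_uti gives 20 NA 172 NA -15 .... Isn't that what you want?

Same numbers with difftime.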
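To make the contrast between diff and element-wise subtraction concrete, here is a small sketch on two Date vectors. The values are hypothetical (only the first and third pairs match rows of the posted data), and the variable names are illustrative:

```r
# Hypothetical dates; complete cases only, to keep the comparison simple
admission <- as.Date(c("2020-06-07", "2020-09-07", "2020-02-08"))
symptoms  <- as.Date(c("2020-06-27", "2020-06-30", "2020-07-29"))

# Element-wise: one interval per row, same length as the input
elementwise <- symptoms - admission

# diff(): gaps between consecutive entries of ONE vector, one element shorter
consecutive <- diff(admission)

print(elementwise)
print(consecutive)
print(c(length(elementwise), length(consecutive)))
```

The row-wise interval between two columns is the first form; diff only compares a value with its neighbour in the same vector.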