How to get values based on a date range in R
I have two data frames, df1 and df2. df1 has two record dates (a baseline date and a follow-up date). Scenario 1: first, I need to match Record_Date_1 exactly against Drug_Date; if they match, the drug name should be filled in for that date (e.g. PID = 345). Scenario 2: if the dates do not match, I have to take the earliest drug date for that PID within a date range, from Record_Date_1 - 7 days to Record_Date_1 + 45 days. The sample data (df1 first, then df2) and the expected output are given below.
PID Record_Date_1 D1 Record_Date_2 D2
123 22-04-1996 5.3 30-10-1996 5.4
234 16-06-1994 6.8 13-12-1994 7.2
345 18-09-2000 7.5 24-02-2001 8.9
456 20-02-2001 8.5 20-08-2001 9.4
PID Drug_Date Drugs
123 23-04-1996 Biguanides
123 28-04-1996 Sulphynureas
123 31-10-1996 SGLT2
234 15-06-1994 Insulin
234 14-12-1994 Biguanides
345 18-09-2000 DPP4-inhibitor
345 24-02-2001 Incretin
456 21-02-2001 Biguanides
456 26-08-2001 Sulphynureas
Expected output:
PID Record_Date_1 D1 Record_Date_2 D2 Drug_Date1 D1_Drugs Drug_Date2 D2_Drugs
123 22-04-1996 5.3 30-10-1996 5.4 23-04-1996 Biguanides 31-10-1996 sulphynureas
234 16-06-1994 6.8 13-12-1994 7.2 15-06-1994 Insulin 14-12-1994 Biguanides
345 18-09-2000 7.5 24-02-2001 8.9 18-09-2000 DPP4-inhibitor 24-02-2001 Incretin
456 20-02-2001 8.5 20-08-2001 9.4 21-02-2001 Biguanides 26-08-2001 sulphynureas
Please let me know if you need any clarification.
Thanks in advance.
Consider a function like this:
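my_match <- function(x, y) {
  # For one record date i, return the index of the earliest drug date in
  # [i - 7 days, i + 45 days], or NA if no drug date falls in that range.
  f <- function(i, j) {
    pos <- which(j >= i - 7L & j <= i + 45L)
    if (length(pos) == 0L) return(NA_integer_)
    pos[[which.min(j[pos])]]
  }
  x <- as.Date(x, "%d-%m-%Y")
  y <- as.Date(y, "%d-%m-%Y")
  # Scenario 1: exact date match (NA where there is none).
  out <- match(x, y)
  # Scenario 2: fall back to the date-range lookup where no exact match exists.
  # Note: y is searched as a whole; PID is not taken into account here.
  ifelse(is.na(out), vapply(x, f, integer(1L), y), out)
}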
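To make the date-range step concrete, here is a small worked example for PID 123 and its first record date, using only values from the sample data. The variable names (`record_date`, `drug_dates`, `window`, `in_window`) are introduced here for illustration and are not part of the answer.

```r
# Worked example for PID 123, Record_Date_1 (values copied from the sample data).
record_date <- as.Date("22-04-1996", "%d-%m-%Y")
drug_dates  <- as.Date(c("23-04-1996", "28-04-1996", "31-10-1996"), "%d-%m-%Y")

# Lower and upper bound of the range: record date - 7 days, record date + 45 days.
window <- record_date + c(-7L, 45L)
in_window <- drug_dates >= window[1] & drug_dates <= window[2]

format(window, "%d-%m-%Y")                       # "15-04-1996" "06-06-1996"
format(min(drug_dates[in_window]), "%d-%m-%Y")   # "23-04-1996"
```

Two drug dates fall inside the range (23-04-1996 and 28-04-1996), and the earlier one is kept, which agrees with the expected Drug_Date1 for PID 123.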
Applied to the data (df1 and df2, whose dput() output is given at the end):
# Compute the matching row of df2 once per record date, then reuse it.
idx1 <- my_match(df1$Record_Date_1, df2$Drug_Date)
idx2 <- my_match(df1$Record_Date_2, df2$Drug_Date)
df1$Drug_Date1 <- df2$Drug_Date[idx1]
df1$D1_Drug <- df2$Drugs[idx1]
df1$Drug_Date2 <- df2$Drug_Date[idx2]
df1$D2_Drug <- df2$Drugs[idx2]
> as.data.frame(df1)
PID Record_Date_1 D1 Record_Date_2 D2 Drug_Date1 D1_Drug Drug_Date2 D2_Drug
1 123 22-04-1996 5.3 30-10-1996 5.4 23-04-1996 Biguanides 31-10-1996 SGLT2
2 234 16-06-1994 6.8 13-12-1994 7.2 15-06-1994 Insulin 14-12-1994 Biguanides
3 345 18-09-2000 7.5 24-02-2001 8.9 18-09-2000 DPP4-inhibitor 24-02-2001 Incretin
4 456 20-02-2001 8.5 20-08-2001 9.4 21-02-2001 Biguanides 26-08-2001 Sulphynureas
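One caveat about the result above: `my_match()` searches all of `df2$Drug_Date` without looking at PID. In the sample data, PID 345's drug date 24-02-2001 falls inside the range for PID 456's first record date (20-02-2001), and the correct row is returned only because PID 456's own date, 21-02-2001, is earlier. A minimal sketch of a PID-aware variant follows. The helper `my_match_by_pid()` and its arguments `before` and `after` are hypothetical names introduced here, not part of the original answer.

```r
# Hypothetical helper: like my_match(), but only considers rows of the drug
# table whose PID equals the PID of the record being matched.
my_match_by_pid <- function(pid_x, x, pid_y, y, before = 7L, after = 45L) {
  x <- as.Date(x, "%d-%m-%Y")
  y <- as.Date(y, "%d-%m-%Y")
  vapply(seq_along(x), function(k) {
    same_pid <- which(pid_y == pid_x[k])
    # Scenario 1: exact date match within the same PID.
    exact <- same_pid[y[same_pid] == x[k]]
    if (length(exact) > 0L) return(exact[1L])
    # Scenario 2: earliest drug date in [x - before, x + after] for that PID.
    pos <- same_pid[y[same_pid] >= x[k] - before & y[same_pid] <= x[k] + after]
    if (length(pos) == 0L) return(NA_integer_)
    pos[which.min(y[pos])]
  }, integer(1L))
}

# Sample data copied from the question (first record date only).
df1 <- data.frame(
  PID = c(123, 234, 345, 456),
  Record_Date_1 = c("22-04-1996", "16-06-1994", "18-09-2000", "20-02-2001")
)
df2 <- data.frame(
  PID = c(123, 123, 123, 234, 234, 345, 345, 456, 456),
  Drug_Date = c("23-04-1996", "28-04-1996", "31-10-1996", "15-06-1994",
                "14-12-1994", "18-09-2000", "24-02-2001", "21-02-2001",
                "26-08-2001"),
  Drugs = c("Biguanides", "Sulphynureas", "SGLT2", "Insulin", "Biguanides",
            "DPP4-inhibitor", "Incretin", "Biguanides", "Sulphynureas")
)

idx <- my_match_by_pid(df1$PID, df1$Record_Date_1, df2$PID, df2$Drug_Date)
df2$Drugs[idx]  # "Biguanides" "Insulin" "DPP4-inhibitor" "Biguanides"
```

The same call with `df1$Record_Date_2` would fill the second pair of columns. Rows with no drug date in range get `NA` instead of raising an error.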
Data (df1, shown after the drug columns were added):
df1 <- structure(list(PID = c(123, 234, 345, 456), Record_Date_1 = c("22-04-1996",
"16-06-1994", "18-09-2000", "20-02-2001"), D1 = c(5.3, 6.8, 7.5,
8.5), Record_Date_2 = c("30-10-1996", "13-12-1994", "24-02-2001",
"20-08-2001"), D2 = c(5.4, 7.2, 8.9, 9.4), Drug_Date1 = c("23-04-1996",
"15-06-1994", "18-09-2000", "21-02-2001"), D1_Drug = c("Biguanides",
"Insulin", "DPP4-inhibitor", "Biguanides"), Drug_Date2 = c("31-10-1996",
"14-12-1994", "24-02-2001", "26-08-2001"), D2_Drug = c("SGLT2",
"Biguanides", "Incretin", "Sulphynureas")), row.names = c(NA,
-4L), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"), spec = structure(list(
cols = list(PID = structure(list(), class = c("collector_double",
"collector")), Record_Date_1 = structure(list(), class = c("collector_character",
"collector")), D1 = structure(list(), class = c("collector_double",
"collector")), Record_Date_2 = structure(list(), class = c("collector_character",
"collector")), D2 = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 2), class = "col_spec"))
Data (df2):
df2 <- structure(list(PID = c(123, 123, 123, 234, 234, 345, 345, 456,
456), Drug_Date = c("23-04-1996", "28-04-1996", "31-10-1996",
"15-06-1994", "14-12-1994", "18-09-2000", "24-02-2001", "21-02-2001",
"26-08-2001"), Drugs = c("Biguanides", "Sulphynureas", "SGLT2",
"Insulin", "Biguanides", "DPP4-inhibitor", "Incretin", "Biguanides",
"Sulphynureas")), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -9L), spec = structure(list(cols = list(
PID = structure(list(), class = c("collector_double", "collector"
)), Drug_Date = structure(list(), class = c("collector_character",
"collector")), Drugs = structure(list(), class = c("collector_character",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 2), class = "col_spec"))