How do I merge two time series datasets in R?
I have two time series datasets. The first contains the drug name, the start date, the stop date, and the dose. The second contains the visit date and a score.
data1<- data.frame("Drug Name" = c("Drug1","Drug1","Drug2","Drug1","Drug3","Drug2",
"Drug4","Drug5","Drug1"),
"Start Date" = c("7/1/2016","1/1/2016", "8/6/2015","2/1/2015","6/14/2017",
"6/21/2017","1/24/2018","7/30/2018","7/30/2018"),
"Stop Date "=c("1/14/2017","1/14/2017", "1/14/2017","1/14/2017"
,"1/24/2018","6/29/2018","6/29/2018","Ongoing","Ongoing"),
"Dose"=c(12,20,32,3,5,6,6,8,9))
data2<-data.frame("visitdate"=c("8/24/2016","8/24/2016", "10/19/2016","12/7/2016","12/21/2016",
"3/22/2017","5/10/2017", "6/14/2017", "7/12/2017","9/27/2017",
"11/29/2017", "1/24/2018","3/21/2018","5/30/2018","8/15/2018",
"10/3/2018", "11/28/2018"),
"Score"=c(1,2,3,34,6,7,9,5,6,8,5,5,7,9,1,2,5))
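Some preprocessing of the data may be needed first.
First, the example above has column names that contain spaces, which are best avoided. I edited the names for this example and removed the spaces. You also have "Ongoing" as a date. I would suggest converting the columns to dates with as.Date. After the conversion, however, the entries with "Ongoing" become NA. You can set them to Inf (infinity), and that will work.
For example: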
data1$StartDate <- as.Date(data1$StartDate, format = "%m/%d/%Y")
data1$StopDate <- as.Date(data1$StopDate, format = "%m/%d/%Y")
data2$VisitDate <- as.Date(data2$VisitDate, format = "%m/%d/%Y")
data1$StopDate[8:9] <- Inf
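As an illustration of the merge step, here is a minimal base-R sketch of one way to build a wide table like the one shown below. It is not necessarily the approach originally used. It assumes a drug counts as active at a visit when StartDate <= VisitDate <= StopDate (both ends inclusive), and it treats a missing StopDate as "still ongoing" instead of relying on Inf. The names active, n_active, and result are made up for this example.

```r
# Example data with space-free column names; "Ongoing" is stored as NA
data1 <- data.frame(
  DrugName  = c("Drug1", "Drug1", "Drug2", "Drug1", "Drug3", "Drug2",
                "Drug4", "Drug5", "Drug1"),
  StartDate = as.Date(c("7/1/2016", "1/1/2016", "8/6/2015", "2/1/2015",
                        "6/14/2017", "6/21/2017", "1/24/2018", "7/30/2018",
                        "7/30/2018"), format = "%m/%d/%Y"),
  StopDate  = as.Date(c("1/14/2017", "1/14/2017", "1/14/2017", "1/14/2017",
                        "1/24/2018", "6/29/2018", "6/29/2018", NA, NA),
                      format = "%m/%d/%Y"),
  Dose      = c(12, 20, 32, 3, 5, 6, 6, 8, 9),
  stringsAsFactors = FALSE
)
data2 <- data.frame(
  VisitDate = as.Date(c("8/24/2016", "8/24/2016", "10/19/2016", "12/7/2016",
                        "12/21/2016", "3/22/2017", "5/10/2017", "6/14/2017",
                        "7/12/2017", "9/27/2017", "11/29/2017", "1/24/2018",
                        "3/21/2018", "5/30/2018", "8/15/2018", "10/3/2018",
                        "11/28/2018"), format = "%m/%d/%Y"),
  Score     = c(1, 2, 3, 34, 6, 7, 9, 5, 6, 8, 5, 5, 7, 9, 1, 2, 5)
)

# For every visit, find the rows of data1 whose interval covers the visit date.
# Assumption: an NA StopDate means the drug is still being taken.
active <- lapply(seq_len(nrow(data2)), function(i) {
  v <- data2$VisitDate[i]
  which(data1$StartDate <= v & (is.na(data1$StopDate) | data1$StopDate >= v))
})
n_active <- lengths(active)

# Start from the visits and append one DrugName_k / Dose_k pair per slot
result <- data.frame(VisitDate = data2$VisitDate,
                     Score     = data2$Score,
                     NumDrugs  = n_active)
for (k in seq_len(max(n_active))) {
  # k-th active drug for each visit, or NA if fewer than k drugs are active
  idx <- vapply(active,
                function(a) if (length(a) >= k) a[k] else NA_integer_,
                integer(1))
  result[[paste0("DrugName_", k)]] <- data1$DrugName[idx]
  result[[paste0("Dose_", k)]]     <- data1$Dose[idx]
}

print(result)
```

With more than one patient, the same filter would also need to match on a patient identifier, which the example data does not include.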
Output
VisitDate Score NumDrugs DrugName_1 Dose_1 DrugName_2 Dose_2 DrugName_3 Dose_3 DrugName_4 Dose_4
<date> <dbl> <dbl> <chr> <dbl> <chr> <dbl> <chr> <dbl> <chr> <dbl>
1 2016-08-24 1 4 Drug1 12 Drug1 20 Drug2 32 Drug1 3
2 2016-08-24 2 4 Drug1 12 Drug1 20 Drug2 32 Drug1 3
3 2016-10-19 3 4 Drug1 12 Drug1 20 Drug2 32 Drug1 3
4 2016-12-07 34 4 Drug1 12 Drug1 20 Drug2 32 Drug1 3
5 2016-12-21 6 4 Drug1 12 Drug1 20 Drug2 32 Drug1 3
6 2017-03-22 7 0 NA NA NA NA NA NA NA NA
7 2017-05-10 9 0 NA NA NA NA NA NA NA NA
8 2017-06-14 5 1 Drug3 5 NA NA NA NA NA NA
9 2017-07-12 6 2 Drug3 5 Drug2 6 NA NA NA NA
10 2017-09-27 8 2 Drug3 5 Drug2 6 NA NA NA NA
11 2017-11-29 5 2 Drug3 5 Drug2 6 NA NA NA NA
12 2018-01-24 5 3 Drug3 5 Drug2 6 Drug4 6 NA NA
13 2018-03-21 7 2 Drug2 6 Drug4 6 NA NA NA NA
14 2018-05-30 9 2 Drug2 6 Drug4 6 NA NA NA NA
15 2018-08-15 1 2 Drug5 8 Drug1 9 NA NA NA NA
16 2018-10-03 2 2 Drug5 8 Drug1 9 NA NA NA NA
17 2018-11-28 5 2 Drug5 8 Drug1 9 NA NA NA NA
Comments:
Do you want a time series or a data frame (as the data structure)? Your question shows two data frames. There is also no column identifying the patient, which may be needed.
I want a data frame like the one in the picture I uploaded. Both datasets belong to a single patient who had multiple visits. My main dataset has 100 observations.
How is the score of 5 obtained? Score is a column in the second dataset.

Data
(before converting the dates)
data1 <- structure(list(DrugName = c("Drug1", "Drug1", "Drug2", "Drug1",
"Drug3", "Drug2", "Drug4", "Drug5", "Drug1"), StartDate = c("7/1/2016",
"1/1/2016", "8/6/2015", "2/1/2015", "6/14/2017", "6/21/2017",
"1/24/2018", "7/30/2018", "7/30/2018"), StopDate = c("1/14/2017",
"1/14/2017", "1/14/2017", "1/14/2017", "1/24/2018", "6/29/2018",
"6/29/2018", NA, NA), Dose = c(12, 20, 32, 3, 5, 6, 6, 8, 9)), class = "data.frame", row.names = c(NA,
-9L))
data2 <- structure(list(VisitDate = c("8/24/2016", "8/24/2016", "10/19/2016",
"12/7/2016", "12/21/2016", "3/22/2017", "5/10/2017", "6/14/2017",
"7/12/2017", "9/27/2017", "11/29/2017", "1/24/2018", "3/21/2018",
"5/30/2018", "8/15/2018", "10/3/2018", "11/28/2018"), Score = c(1,
2, 3, 34, 6, 7, 9, 5, 6, 8, 5, 5, 7, 9, 1, 2, 5)), class = "data.frame", row.names = c(NA,
-17L))