Find overlapping timelines for each group in R
R gurus, I need your help identifying multiple overlapping timelines in R using tidyverse/dplyr. Here is the dataset:
library(tidyverse)
library(googleVis)

df <- data.frame(
  Student = structure(c(rep("Allan", 5), rep("Joan", 5), rep("Kat", 5)), class = "character"),
  Course  = c(LETTERS[1:5], LETTERS[1:5], LETTERS[1:5]),
  Start   = structure(c(16713, 16768, 16725, 16758, 16780,
                        16714, 16754, 16765, 16729, 16785,
                        16724, 16730, 16755, 16760, 16759), class = "Date"),
  End     = structure(c(16733, 16775, 16755, 16779, 16790,
                        16744, 16762, 16780, 16760, 16795,
                        16744, 16750, 16758, 16784, 16798), class = "Date"))

plot(gvisTimeline(data = df, rowlabel = "Course",
                  start = "Start", end = "End",
                  options = list(width = 600, height = 1000)))
I would like to compute the following result in an overlap column:
df$overlap <- c("AC","BD","AC","BD","",
"AD","BD","","ABD","",
"AB","AB","","DE","DE")
df
   Student Course      Start        End overlap
1    Allan      A 2015-10-05 2015-10-25      AC
2    Allan      B 2015-11-29 2015-12-06      BD
3    Allan      C 2015-10-17 2015-11-16      AC
4    Allan      D 2015-11-19 2015-12-10      BD
5    Allan      E 2015-12-11 2015-12-21
6     Joan      A 2015-10-06 2015-11-05      AD
7     Joan      B 2015-11-15 2015-11-23      BD
8     Joan      C 2015-11-26 2015-12-11
9     Joan      D 2015-10-21 2015-11-21     ABD
10    Joan      E 2015-12-16 2015-12-26
11     Kat      A 2015-10-16 2015-11-05      AB
12     Kat      B 2015-10-22 2015-11-11      AB
13     Kat      C 2015-11-16 2015-11-19
14     Kat      D 2015-11-21 2015-12-15      DE
15     Kat      E 2015-11-20 2015-12-29      DE
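Many thanks for your time and help.

A solution using map2_chr: if needed, you can replace the single-character entries in the overlap column with blanks.

Alternatively, you can set up interval objects with lubridate and use int_overlaps to test whether two intervals overlap: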
library(tidyverse)
library(lubridate)
df %>%
  group_by(Student) %>%
  mutate(overlap = map_chr(interval(Start, End),
                           ~ toString(Course[int_overlaps(., interval(Start, End))])))
# Student Course Start End overlap
# <fct> <fct> <date> <date> <chr>
# 1 Allan A 2015-10-05 2015-10-25 A, C
# 2 Allan B 2015-11-29 2015-12-06 B, D
# 3 Allan C 2015-10-17 2015-11-16 A, C
# 4 Allan D 2015-11-19 2015-12-10 B, D
# 5 Allan E 2015-12-11 2015-12-21 E
# 6 Joan A 2015-10-06 2015-11-05 A, D
# 7 Joan B 2015-11-15 2015-11-23 B, D
# 8 Joan C 2015-11-26 2015-12-11 C
# 9 Joan D 2015-10-21 2015-11-21 A, B, D
# 10 Joan E 2015-12-16 2015-12-26 E
# 11 Kat A 2015-10-16 2015-11-05 A, B
# 12 Kat B 2015-10-22 2015-11-11 A, B
# 13 Kat C 2015-11-16 2015-11-19 C
# 14 Kat D 2015-11-21 2015-12-15 D, E
# 15 Kat E 2015-11-20 2015-12-29 D, E
Joan's course D also overlaps with A.
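
The map2_chr idea and the note about blanking single-character entries could be sketched roughly as follows. This is an illustrative sketch, not the answer's original code: it assumes that a plain date comparison (two closed ranges overlap when each starts on or before the other ends) is an acceptable overlap test, and the names `df` and `result` are only used here for the example. It also pastes the course letters together without separators so the output has the "AC" shape asked for in the question.

```r
library(dplyr)
library(purrr)

# Same data as in the question, rebuilt here so the sketch is self-contained
df <- data.frame(
  Student = rep(c("Allan", "Joan", "Kat"), each = 5),
  Course  = rep(LETTERS[1:5], times = 3),
  Start   = structure(c(16713, 16768, 16725, 16758, 16780,
                        16714, 16754, 16765, 16729, 16785,
                        16724, 16730, 16755, 16760, 16759), class = "Date"),
  End     = structure(c(16733, 16775, 16755, 16779, 16790,
                        16744, 16762, 16780, 16760, 16795,
                        16744, 16750, 16758, 16784, 16798), class = "Date"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(Student) %>%
  mutate(
    # For each row (.x = its Start, .y = its End), keep the courses in the
    # same group whose range touches it: they start on or before this row
    # ends and end on or after this row starts (assumed overlap rule)
    overlap = map2_chr(Start, End,
                       ~ paste(Course[Start <= .y & End >= .x], collapse = "")),
    # A single letter means the course only matched itself, so blank it out
    overlap = if_else(nchar(overlap) == 1, "", overlap)
  ) %>%
  ungroup()

print(result)
```

With the question's data this should reproduce the requested overlap column, including "ABD" for Joan's course D. If the courses were not already sorted within each student, the letters would appear in row order rather than alphabetical order.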