R: once a date has occurred x times, move to the next available date, by ID number

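A few days ago I posted the following question:

I got a good solution for the data frame given there, but that was a sample dataset in which both the dates and the IDs were in order. Included.y is the ID variable: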

dput(T0range)
structure(list(Included.y = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 
16, 17, 18, 19, 20), V1 = structure(c(18708, 18708, 18708, 18708, 
18708, 18708, 18709, 18709, 18709, 18709, 18715, 18715, 18715, 
18715, 18715), class = "Date"), V2 = structure(c(18709, 18709, 
18709, 18709, 18709, 18709, 18710, 18710, 18710, 18710, 18716, 
18716, 18716, 18716, 18716), class = "Date"), V3 = structure(c(18710, 
18710, 18710, 18710, 18710, 18710, 18711, 18711, 18711, 18711, 
18717, 18717, 18717, 18717, 18717), class = "Date"), V4 = structure(c(18711, 
18711, 18711, 18711, 18711, 18711, NA, NA, NA, NA, 18718, 18718, 
18718, 18718, 18718), class = "Date"), V5 = structure(c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
), class = "Date"), V6 = structure(c(NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), 
    V7 = structure(c(NA, NA, NA, NA, NA, NA, 18715, 18715, 18715, 
    18715, NA, NA, NA, NA, NA), class = "Date"), V8 = structure(c(18715, 
    18715, 18715, 18715, 18715, 18715, 18716, 18716, 18716, 18716, 
    NA, NA, NA, NA, NA), class = "Date"), V9 = structure(c(18716, 
    18716, 18716, 18716, 18716, 18716, 18717, 18717, 18717, 18717, 
    18723, 18723, 18723, 18723, 18723), class = "Date"), V10 = structure(c(18717, 
    18717, 18717, 18717, 18717, 18717, 18718, 18718, 18718, 18718, 
    18724, 18724, 18724, 18724, 18724), class = "Date"), V11 = structure(c(18718, 
    18718, 18718, 18718, 18718, 18718, NA, NA, NA, NA, 18725, 
    18725, 18725, 18725, 18725), class = "Date"), V12 = structure(c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), class = "Date"), V13 = structure(c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), class = "Date"), V14 = structure(c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), class = "Date")), row.names = c(NA, 
-15L), class = "data.frame")
The solution provided for my first question fitted this example very well and gave me the desired output:

dput(df1)
structure(list(Included.y = 1:15, V1 = structure(c(18708, 18708, 
18708, 18709, 18709, 18709, NA, NA, NA, NA, NA, NA, 18715, 18715, 
18715), class = "Date"), V2 = structure(c(NA, NA, NA, NA, NA, 
NA, 18710, 18710, 18710, NA, NA, NA, NA, NA, NA), class = "Date"), 
    V3 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 18711, 
    18711, 18711, NA, NA, NA), class = "Date")), row.names = c(NA, 
-15L), class = c("tbl_df", "tbl", "data.frame"))

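However, that solution does not consider the ID variable at all, only the order of the dates. It would work very well if the participant ID column sat right next to the dates. I need the ID column because in reality some IDs will not be included, and the dates will not be in order. A real data example of the first 20 rows: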

dput:

In my desired output, I would like every participant ID to appear next to its scheduled date. Each date should occur at most 3 times:

structure(list(Included.y = c(72, 108, 165, 205, 472, 530, 574, 
750, 1, 2, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47), V1 = structure(c(18918, 
18918, NA, NA, 18919, 18918, NA, NA, 18793, NA, NA, NA, 18800, 
NA, NA, 18841, 18953, NA, NA, NA), class = "Date"), V2 = structure(c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, 18904, NA, NA, NA, NA, NA, 
NA, NA, NA, 18890), class = "Date"), V3 = structure(c(NA, NA, 
18919, NA, NA, NA, NA, NA, NA, 18911, NA, NA, NA, NA, 18820, 
NA, NA, 18855, 18911, NA), class = "Date"), V4 = structure(c(NA, 
NA, NA, 18919, NA, NA, 18981, 18981, NA, NA, NA, 18974, NA, 18932, 
NA, NA, NA, NA, NA, NA), class = "Date"), V5 = c(NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA), V6 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA), V7 = c(NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), V8 = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA), V9 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA), V10 = c(NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), V11 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA), V12 = c(NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), V13 = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA), V14 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, -20L
), class = c("tbl_df", "tbl", "data.frame"))
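Note that if all 14 available dates of a participant are fully booked, i.e. every one of those dates has already been filled by earlier participants, then all columns stay NA for that participant.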
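The cap of three bookings per date can be checked mechanically on any candidate output. The following is only a minimal sketch in base R: `date_counts` is a hypothetical helper name that does not come from the question, and the small data frame just re-types the first six rows and the columns V1, V3 and V4 of the desired output above.

```r
# Hypothetical helper: count how often each date occurs across all date columns.
date_counts <- function(d, id_col = "Included.y") {
  date_cols <- setdiff(names(d), id_col)
  # Flatten every date column to day numbers (all-NA logical columns become NA).
  days <- unlist(lapply(d[date_cols], as.numeric), use.names = FALSE)
  table(as.Date(days[!is.na(days)], origin = "1970-01-01"))
}

# First six rows of the desired output, columns V1, V3 and V4 only.
out <- data.frame(
  Included.y = c(72, 108, 165, 205, 472, 530),
  V1 = as.Date(c(18918, 18918, NA, NA, 18919, 18918), origin = "1970-01-01"),
  V3 = as.Date(c(NA, NA, 18919, NA, NA, NA), origin = "1970-01-01"),
  V4 = as.Date(c(NA, NA, NA, 18919, NA, NA), origin = "1970-01-01")
)

counts <- date_counts(out)
print(counts)

# Fails with an error if any date is booked more than three times.
stopifnot(all(counts <= 3))
```

Running the same check on the full output would show whether any date exceeds the limit.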


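I really hope I have explained this clearly enough; if not, please let me know what I can do to make it clearer. Thank you very much for your help.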

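This appears to give the solution illustrated in the question.

There are very likely more elegant ways to do this with vectorised code, but I could only get to a solution with a loop.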

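library(dplyr)
library(tidyr)
library(tibble)

# Copy of the original data with rowids added, so that the processed data
# can later be put back in the original order.
# (The name of the input data frame did not survive in this copy; `df` is a placeholder.)
df0 <- df %>%
  rowid_to_column() %>%
  select(rowid, Included.y)

# Prepare the data in long format for the loop, so that rows can be
# extracted by date and id as needed.
df1 <- df %>%
  pivot_longer(-Included.y, names_to = "vis", values_to = "date") %>%
  na.omit() %>%
  arrange(date, vis, Included.y)

# Initialise a tibble for the qualifying data.
# (The initialisation of df2 and the loop itself are missing from this copy.)

#>   Included.y V1         V3         V4
#> 1         72 2021-10-18 NA         NA
#> 2        108 2021-10-18 NA         NA
#> 3        165 NA         2021-10-19 NA
#> 4        205 NA         NA         2021-10-19
#> 5        472 2021-10-19 NA         NA
#> 6        530 2021-10-18 NA         NA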
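Created on 2021-04-10 by the reprex package (v2.0.0)

Hello Peter, the solution works very well, even with a data frame of about 1000 participants. I have run into one problem, though, and loops are definitely not my area of expertise. I have now tested the loop with about 850 participants, with dates ranging from today to two months ahead. I would assume that, because the range is short, some participants cannot be assigned a date. When I run the loop in this case it keeps running, because I believe the variable i never reaches nrow(df). Is there a way to account for this? Thanks.

Without knowing the nature of the actual data, it is hard to say how best to handle this. How many rows are there in the actual dataset? As you say, the most likely explanation is that some participants cannot be assigned a date, and in that case I agree the loop will not end. That becomes a different question! If you can narrow down where the code hangs, so that you can identify the condition that makes the loop hang, then the code can be adjusted to handle it.

The data has about 840 rows. The seq_len of df1 is 6688, and it seems to get stuck when the variable i is 625. I adjusted the while

My best suggestion is to find a small subset of the data that causes these problems and check whether it differs from the data in this question. Either the data needs cleaning, or the loop needs adjusting to allow for situations that the current minimal dataset does not cover.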
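One way to rule out the non-terminating loop discussed in these comments is to iterate with `for` over a fixed number of rows, so that a participant with no free date is simply left as NA instead of blocking progress. The sketch below is not the loop from the answer (which is missing from this copy). It rests on two assumptions: participants are processed in row order, and each one takes the first candidate date, by column order, that has fewer than `cap` bookings. The function name `assign_dates` and the toy data are hypothetical.

```r
# Assumption: rows are handled in the order given; each participant takes the
# first non-NA candidate date (V1, V2, ...) that is booked fewer than `cap` times.
assign_dates <- function(d, id_col = "Included.y", cap = 3) {
  date_cols <- setdiff(names(d), id_col)
  out <- d
  for (col in date_cols) {
    out[[col]] <- rep(as.Date(NA), nrow(d))   # start with nothing booked
  }
  booked <- integer(0)                        # named counter: bookings per date
  for (i in seq_len(nrow(d))) {               # fixed number of iterations
    for (col in date_cols) {
      day <- d[[col]][i]
      if (is.na(day)) next
      key <- as.character(day)
      n <- if (key %in% names(booked)) booked[[key]] else 0L
      if (n < cap) {
        out[[col]][i] <- day                  # keep the date in its own column
        booked[key] <- n + 1L
        break                                 # this participant is done
      }
    }
    # If no candidate date was free, row i stays all NA.
  }
  out
}

# Toy data (not the real dataset): five participants competing for two dates.
d <- data.frame(
  Included.y = c(72, 108, 165, 205, 472),
  V1 = as.Date(c(18918, 18918, 18918, 18918, 18918), origin = "1970-01-01"),
  V2 = as.Date(c(18919, 18919, 18919, 18919, NA), origin = "1970-01-01")
)

res <- assign_dates(d)
print(res)
```

In this toy input the first three participants get their V1 date, the fourth falls through to V2, and the fifth has no free date left and stays NA in every column. Because both loops run over fixed sequences, the function returns even when some participants cannot be scheduled.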