R: Can I get a concise solution for the nested function below?

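The function runs fine on a df with 1,000 to 20,000 cases, but beyond that it takes hours (more than 5 hours), and I now have a df with 57,635,985 observations.

Suppose the df looks like this: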


d<-structure(list(ReviewType= c("Review","Review","Review","Correction","Correction","Review","Review","Review","Review","Review","Correction","Correction","Deficiency","Correction","Correction", 
                                "Correction", "Deficiency", "Deficiency", "Correction","Correction","Deficiency","Correction"),
                  Submissiondate= c("2020-08-29 04:32:00","2020-08-28 04:31:00","2020-08-26 04:31:00","2020-08-25 04:31:00","2020-08-24 04:31:00","2020-08-23 04:31:00","2020-08-22 04:31:00","2020-08-21 04:31:00","2020-08-20 04:31:00","2020-08-19 04:31:00",
                                    "2020-09-27 04:31:00","2020-09-27 03:52:59","2020-09-28 17:30:00","2020-09-29 14:01:00",
                                    "2020-09-05 03:00:00","2020-09-05 03:51:00", "2020-09-03 23:59:49",
                                    "2020-09-02 00:03:54","2020-09-01 00:04:48","2020-10-01 04:31:00","2020-10-11 04:31:00","2020-10-21 04:31:00"),
                  CaseNo= c("124","123","125","121","121","125","123","123","123","123","123","123","123","125","123","123","123","124","123","127","127","127")), class = "data.frame", row.names = c(NA, -22L))


library(dplyr)
d <- d %>% arrange(CaseNo, Submissiondate)


Thanks for your guidance.

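I can get you started from the EndStates data frame. I'm not sure it will be much faster. Because dplyr operates on whole columns at once (rather than working sequentially down the rows), I still need a while() loop to fill in some of the missing weeks. Maybe someone better at dplyr will offer another alternative.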
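The EndStates data frame mentioned above is not shown in this thread. Its column names (CaseNo, week.end, WeekEndState) can be read off the code that uses it, but how it was built is not stated. The following is only a sketch under two assumptions inferred from the sample output: weeks end on Saturday, and a case's state for a week is the ReviewType of its last submission in that week. The name d_toy is hypothetical and holds a small subset of the question's data.

```r
library(dplyr)

# Hypothetical toy subset of the question's data (same column names as d).
d_toy <- data.frame(
  ReviewType = c("Correction", "Correction", "Review", "Correction", "Deficiency"),
  Submissiondate = c("2020-08-24 04:31:00", "2020-08-25 04:31:00",
                     "2020-08-28 04:31:00", "2020-09-27 04:31:00",
                     "2020-09-28 17:30:00"),
  CaseNo = c("121", "121", "123", "123", "123")
)

EndStates <- d_toy %>%
  mutate(
    ts = as.POSIXct(Submissiondate, tz = "UTC"),
    day = as.Date(ts, tz = "UTC"),
    # Assumption: weeks end on Saturday. wday is 0 for Sunday through 6 for
    # Saturday, so this moves each date forward to the Saturday of its week.
    week.end = day + (6 - as.POSIXlt(day)$wday)
  ) %>%
  arrange(CaseNo, ts) %>%
  group_by(CaseNo, week.end) %>%
  # Assumption: the week-end state is the type of the last submission that week.
  summarise(WeekEndState = last(ReviewType), .groups = "drop")
```

The result has one row per case per week in which the case had a submission. The answer's own code, which takes EndStates as input, follows.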

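library(dplyr)
library(tidyr)

# One row per week, one column per CaseNo: 1 if the case ended the week in "Correction", else 0.
cor_df2 <- EndStates %>%
  mutate(count = as.numeric(WeekEndState == "Correction")) %>%
  select(-WeekEndState) %>%
  pivot_wider(id_cols = "week.end", names_from = "CaseNo", values_from = "count") %>%
  arrange(week.end) %>%
  # Cases with no record in the first week start at 0.
  mutate(across(-week.end, function(x) case_when(is.na(x) & week.end == min(week.end) ~ 0, TRUE ~ x)))

# Carry each case's last known state forward, one week per pass, until no NA remains.
while (any(is.na(cor_df2))) {
  cor_df2 <- cor_df2 %>% mutate(across(-week.end, function(x) case_when(is.na(x) ~ lag(x), TRUE ~ x)))
}

# Sum across cases: number of cases in "Correction" at each week end.
cor_df2 <- cor_df2 %>%
  mutate(asw = rowSums(.[-1])) %>%
  select(week.end, asw)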
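One possible way to drop the while() loop is to stay in long format and let tidyr::fill() carry each case's last known state forward in a single pass. This is a sketch, not a benchmarked replacement: the toy EndStates below is hypothetical (only its column names come from the code above), and complete() creates one row per case per observed week, which may itself be large on a very big df.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy input with the columns the code above expects.
EndStates <- data.frame(
  CaseNo = c("121", "123", "123", "125"),
  week.end = as.Date(c("2020-08-29", "2020-08-22", "2020-09-05", "2020-10-03")),
  WeekEndState = c("Correction", "Review", "Correction", "Correction")
)

cor_df3 <- EndStates %>%
  mutate(count = as.numeric(WeekEndState == "Correction")) %>%
  select(-WeekEndState) %>%
  # One row per case per observed week; unobserved combinations become NA.
  complete(CaseNo, week.end) %>%
  arrange(CaseNo, week.end) %>%
  group_by(CaseNo) %>%
  # Carry each case's last known state forward through its missing weeks.
  fill(count, .direction = "down") %>%
  ungroup() %>%
  # Weeks before a case's first record count as 0.
  mutate(count = coalesce(count, 0)) %>%
  group_by(week.end) %>%
  summarise(asw = sum(count), .groups = "drop")
```

On this toy input, cor_df3 has one row per week with asw equal to 0, 1, 2, 3. The filling rule matches the loop version: leading gaps become 0 and later gaps take the previous week's value.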

Thank you so much for your help! Tested it on my 5,000-case df and it looks good. Now for my 5.5 million!
week.end cor_df.asw
1 2020-08-22          0
2 2020-08-29          1
3 2020-09-05          2
4 2020-10-03          3
5 2020-10-17          2
6 2020-10-24          3