R: Row sums of one data frame, based on row-specific dynamic conditions of another

Consider the following data:

Country1 = c("Brazil", "India", "China","China","Brazil")
Date1<-as.Date(c("2001-01-21", "2002-04-13","2003-06-19","2006-06-19","2007-06-19"))
Name1<-c("B","C","A","A","A")
Data1<-data.frame(Country1,Date1,Name1)

Name2<-c("B","B","C","C","C","A","A","A")
Quality2<-c("good","good","medium","good","good","bad","good","good")
Country2<-c("China","Brazil","Taiwan","India","India","United States","China","Brazil")
Date2<-as.Date(c("2002-02-21", "1999-03-13","1998-08-19", "1996-09-13","2000-12-12","1998-07-21","2005-03-22","2003-06-19"))
Data2<-data.frame(Name2,Quality2,Country2,Date2)

We can filter Quality2 to keep only the "good" rows, join the result with Data1, group by Country2, and compute both the number of rows where Date2 < Date1 and the minimum Date2 among those rows:

library(dplyr)

Data2 %>%
  filter(Quality2 == 'good') %>%
  right_join(Data1, by = c('Name2' = 'Name1', 'Country2' = 'Country1')) %>%
  group_by(Country2) %>%
  summarise(Result = sum(Date2 < Date1), 
            Date1 = min(Date2[Date2 < Date1]))

# A tibble: 3 x 3
#  Country2 Result Date1     
#  <chr>     <int> <date>    
#1 Brazil        1 1999-03-13
#2 China         0 NA        
#3 India         2 1996-09-13

For the updated data, we can change the approach and do the following:

Data1 %>%
  left_join(Data2, by = c('Name1' = 'Name2', 'Country1' = 'Country2')) %>%
  group_by(Country1, Date1) %>%
  summarise(Result = sum(Date2 < Date1 & Quality2 == "good"), 
            Date = min(Date2[Date2 < Date1 & Quality2 == "good"]))

#  Country1 Date1      Result Date      
#  <chr>    <date>      <int> <date>    
#1 Brazil   2001-01-21      1 1999-03-13
#2 China    2003-06-19      0 NA        
#3 China    2006-06-19      1 2005-03-22
#4 India    2002-04-13      2 1996-09-13
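Thank you very much for the quick reply. However, Data1 may contain multiple observations with the same Name1. In that case, I am not sure how to add the Result column based on the output of your code. Edit: I have added a fourth row to Data1 in the main post.

Sorry that I keep editing the data to make the case more complicated. In the actual data, several categories of Data1$Name1 share the same Data1$Country1. In the edited post I added a fifth row where Country1 == "Brazil". As mentioned, the actual data runs to thousands of rows, so values of Date1, Country1, or Name1 are repeated in many cases. No entry in Country1 or Date1 is therefore specific to a particular entry in Name1. Row 5 of Data1 is an example (Brazil is a repeated entry). The second question remains the same: how do we add Data1$Result and Data1$Min.Date.Result from the output of your code? Thank you very much for your patience and help.

@KashifAhmed I am not sure how this edit changes anything when applying the answer. Did you try the answer? What did it return? Perhaps you want to add Name1 as another group in group_by? As for the second question, it is already covered by the answer: you need to assign the result, like Data3 <- Data1 %>% left_join(Data2, by = ...) followed by the rest of the answer. In my answer, Min.Date.Result is called Date. Check Data3.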
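
The suggestions in the comments above (add Name1 to the grouping, assign the result to Data3) can be sketched as follows. This is only an illustration under stated assumptions, not the answerer's exact code: the helper column ok and the explicit guard against an empty selection are additions made here, the data frames are rebuilt with stringsAsFactors = FALSE so the join keys are plain character vectors, and .groups = "drop" assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Example data from the post, rebuilt so the block is self-contained
Data1 <- data.frame(
  Country1 = c("Brazil", "India", "China", "China", "Brazil"),
  Date1 = as.Date(c("2001-01-21", "2002-04-13", "2003-06-19",
                    "2006-06-19", "2007-06-19")),
  Name1 = c("B", "C", "A", "A", "A"),
  stringsAsFactors = FALSE
)
Data2 <- data.frame(
  Name2 = c("B", "B", "C", "C", "C", "A", "A", "A"),
  Quality2 = c("good", "good", "medium", "good", "good", "bad", "good", "good"),
  Country2 = c("China", "Brazil", "Taiwan", "India", "India",
               "United States", "China", "Brazil"),
  Date2 = as.Date(c("2002-02-21", "1999-03-13", "1998-08-19", "1996-09-13",
                    "2000-12-12", "1998-07-21", "2005-03-22", "2003-06-19")),
  stringsAsFactors = FALSE
)

Data3 <- Data1 %>%
  left_join(Data2, by = c("Name1" = "Name2", "Country1" = "Country2")) %>%
  # ok (hypothetical helper column): a matched, good-quality, earlier Data2 row;
  # unmatched Data1 rows have NA in Date2 and are treated as FALSE
  mutate(ok = !is.na(Date2) & Quality2 == "good" & Date2 < Date1) %>%
  # Name1 is added to the grouping so repeated countries and dates stay separate
  group_by(Name1, Country1, Date1) %>%
  summarise(
    Result = sum(ok),
    # return NA instead of calling min() on an empty selection
    Min.Date.Result = if (any(ok)) min(Date2[ok]) else as.Date(NA),
    .groups = "drop"
  )

# Attach the two new columns to the original rows of Data1
Data1 <- left_join(Data1, Data3, by = c("Name1", "Country1", "Date1"))
```

Grouping by all three keys gives one summary row per distinct (Name1, Country1, Date1) combination, so the final left_join adds Result and Min.Date.Result without changing the row order of Data1. If Data1 contains exact duplicate rows, they would share one summary row and each receive the same values.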
Row-by-row expressions for the desired count (rows 1, 2, 54342, and a general row n of Data1):

sum(Data2$Name2==as.character(Data1$Name1)[1] & Data2$Country2==as.character(Data1$Country1)[1] & Data2$Quality2=="good" & Data2$Date2 < Data1$Date1[1])
sum(Data2$Name2==as.character(Data1$Name1)[2] & Data2$Country2==as.character(Data1$Country1)[2] & Data2$Quality2=="good" & Data2$Date2 < Data1$Date1[2])
sum(Data2$Name2==as.character(Data1$Name1)[54342] & Data2$Country2==as.character(Data1$Country1)[54342] & Data2$Quality2=="good" & Data2$Date2 < Data1$Date1[54342])
sum(Data2$Name2==as.character(Data1$Name1)[n] & Data2$Country2==as.character(Data1$Country1)[n] & Data2$Quality2=="good" & Data2$Date2 < Data1$Date1[n])
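
The per-row expressions above can be generalised to every row of Data1 without writing one line per index. The following is a minimal base R sketch, assuming the example data from the post; the function name count_earlier_good is made up for illustration. It scans Data2 once per row of Data1, so on data with many thousands of rows the join-based approach is likely to be faster.

```r
# Example data from the post, rebuilt so the block is self-contained
Data1 <- data.frame(
  Country1 = c("Brazil", "India", "China", "China", "Brazil"),
  Date1 = as.Date(c("2001-01-21", "2002-04-13", "2003-06-19",
                    "2006-06-19", "2007-06-19")),
  Name1 = c("B", "C", "A", "A", "A"),
  stringsAsFactors = FALSE
)
Data2 <- data.frame(
  Name2 = c("B", "B", "C", "C", "C", "A", "A", "A"),
  Quality2 = c("good", "good", "medium", "good", "good", "bad", "good", "good"),
  Country2 = c("China", "Brazil", "Taiwan", "India", "India",
               "United States", "China", "Brazil"),
  Date2 = as.Date(c("2002-02-21", "1999-03-13", "1998-08-19", "1996-09-13",
                    "2000-12-12", "1998-07-21", "2005-03-22", "2003-06-19")),
  stringsAsFactors = FALSE
)

# count_earlier_good (hypothetical name): for row i of Data1, count the
# good-quality Data2 rows with the same name and country and an earlier date
count_earlier_good <- function(i) {
  sum(Data2$Name2 == Data1$Name1[i] &
        Data2$Country2 == Data1$Country1[i] &
        Data2$Quality2 == "good" &
        Data2$Date2 < Data1$Date1[i])
}

# Apply the count to every row index and store it as a new column
Data1$Result <- vapply(seq_len(nrow(Data1)), count_earlier_good, integer(1))
```

Because the count is computed per row index, repeated values of Name1, Country1, or Date1 in Data1 need no special handling.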