How do I select rows from a dataframe based on date and value?
I have a data frame with many countries, along with their total cases and new cases on different dates. It looks like this:
iso_code continent location date total_cases new_cases stringency_index population
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 ABW North America Aruba 2020-03-13 2 2 0 106766
2 ABW North America Aruba 2020-03-19 NA NA 33.3 106766
3 ABW North America Aruba 2020-03-20 4 2 33.3 106766
4 ABW North America Aruba 2020-03-21 NA NA 44.4 106766
5 ABW North America Aruba 2020-03-22 NA NA 44.4 106766
6 ABW North America Aruba 2020-03-23 NA NA 44.4 106766
I was able to filter the data frame to get all rows where new_cases >= 5:
df_filtered <- df %>% filter(new_cases >= 5)
However, this gives me every row where new_cases is 5 or greater:
iso_code continent location date total_cases new_cases stringency_index population
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 ABW North America Aruba 2020-03-24 12 8 44.4 106766
2 ABW North America Aruba 2020-03-25 17 5 44.4 106766
3 ABW North America Aruba 2020-03-27 28 9 44.4 106766
4 ABW North America Aruba 2020-03-30 50 22 85.2 106766
5 ABW North America Aruba 2020-04-01 55 5 85.2 106766
6 ABW North America Aruba 2020-04-03 60 5 85.2 106766
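How can I get only the row with the earliest (first) date that meets this condition for each country?
Ideally, my output would look like this: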
iso_code continent location date total_cases new_cases stringency_index population
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 ABW North America Aruba 2020-03-24 12 8 44.4 106766
2 AFG Asia Afghanistan 2020-03-16 16 6 38.9 38928341
3 AGO Africa Angola 2020-04-19 24 5 90.7 32866268
4 ALB Europe Albania 2020-03-13 23 12 78.7 2877800
5 AND Europe Andorra 2020-03-17 14 9 31.4 77265
6 ARE Asia Utd. Arab Emirates 2020-02-28 19 6 8.3 9890400
Try this:
df %>%
  group_by(iso_code) %>%                      ## within each country (group)
  filter(new_cases >= 5) %>%                  ## keep rows where there are at least 5 cases
  slice_min(date, n = 1, with_ties = FALSE)   ## then keep the row with the smallest date
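A minimal, self-contained sketch of the approach above, run on a small made-up tibble (the rows are illustrative only, not the asker's data). It assumes dplyr 1.0.0 or later for slice_min(). The as.Date() conversion is an addition: ISO-formatted strings such as "2020-03-24" already sort correctly as text, but converting makes the ordering explicit.

```r
library(dplyr)

# Toy data, invented for illustration (not the asker's data set)
toy <- tibble(
  iso_code  = c("ABW", "ABW", "ABW", "AFG", "AFG", "AFG"),
  date      = c("2020-03-20", "2020-03-24", "2020-03-25",
                "2020-03-10", "2020-03-16", "2020-03-17"),
  new_cases = c(2, 8, 5, NA, 6, 10)
)

first_hit <- toy %>%
  mutate(date = as.Date(date)) %>%   # compare real dates, not strings
  group_by(iso_code) %>%             # work within each country
  filter(new_cases >= 5) %>%         # rows with NA new_cases are dropped as well
  slice_min(date, n = 1, with_ties = FALSE) %>%
  ungroup()

print(first_hit)
```

Note that filter() keeps only rows where the condition is TRUE, so the rows with NA in new_cases never reach slice_min().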
I got it working with the following code:
df_filtered <- df %>% filter(new_cases >= 5)   # keep rows with at least 5 new cases
df_sorted <- df_filtered %>%
  group_by(iso_code) %>%   # group by country
  arrange(date) %>%        # sort by date
  slice(1L)                # keep the first (earliest) row of each group
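Inspired by the answers to this question.
Take a look at the slice_max and slice_min functions.
You could try df_filtered %>% filter(new_cases >= 5 & date == min(date))
@Duck - That is missing the grouping, and the compound condition is only true when a country has at least 5 cases on its first day, not on the first day the condition is true.
@Duck I tried that, but it only returns the lowest date in the whole dataset.
@asd7 Since no meaningful data was provided, try this: df %>% group_by(country) %>% filter(new_cases >= 5) %>% filter(date == min(date)) or df %>% group_by(country) %>% filter(new_cases >= 5) %>% filter(date == first(date))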
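The difference discussed in these comments can be sketched on a small invented tibble. The column name country and all values below are assumptions made for illustration. The ungrouped compound condition compares every row against the earliest date in the whole table, while the grouped two-step version first drops non-qualifying rows and only then looks for the earliest date within each country.

```r
library(dplyr)

# Toy data, invented for illustration
toy <- tibble(
  country   = c("A", "A", "A", "B", "B"),
  date      = as.Date(c("2020-03-01", "2020-03-05", "2020-03-06",
                        "2020-03-02", "2020-03-03")),
  new_cases = c(1, 7, 9, 6, 8)
)

# Compound condition without grouping: min(date) is the earliest date in the
# whole table (2020-03-01), where new_cases is 1, so no row matches here.
ungrouped <- toy %>%
  filter(new_cases >= 5 & date == min(date))

# Grouped, two-step version: keep qualifying rows first, then take the
# earliest remaining date within each country.
grouped <- toy %>%
  group_by(country) %>%
  filter(new_cases >= 5) %>%
  filter(date == min(date)) %>%
  ungroup()

print(ungrouped)
print(grouped)
```

The first(date) variant gives the same result as min(date) only when the rows within each group are already sorted by date, so min(date) is the safer choice on unsorted data.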