How do I select rows from a data frame based on date and value?

r dataframe

I have a data frame with many countries and their total cases and new cases on different dates. It looks like this:

  iso_code continent     location date       total_cases new_cases stringency_index population
  <chr>    <chr>         <chr>    <chr>            <dbl>     <dbl>            <dbl>      <dbl>
1 ABW      North America Aruba    2020-03-13           2         2              0       106766
2 ABW      North America Aruba    2020-03-19          NA        NA             33.3     106766
3 ABW      North America Aruba    2020-03-20           4         2             33.3     106766
4 ABW      North America Aruba    2020-03-21          NA        NA             44.4     106766
5 ABW      North America Aruba    2020-03-22          NA        NA             44.4     106766
6 ABW      North America Aruba    2020-03-23          NA        NA             44.4     106766

I am able to filter the data frame to get all rows where new_cases >= 5:

df_filtered <- df %>% filter(new_cases >= 5)

However, this gives me every row where new_cases is equal to or greater than 5:

  iso_code continent     location date       total_cases new_cases stringency_index population
  <chr>    <chr>         <chr>    <chr>            <dbl>     <dbl>            <dbl>      <dbl>
1 ABW      North America Aruba    2020-03-24          12         8             44.4     106766
2 ABW      North America Aruba    2020-03-25          17         5             44.4     106766
3 ABW      North America Aruba    2020-03-27          28         9             44.4     106766
4 ABW      North America Aruba    2020-03-30          50        22             85.2     106766
5 ABW      North America Aruba    2020-04-01          55         5             85.2     106766
6 ABW      North America Aruba    2020-04-03          60         5             85.2     106766

How can I get only the row with the earliest (first) date that meets this condition for each country?

Ideally, my output would look like this:

  iso_code continent     location           date       total_cases new_cases stringency_index population
  <chr>    <chr>         <chr>              <chr>            <dbl>     <dbl>            <dbl>      <dbl>
1 ABW      North America Aruba              2020-03-24          12         8             44.4     106766
2 AFG      Asia          Afghanistan        2020-03-16          16         6             38.9     38928341
3 AGO      Africa        Angola             2020-04-19          24         5             90.7     32866268
4 ALB      Europe        Albania            2020-03-13          23        12             78.7     2877800
5 AND      Europe        Andorra            2020-03-17          14         9             31.4     77265
6 ARE      Asia          Utd. Arab Emirates 2020-02-28          19         6              8.3     9890400

Try this:

df %>% 
  group_by(iso_code) %>%  ## within each country (group)
  filter(new_cases >= 5) %>%  ## keep rows where there are at least 5 cases
  slice_min(date, n = 1, with_ties = FALSE)  ## then keep the row with the smallest date
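
To see the pipeline above in action, here is a minimal sketch on a small made-up data frame. The values in `df` are invented for illustration, and the `as.Date()` conversion is an optional assumption: ISO-formatted date strings such as "2020-03-24" already sort chronologically, but a real Date column is safer if the format ever varies.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
df <- data.frame(
  iso_code  = c("ABW", "ABW", "ABW", "AFG", "AFG", "AFG"),
  date      = c("2020-03-13", "2020-03-24", "2020-03-25",
                "2020-03-10", "2020-03-16", "2020-03-20"),
  new_cases = c(2, 8, 5, NA, 6, 10)
)

first_hit <- df %>%
  mutate(date = as.Date(date)) %>%               # parse the character dates
  group_by(iso_code) %>%                         # work within each country
  filter(new_cases >= 5) %>%                     # rows where the test is NA are dropped too
  slice_min(date, n = 1, with_ties = FALSE) %>%  # earliest qualifying date per country
  ungroup()

print(first_hit)
```

With this toy data, the result has one row per country: 2020-03-24 for ABW and 2020-03-16 for AFG.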

I got it working with the following code:

df_filtered <- df %>% filter(new_cases >= 5)  # keep rows with at least 5 new cases

df_sorted <- df_filtered %>%
  group_by(iso_code) %>%  # group by country
  arrange(date) %>%       # sort by date (arrange() ignores grouping by default)
  slice(1L)               # keep the first row of each group

Inspired by the answers to this question.

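Have a look at the slice_max and slice_min functions.

You could try df_filtered %>% filter(new_cases >= 5 & date == min(date))

@Duck: that is missing a group_by, and the compound condition is only true if the country had at least 5 cases on its very first day, not on the first day the condition is true.

@Duck I tried that, but it only returns the lowest date in the whole dataset.

@asd7 Since no meaningful data was provided, try this: df %>% group_by(country) %>% filter(new_cases >= 5) %>% filter(date == min(date)) or df %>% group_by(country) %>% filter(new_cases >= 5) %>% filter(date == first(date))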
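
The difference discussed in the comments can be checked with a small sketch. The data below is made up for illustration, and `compound` and `two_step` are hypothetical names. The point is that in a single `filter()` call, `min(date)` is computed over all rows of the group, whereas in two chained calls it is computed only over the rows that survived the first filter.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
df <- data.frame(
  iso_code  = c("ABW", "ABW", "AFG", "AFG"),
  date      = as.Date(c("2020-03-13", "2020-03-24", "2020-03-10", "2020-03-16")),
  new_cases = c(2, 8, 1, 6)
)

# One compound condition: min(date) is each country's first recorded date,
# so a row is kept only if that very first day already had >= 5 new cases
compound <- df %>%
  group_by(iso_code) %>%
  filter(new_cases >= 5 & date == min(date)) %>%
  ungroup()

# Two chained filters: min(date) is taken over the qualifying rows only
two_step <- df %>%
  group_by(iso_code) %>%
  filter(new_cases >= 5) %>%
  filter(date == min(date)) %>%
  ungroup()

print(nrow(compound))
print(nrow(two_step))
```

Here the compound version returns no rows, because neither country reached 5 new cases on its first recorded date, while the two-step version returns one row per country.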