R: How do I get the lowest row (i.e. the most recent date) from a data frame?

I want to keep the lowest row for each county (that is, the most recent data, since the data is ordered by date) and drop all older rows. For some counties the most recent data is from January, for others from March (not shown). My df1 is about 15,000 rows long. How can I do this?

Input:

          date      county      state cases deaths  FIPS
 1: 2020-01-21   Snohomish Washington     1      0 53061
 2: 2020-01-22   Snohomish Washington     3      0 53061
 3: 2020-01-23   Snohomish Washington     5      1 53061
 4: 2020-01-24        Cook   Illinois     1      0 17031
 5: 2020-01-24   Snohomish Washington     5      1 53061
 6: 2020-01-25      Orange California     1      0  6059
 7: 2020-01-25 Los Angeles California     2      0  6037
 8: 2020-01-25   Snohomish Washington     5      2 53061
 9: 2020-01-26    Maricopa    Arizona     1      0  4013
10: 2020-01-26 Los Angeles California    17      0  6037
11: 2020-01-27    Maricopa    Arizona     3      1  4013
12: 2020-01-28        Cook   Illinois     2      2 17031

The output should be:

          date      county      state cases deaths  FIPS
 6: 2020-01-25      Orange California     1      0  6059
 8: 2020-01-25   Snohomish Washington     5      2 53061
10: 2020-01-26 Los Angeles California    17      0  6037
11: 2020-01-27    Maricopa    Arizona     3      1  4013
12: 2020-01-28        Cook   Illinois     2      2 17031

Here is an option with data.table. Group by 'county' and 'state', then use the index of the max 'date' to subset the data.table:

library(data.table)
setDT(df1)[, .SD[which.max(date)], .(county, state)]
#        county      state       date cases deaths  FIPS
#1:   Snohomish Washington 2020-01-25     5      2 53061
#2:        Cook   Illinois 2020-01-28     2      2 17031
#3:      Orange California 2020-01-25     1      0  6059
#4: Los Angeles California 2020-01-26    17      0  6037
#5:    Maricopa    Arizona 2020-01-27     3      1  4013

Or use .I to collect the row indices first:

setDT(df1)[df1[, .I[which.max(date)], .(county, state)]$V1]

Or use slice from dplyr:

library(dplyr)
df1 %>%
    group_by(county, state) %>%
    slice(which.max(date))

Or arrange by 'county', 'state' and descending 'date', then keep the distinct rows:

df1 %>%
     arrange(county, state, desc(date)) %>%
     distinct(state, county, .keep_all = TRUE)

Or use base R:

subset(df1, ave(date, county, state, FUN = max) == date)

Another option is to order the rows and then use duplicated:

# unary minus is not defined for Date objects, so sort on the numeric value
df2 <- df1[with(df1, order(county, state, -as.numeric(date))),]
df2[!duplicated(df2[c('county', 'state')]),]

Data:

df1 <- structure(list(date = structure(c(18282, 18283, 18284, 18285, 
18285, 18286, 18286, 18286, 18287, 18287, 18288, 18289), class = "Date"), 
    county = c("Snohomish", "Snohomish", "Snohomish", "Cook", 
    "Snohomish", "Orange", "Los Angeles", "Snohomish", "Maricopa", 
    "Los Angeles", "Maricopa", "Cook"), state = c("Washington", 
    "Washington", "Washington", "Illinois", "Washington", "California", 
    "California", "Washington", "Arizona", "California", "Arizona", 
    "Illinois"), cases = c(1L, 3L, 5L, 1L, 5L, 1L, 2L, 5L, 1L, 
    17L, 3L, 2L), deaths = c(0L, 0L, 1L, 0L, 1L, 0L, 0L, 2L, 
    0L, 0L, 1L, 2L), FIPS = c(53061L, 53061L, 53061L, 17031L, 
    53061L, 6059L, 6037L, 53061L, 4013L, 6037L, 4013L, 17031L
    )), row.names = c("1:", "2:", "3:", "4:", "5:", "6:", "7:", 
"8:", "9:", "10:", "11:", "12:"), class = "data.frame")
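
The options above do not all behave the same way when a group has more than one row on its latest date. The `ave(...) == date` comparison keeps every tied row, whereas `which.max` and the order-then-`duplicated` approach keep only one row per group. The sketch below illustrates this in base R on a small made-up data frame (`toy`, its values, and the `key` helper are assumptions for illustration and are not part of the question's data):

```r
# Hypothetical toy data: county "A" has two rows on its latest date (a tie).
toy <- data.frame(
  date   = as.Date(c("2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03")),
  county = c("A", "A", "A", "B"),
  state  = c("X", "X", "X", "Y"),
  cases  = c(1L, 2L, 3L, 4L)
)

# Single grouping key, so that no empty county/state combinations are created.
key <- paste(toy$county, toy$state, sep = "|")

# Group-wise maximum date, compared against each row's own date:
# every row that sits on its group's latest date is kept, ties included.
latest   <- ave(as.numeric(toy$date), key, FUN = max)
all_ties <- toy[as.numeric(toy$date) == latest, ]

# Sort newest-first within each group, then drop repeated groups:
# exactly one row per county/state survives.
ord           <- toy[order(toy$county, toy$state, -as.numeric(toy$date)), ]
one_per_group <- ord[!duplicated(ord[c("county", "state")]), ]

nrow(all_ties)       # 3 rows: both tied rows for "A" plus the row for "B"
nrow(one_per_group)  # 2 rows: one per county/state pair
```

If the real data can contain several rows for the same county on the same date, it is worth deciding up front which of the two behaviours is wanted.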