
Summarising data with na.rm = TRUE


Consider the following example, which uses dplyr's summarise pipeline to summarise a data frame in order to identify the minimum DATE associated with each CHAR:

library('tidyverse')
library('lubridate')

temp <- data.frame(
  CHAR = c(
    'A',
    'B',
    'C'
  ),
  DATE = c(
    '20090101',
    '20100101',
    NA
  ) %>% ymd(), # Turn character strings to dates
  stringsAsFactors = FALSE
) %>% group_by(
  CHAR
) %>% summarise(
  DATE = min(DATE, na.rm = TRUE) # Extract minimum date
) %>% ungroup()

Output (with a DATE_lgl column added via mutate to flag missing dates):

# A tibble: 3 x 3
  CHAR  DATE       DATE_lgl
  <chr> <date>     <lgl>   
1 A     2009-01-01 FALSE   
2 B     2010-01-01 FALSE   
3 C     NA         FALSE   


The problem is that you are evaluating

min(NA, na.rm=TRUE)
# Inf
for row 3, which results in

dput(temp$DATE[3])
# structure(Inf, class = "Date")
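
The behaviour described above can be checked in base R without dplyr or lubridate. The following is a minimal sketch, assuming only that the group contains a single missing Date; the variable name x is chosen here for illustration.

```r
# Base R only: reproduce the value that min() returns for an all-NA group.
# suppressWarnings() hides the "no non-missing arguments to min" warning.
x <- suppressWarnings(min(as.Date(NA), na.rm = TRUE))

unclass(x)     # the underlying number is Inf, not NA
is.na(x)       # FALSE: Inf is not a missing value
is.finite(x)   # FALSE: this is what sets it apart from a real date
```

Because the stored value is Inf rather than NA, is.na() cannot detect it even though the date prints as NA.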

Add is.finite to your mutate:

temp %>% 
   mutate(DATE_lgl = is.finite(DATE) | is.na(DATE))  # FALSE only for non-finite (Inf) dates

 # A tibble: 3 x 3
 #   CHAR  DATE       DATE_lgl
 #  <chr> <date>     <lgl>   
 # 1 A     2009-01-01 TRUE    
 # 2 B     2010-01-01 TRUE    
 # 3 C     NA         FALSE

One workaround is to convert the Date column to character and then evaluate whether it is NA:

temp %>% mutate(
  DATE_lgl = is.na(as.character(DATE))
)

# # A tibble: 3 x 3
#   CHAR  DATE       DATE_lgl
#   <chr> <date>     <lgl>   
# 1 A     2009-01-01 FALSE   
# 2 B     2010-01-01 FALSE   
# 3 C     NA         TRUE 
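
As an alternative to testing for the Inf value afterwards, the all-NA case could be handled before min() is called. This is a sketch of a different technique from the workarounds above, not something proposed in the thread; safe_min_date is a hypothetical helper name and is not part of dplyr or lubridate.

```r
# Hypothetical helper: return a genuine NA Date when there are no
# non-missing values, instead of letting min() fall back to Inf.
safe_min_date <- function(x) {
  if (all(is.na(x))) {
    as.Date(NA)
  } else {
    min(x, na.rm = TRUE)
  }
}

dates <- as.Date(c("2009-01-01", "2010-01-01", NA))

safe_min_date(dates[1:2])        # earliest of the two dates
is.na(safe_min_date(dates[3]))   # TRUE: a real NA rather than Inf
```

Inside the pipeline this could replace the min() call, for example summarise(DATE = safe_min_date(DATE)), so that a plain is.na(DATE) check works on the result.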

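dput(temp$DATE[3]) reveals the problem: structure(Inf, class = "Date")

Could this be a lubridate issue?

Although I have worked out a workaround, I still don't fully understand what causes this problem. I tried replacing the ymd function with as.Date and the problem was the same, so I don't think it is specific to lubridate. The previous commenter makes a good point. It may be a limitation of the Date class, which has no NA associated with the Inf of the numeric class. That is only my guess, though. Thanks for sharing this interesting problem.

Still, the output shows NA. So is an NA in the Date class equivalent to Inf, and therefore not a true NA?
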
as.Date(Inf, origin="1970-01-01")
# NA
dput(as.Date(Inf, origin="1970-01-01"))
# structure(Inf, class = "Date")