R: using the "less than" operator with dates
I have a data frame with three date columns, date1, date2 and date3, and two additional measurements, s1 and s2. I am trying to create new columns x1 and x2 from these dates and the measurements s1 and s2. For example, if date1 is less than or equal to date2, x1 should be 3; otherwise it should keep the value of s1. Similarly, if date1 is less than or equal to date3, x2 should be 3; otherwise it should keep the value of s2. Here is part of the data:
df <-
  structure(
    list(
      id = c(1L, 2L, 3L, 4L, 5L),
      date1 = c("1/4/2004", "3/8/2004", "NA", "13/10/2004", "11/3/2003"),
      date2 = c("8/6/2002", "11/5/2004", "3/5/2004",
                "25/11/2004", "21/1/2004"),
      s1 = c(1, 2, 1, "NA", "NA"),
      date3 = c("23/6/2006", "24/12/2006", "18/2/2006", "NA", "NA"),
      s2 = c("NA", "NA", 2, "NA", "NA")
    ),
    .Names = c("id", "date1", "date2", "s1", "date3", "s2"),
    class = "data.frame",
    row.names = c(NA, -5L),
    col_types = c("numeric", "date", "date", "numeric", "date", "numeric")
  )
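Judging by the result, the NA in column x2 does not seem to respond to the code: 3 August 2004 is earlier than 24 December 2006, so I expected the NA in x2 to be replaced by 3. Can anyone clarify why this happens and how to fix it? Any help is much appreciated.

The date columns are of character type in the data: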
class(df$date1)
#[1] "character"
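To illustrate why the character type matters, here is a minimal sketch using the two values from row 2 of the data. It assumes the original attempt compared the raw strings directly (that code is not shown above). Strings are compared character by character, so "3/8/2004" sorts after "24/12/2006" because "3" comes after "2".

```r
# Row 2 of the question's data, still stored as character strings
a <- "3/8/2004"
b <- "24/12/2006"

# Character comparison is lexicographic, not chronological
as_strings <- a <= b
as_strings
# [1] FALSE

# After conversion to Date the comparison is chronological
as_dates <- as.Date(a, format = "%d/%m/%Y") <= as.Date(b, format = "%d/%m/%Y")
as_dates
# [1] TRUE
```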
We first need to convert them to Date objects and then do the comparison:
cols <- paste0("date", 1:3)
df[cols] <- lapply(df[cols], as.Date, "%d/%m/%Y")
df$x1<-ifelse(df$date1 <= df$date2, 3, df$s1)
df$x2<-ifelse(df$date1 <= df$date3, 3, df$s2)
df
#  id      date1      date2   s1      date3 s2   x1 x2
#1  1 2004-04-01 2002-06-08    1 2006-06-23 NA    1  3
#2  2 2004-08-03 2004-05-11    2 2006-12-24 NA    2  3
#3  3       <NA> 2004-05-03    1 2006-02-18  2 <NA> NA
#4  4 2004-10-13 2004-11-25   NA       <NA> NA    3 NA
#5  5 2003-03-11 2004-01-21   NA       <NA> NA    3 NA
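Note that s1 and s2 are built with the string "NA", so they are character vectors and x1 inherits that type. If numeric columns are wanted (an assumption about intent, based on the col_types entry in the question), one possible sketch is to turn the "NA" strings into real missing values before converting. The names s1_num, d1 and d2 are illustrative only.

```r
# s1, date1 and date2 as given in the question
s1 <- c(1, 2, 1, "NA", "NA")          # mixing numbers and "NA" yields character
d1 <- as.Date(c("1/4/2004", "3/8/2004", "NA", "13/10/2004", "11/3/2003"),
              format = "%d/%m/%Y")
d2 <- as.Date(c("8/6/2002", "11/5/2004", "3/5/2004", "25/11/2004", "21/1/2004"),
              format = "%d/%m/%Y")

# Replace the literal string "NA" with a real missing value, then convert
s1[s1 == "NA"] <- NA
s1_num <- as.numeric(s1)

# Same rule as above, now returning a numeric vector
x1 <- ifelse(d1 <= d2, 3, s1_num)
x1
# [1]  1  2 NA  3  3
```

Row 3 stays NA because date1 is missing there, so the comparison itself is NA.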
Alternatively, with dplyr:
library(dplyr)
df %>%
  mutate_at(vars(starts_with("date")), as.Date, "%d/%m/%Y") %>%
  mutate(x1 = replace(s1, date1 <= date2, 3),
         x2 = replace(s2, date1 <= date3, 3))
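The ifelse() and replace() approaches are not interchangeable when the comparison is NA, which happens here whenever one of the dates is missing. The base R sketch below uses made-up vectors (s and cond are illustrative names, not columns from the data) to show the difference: ifelse() returns NA for those rows, while replace() leaves the original value in place.

```r
# Illustrative inputs: a character measurement and a condition containing NA
s    <- c("1", "2", "1")
cond <- c(TRUE, FALSE, NA)

# ifelse() propagates the NA from the condition
via_ifelse <- ifelse(cond, 3, s)
via_ifelse
# [1] "3" "2" NA

# replace() only overwrites where the condition is TRUE; NA rows keep s
via_replace <- replace(s, cond, 3)
via_replace
# [1] "3" "2" "1"
```

Which behaviour is preferable depends on whether a missing date should blank out the result or fall back to the existing measurement.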