R: How can I write a logical statement that looks for years between/within two date columns?

r date if-statement

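I'm having trouble doing something in R. It's probably not hard, but I just can't figure it out.

Suppose my data frame has only two date columns, date_started and date_ended: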

df <- data.frame(date_started=as.Date(c("1990-02-01","1995-03-04","1997-04-01","1999-01-11","1993-04-04")),
date_ended=as.Date(c("1993-08-12","1999-07-06","2000-06-05","1999-12-01","1996-07-08")))

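A base R option is to convert the columns to Date class, extract the year with format, get the sequence of years for each row, stack the list of vectors into a two-column data.frame, and cross-tabulate it with table: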

lst1 <- Map(function(x, y) as.numeric(x):as.numeric(y),
  format(as.Date(df$date_started), "%Y"), format(as.Date(df$date_ended), "%Y"))
dfn <- cbind(df, as.data.frame.matrix( table(stack(lst1)[2:1])))
row.names(dfn) <- NULL
colnames(dfn)[-(1:2)] <- paste0("year_", colnames(dfn)[-(1:2)])
dfn
#  date_started date_ended year_1990 year_1991 year_1992 year_1993 year_1994 year_1995 year_1996 year_1997 year_1998 year_1999 year_2000
#1   1990-02-01 1993-08-12         1         1         1         1         0         0         0         0         0         0         0
#2   1995-03-04 1999-07-06         0         0         0         0         0         1         1         1         1         1         0
#3   1997-04-01 2000-06-05         0         0         0         0         0         0         0         1         1         1         1
#4   1999-01-11 1999-12-01         0         0         0         0         0         0         0         0         0         1         0
#5   1993-04-04 1996-07-08         0         0         0         1         1         1         1         0         0         0         0
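One thing to watch with the Map/stack/table approach: because the first argument passed to Map is a character vector, the list elements are named by start year, so two rows that share a start year would collapse into a single row of the table. A minimal sketch that names the list by row position instead; the sample data df2 and the names start_year, end_year, lst2, tab, and dfn2 are made up for illustration:

```r
# Hypothetical data: rows 1 and 3 both start in 1990
df2 <- data.frame(
  date_started = as.Date(c("1990-02-01", "1995-03-04", "1990-06-01")),
  date_ended   = as.Date(c("1992-08-12", "1996-07-06", "1991-12-01"))
)

start_year <- as.integer(format(df2$date_started, "%Y"))
end_year   <- as.integer(format(df2$date_ended, "%Y"))

# One sequence of years per row, named by row position rather than by year
lst2 <- Map(seq, start_year, end_year)
names(lst2) <- seq_along(lst2)

# Cross-tabulate row id against year: 1 where the year lies in the row's range
tab <- as.data.frame.matrix(table(stack(lst2)[2:1]))
names(tab) <- paste0("year_", names(tab))

dfn2 <- cbind(df2, tab)
row.names(dfn2) <- NULL
dfn2
```

With row positions as names, each input row keeps its own row in the result even when start years repeat.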

A dplyr and tidyr option could be:
library(dplyr)
library(tidyr)

df %>%
 rowwise() %>%
 mutate(var = list(seq(as.numeric(substr(date_started, 1, 4)), 
                       as.numeric(substr(date_ended, 1, 4)), 
                       1))) %>%
 ungroup() %>%
 unnest(var) %>%
 mutate(var = paste0("year_", var),
        val = 1) %>%
 pivot_wider(names_from = "var", values_from = "val", values_fill = list(val = 0))

  date_started date_ended year_1990 year_1991 year_1992 year_1993 year_1995 year_1996 year_1997 year_1998 year_1999
  <date>       <date>         <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
1 1990-02-01   1993-08-12         1         1         1         1         0         0         0         0         0
2 1995-03-04   1999-07-06         0         0         0         0         1         1         1         1         1
3 1997-04-01   2000-06-05         0         0         0         0         0         0         1         1         1
4 1999-01-11   1999-12-01         0         0         0         0         0         0         0         0         1
5 1993-04-04   1996-07-08         0         0         0         1         1         1         0         0         0
# … with 2 more variables: year_2000 <dbl>, year_1994 <dbl>
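In the output above, year_2000 and year_1994 come last, which is consistent with the new columns being created in order of first appearance. If chronological order matters, the year columns can be sorted afterwards. A minimal base R sketch, using a small made-up data frame named wide in place of the real result (wide, year_cols, and wide_sorted are illustrative names):

```r
# Hypothetical wide result with year columns in order of first appearance
wide <- data.frame(
  date_started = as.Date(c("1997-04-01", "1993-04-04")),
  date_ended   = as.Date(c("2000-06-05", "1994-07-08")),
  year_1997 = c(1, 0), year_1998 = c(1, 0), year_1999 = c(1, 0),
  year_2000 = c(1, 0), year_1993 = c(0, 1), year_1994 = c(0, 1)
)

# Keep the non-year columns first, then the year columns sorted by name
year_cols   <- grep("^year_", names(wide), value = TRUE)
wide_sorted <- wide[c(setdiff(names(wide), year_cols), sort(year_cols))]
names(wide_sorted)
```

Sorting by name is enough here because all the years have four digits.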

Combining base R with lubridate::year yields a concise and simple solution:
library(lubridate)

year_bool <- sapply(1990:2000, function(y) {
    as.integer(y >= year(df$date_started) & y <= year(df$date_ended))
})
colnames(year_bool) <- paste('year', 1990:2000, sep = '_')

cbind(df, year_bool)

##   date_started date_ended year_1990 year_1991 year_1992 year_1993
## 1   1990-02-01 1993-08-12         1         1         1         1
## 2   1995-03-04 1999-07-06         0         0         0         0
## 3   1997-04-01 2000-06-05         0         0         0         0
## 4   1999-01-11 1999-12-01         0         0         0         0
## 5   1993-04-04 1996-07-08         0         0         0         1
##   year_1994 year_1995 year_1996 year_1997 year_1998 year_1999 year_2000
## 1         0         0         0         0         0         0         0
## 2         0         1         1         1         1         1         0
## 3         0         0         0         1         1         1         1
## 4         0         0         0         0         0         1         0
## 5         1         1         1         0         0         0         0

A base R solution using @Andy Rominger's logic:
# Build a vector of every year spanned by the two date columns:
year_bounds <- range(unlist(lapply(df, function(x) as.integer(gsub("[-].*", "", x)))))
year_range <- seq(year_bounds[1], year_bounds[2])

# Using Andy Rominger's logic, but in base R: flag whether each year falls
# between the start year and the end year of each row:
new_df <- cbind(df, setNames(data.frame(sapply(year_range, function(x) {
  as.integer(x >= as.numeric(gsub("[-].*", "", df$date_started)) &
             x <= as.numeric(gsub("[-].*", "", df$date_ended)))
})), paste0("year_", year_range)))
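The same year-by-row comparison can also be written without an explicit loop. A minimal sketch using outer(), with the question's df repeated so that the block runs on its own; the names start_year, end_year, years, and flags are illustrative and not part of the original answers:

```r
df <- data.frame(
  date_started = as.Date(c("1990-02-01", "1995-03-04", "1997-04-01", "1999-01-11", "1993-04-04")),
  date_ended   = as.Date(c("1993-08-12", "1999-07-06", "2000-06-05", "1999-12-01", "1996-07-08"))
)

start_year <- as.integer(format(df$date_started, "%Y"))
end_year   <- as.integer(format(df$date_ended, "%Y"))
years      <- seq(min(start_year), max(end_year))

# Rows are records, columns are years; TRUE where start_year <= year <= end_year
flags <- outer(start_year, years, "<=") & outer(end_year, years, ">=")
colnames(flags) <- paste0("year_", years)

new_df <- cbind(df, +flags)  # unary plus turns TRUE/FALSE into 1/0
new_df
```

Deriving years from the data avoids hard-coding 1990:2000, and the two outer() calls do all the comparisons at once.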

Try df %>% mutate_all(ymd) %>% mutate(new = map2(year(date_started), year(date_ended), ~ seq(.x, .y) %>% set_names(str_c('year_', )) %>% as.list)) %>% unnest_wider(new) %>% mutate_at(vars(starts_with('year')), ~ +(!is.na(

Thanks for your reply. I don't have the right package versions to run this solution. However, the other answers have already solved my problem.

Thank you very much! The first option works without any problems. The second tidyverse option doesn't work for me because my package versions are older. I would have to update a lot to check it. Thanks again for helping me!

Thanks for helping me! This option also seems to solve my problem.

Thanks for your reply. I currently don't have a suitable tidyr version (for the function pivot_wider) to run this solution. The solutions below solved the problem without this issue.

Yes, pivot_wider() is from the newest tidyr version :)