How to use reshape2's dcast to select only one of several observations (values)

Tags: r, date, reshape2, dcast

I have the following dataset:

> dataset2
   ID ATCcode       date
1   1   N06AA 2001-01-01
2   1   N06AB 2001-04-01
3   1   N06AB 2001-03-01
4   1   N06AB 2001-02-01
5   1   N06AC 2001-01-01
6   2   N06AA 2001-01-01
7   2   N06AA 2001-02-01
8   2   N06AA 2001-03-01
9   3   N06AB 2001-01-01
10  4   N06AA 2001-02-01
11  4   N06AB 2001-03-01
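It is in long format, and I would like it in wide format. However, I only want the earliest date for each ATCcode, not any of the later dates. So I would like to end up with this: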

> datasetLong
  ID      N06AA      N06AB      N06AC
1  1 2001-01-01 2001-02-01 2001-01-01
2  2 2001-01-01       <NA>       <NA>
3  3       <NA> 2001-01-01       <NA>
4  4 2001-02-01 2001-03-01       <NA>
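Instead of the length (the number of dates per ID and ATCcode, which is what my dcast attempt returned), I want just one value per cell: the minimum, i.e. the earliest date.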

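I found that a similar question has been asked before, but I could not adapt it in any way without running into various errors. I did not use melt in the attempt above; could that be necessary? Thanks for any help.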
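A note on where those counts come from: when several rows fall into the same ID/ATCcode cell and no aggregation function is supplied, reshape2's dcast falls back to length, so each cell holds the number of rows rather than a date. The base-R sketch below is only an illustration of that effect, not the asker's original call. It rebuilds the sample data under the assumed name `df` and counts rows per cell with `table()`.

```r
# Sample data from the question, rebuilt here so the block is self-contained.
# The name `df` is an assumption chosen for this sketch.
df <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L, 4L),
  ATCcode = c("N06AA", "N06AB", "N06AB", "N06AB", "N06AC",
              "N06AA", "N06AA", "N06AA", "N06AB", "N06AA", "N06AB"),
  date = as.Date(c("2001-01-01", "2001-04-01", "2001-03-01", "2001-02-01",
                   "2001-01-01", "2001-01-01", "2001-02-01", "2001-03-01",
                   "2001-01-01", "2001-02-01", "2001-03-01"))
)

# Number of rows per ID/ATCcode combination; cells with more than one row
# are the ones that need an explicit aggregation such as min().
counts <- table(ID = df$ID, ATCcode = df$ATCcode)
print(counts)
```

Any cell with a count above 1 (for example ID 1 with N06AB) is a place where a single date has to be chosen explicitly.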

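This answer uses a tidyverse approach.

One way is to select the earliest (minimum) date for each ID and ATCcode, and then convert the data to wide format.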

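library(dplyr)

df %>%
  # the sample data stores dates as a factor, so convert before comparing
  mutate(date = as.Date(date)) %>%
  group_by(ID, ATCcode) %>%
  # keep only the row with the earliest date in each ID/ATCcode group
  slice(which.min(date)) %>%
  # spread the ATC codes into columns, one row per ID
  tidyr::pivot_wider(names_from = ATCcode, values_from = date)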

#     ID N06AA      N06AB      N06AC     
#  <int> <date>     <date>     <date>    
#1     1 2001-01-01 2001-02-01 2001-01-01
#2     2 2001-01-01 NA         NA        
#3     3 NA         2001-01-01 NA        
#4     4 2001-02-01 2001-03-01 NA        
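
If you would rather avoid extra packages, the same "earliest date per ID and ATCcode, then wide" idea can be sketched in base R. This is an alternative to the pipeline above, not part of the original answer. It assumes the data frame is called `df` with a `date` column of class Date, and the intermediate names `earliest_num` and `wide` are made up for the example. `tapply()` drops the Date class, so the sketch takes the minimum on the underlying day numbers and converts back afterwards.

```r
# Sample data from the question, rebuilt here so the block is self-contained.
df <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L, 4L),
  ATCcode = c("N06AA", "N06AB", "N06AB", "N06AB", "N06AC",
              "N06AA", "N06AA", "N06AA", "N06AB", "N06AA", "N06AB"),
  date = as.Date(c("2001-01-01", "2001-04-01", "2001-03-01", "2001-02-01",
                   "2001-01-01", "2001-01-01", "2001-02-01", "2001-03-01",
                   "2001-01-01", "2001-02-01", "2001-03-01"))
)

# Minimum day number per ID (rows) and ATCcode (columns);
# combinations that never occur are left as NA.
earliest_num <- tapply(as.numeric(df$date), list(df$ID, df$ATCcode), min)

# Rebuild a data frame, turning each column of day numbers back into dates.
wide <- data.frame(ID = as.integer(rownames(earliest_num)))
for (code in colnames(earliest_num)) {
  wide[[code]] <- as.Date(unname(earliest_num[, code]), origin = "1970-01-01")
}

print(wide)
```

On the sample data this should give one row per ID with the earliest date for each ATC code and NA where an ID never had that code.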
Are you looking for an answer? Yes, the author suggested using tidyr. Thank you very much, this is exactly what I wanted to do.

Output of the attempt described in the question, which contains counts rather than dates:

> dataset3
  ID N06AA N06AB N06AC
1  1     1     3     1
2  2     3     0     0
3  3     0     1     0
4  4     1     1     0

Data

df <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L, 
4L), ATCcode = structure(c(1L, 2L, 2L, 2L, 3L, 1L, 1L, 1L, 2L, 
1L, 2L), .Label = c("N06AA", "N06AB", "N06AC"), class = "factor"), 
date = structure(c(1L, 4L, 3L, 2L, 1L, 1L, 2L, 3L, 1L, 2L, 
3L), .Label = c("2001-01-01", "2001-02-01", "2001-03-01", 
"2001-04-01"), class = "factor")), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11"))