Merging output in R


The code I ran:

max=aggregate(cbind(a$VALUE,Date=a$DATE) ~ format(a$DATE, "%m") + cut(a$CLASS, breaks=c(0,2,4,6,8,10,12,14)) , data = a, max)[-1]
max$DATE=as.Date(max$DATE, origin = "1970-01-01")
Sample data:

DATE         GRADE    VALUE
2008-09-01     1        20
2008-09-02     2        30
2008-09-03     3        50
    .
    .
2008-09-30     2        75
    .
    .
2008-10-01     1        95
    .
    .
2008-11-01     4        90
    .
    . 
2008-12-01     1        70
2008-12-02     2        40
2008-12-28     4        30
2008-12-29     1        40
2008-12-31     3        50
Based on the sample data, my expected output for the first month only is:

 DATE         GRADE    VALUE
2008-09-30    (0,2]     75
2008-09-02    (2,4]     50
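The output shown below does not correspond to the sample data, because my real data set is too large to post. The logic is simple: there are grades from 1 to 10, and I want to find the highest value per month within each grade group. That is, I need the maximum value for each group (0,2], (2,4] and so on.

I used aggregate with the function max, grouping by two columns, Date and Grade. When I run the code and display the value of max, I get three tables as output, one after another. I want to plot this output, but I cannot because of this. So how can I merge all of these outputs?

Output from my real data: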

                format(DATE, "%m")
1                        09
2                        10
3                        11
4                        12
5                        09
6                        10
7                        11



  cut(a$GRADE, breaks = c(0, 2, 4, 6, 8, 10, 12, 14))        value
1                                                        (0,2] 0.30844444
2                                                        (0,2] 1.00000000
3                                                        (0,2] 1.00000000
4                                                        (0,2] 0.73333333
5                                                        (2,4] 0.16983488
6                                                        (2,4] 0.09368000
7                                                        (2,4] 0.10589335

          Date
1  2008-09-30
2  2008-10-31
3  2008-11-28
4  2008-12-31
5  2008-09-30
6  2008-10-31
7  2008-11-28
Using dplyr:

 library(dplyr)
 a %>%
 group_by(MONTH=format(DATE, "%m"), GRADE=cut(GRADE, breaks=seq(0,14,by=2))) %>%
 summarise_each(funs(max))

 #  MONTH GRADE       DATE VALUE
 #1    09 (0,2] 2008-09-30    75
 #2    09 (2,4] 2008-09-03    50
 #3    10 (0,2] 2008-10-01    95
 #4    11 (2,4] 2008-11-01    90
 #5    12 (0,2] 2008-12-29    70
 #6    12 (2,4] 2008-12-31    50
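
In current dplyr releases, summarise_each() and funs() are deprecated. The following is a minimal sketch of the same grouping written with across(), assuming dplyr 1.0.0 or later. The data frame is the sample data a from this answer, typed out as dates; the name res is only illustrative.

```r
suppressPackageStartupMessages(library(dplyr))

# Sample data from the answer, with the dates written out
a <- data.frame(
  DATE = as.Date(c("2008-09-01", "2008-09-02", "2008-09-03", "2008-09-30",
                   "2008-10-01", "2008-11-01", "2008-12-01", "2008-12-02",
                   "2008-12-28", "2008-12-29", "2008-12-31")),
  GRADE = c(1L, 2L, 3L, 2L, 1L, 4L, 1L, 2L, 4L, 1L, 3L),
  VALUE = c(20L, 30L, 50L, 75L, 95L, 90L, 70L, 40L, 30L, 40L, 50L)
)

# Group by month and grade bin, then take the maximum of each remaining column
res <- a %>%
  group_by(MONTH = format(DATE, "%m"),
           GRADE = cut(GRADE, breaks = seq(0, 14, by = 2))) %>%
  summarise(across(c(DATE, VALUE), max), .groups = "drop")

print(res)
```

The .groups = "drop" argument returns an ungrouped result, which is usually more convenient for plotting.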
Or using data.table:

 library(data.table)
  setDT(a)[, list(DATE=max(DATE), VALUE=max(VALUE)), 
                         by= list(MONTH=format(DATE, "%m"),
                     GRADE=cut(GRADE, breaks=seq(0,14, by=2)))]
  #       MONTH GRADE       DATE VALUE
  #1:    09 (0,2] 2008-09-30    75
  #2:    09 (2,4] 2008-09-03    50
  #3:    10 (0,2] 2008-10-01    95
  #4:    11 (2,4] 2008-11-01    90
  #5:    12 (0,2] 2008-12-29    70
  #6:    12 (2,4] 2008-12-31    50
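A second answer suggests base R with the formula interface of aggregate, using the a data frame from akrun's answer; its code is the strsplit/aggregate block further down.

Comments:

Do you mean merging the output?

@user3923765 Can you show your expected output?

Because the second column name is too long, there is no room to print all the columns side by side. There is actually only one "table" (i.e. data frame). For example, what do you get if you run dim(max)? (As a side note, max is a base R function, so it is better to choose another name for the output.)

I get 14, 2, whereas I should get 14, 3? I also cannot plot the values against the dates.

@user3923765 For the sample data created, I get the expected output for each year and month. I still have 5 years of data to process, but when I run this code it only gives me values for the months of the last year. For example, if I input data for 2008-2013, I only get 12 values, for the 12 months of 2013, whereas I expect 60 values, 12 per year.

@user3923765 In this case you are grouping only by MONTH and GRADE; you may have to include YEAR in the grouping as well. Perhaps group by format(DATE, "%Y"). Not tested.

The result is the same whether I use %Y% or %M%.

@user3923765 Did you check my new update? By the way, for 2008 and 2009 I did not change the ...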
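
The remark in the comments that there is only one data frame can be checked with a short sketch. It assumes the sample data a from the answer and uses the formula interface of aggregate, which names the grouping columns after the full expressions; those long names are what make the console wrap the printout into several blocks. The name res is illustrative and avoids masking the base function max.

```r
# Sample data from the answer, with the dates written out
a <- data.frame(
  DATE = as.Date(c("2008-09-01", "2008-09-02", "2008-09-03", "2008-09-30",
                   "2008-10-01", "2008-11-01", "2008-12-01", "2008-12-02",
                   "2008-12-28", "2008-12-29", "2008-12-31")),
  GRADE = c(1L, 2L, 3L, 2L, 1L, 4L, 1L, 2L, 4L, 1L, 3L),
  VALUE = c(20L, 30L, 50L, 75L, 95L, 90L, 70L, 40L, 30L, 40L, 50L)
)

# Formula interface: grouping columns are named after the full expressions
res <- aggregate(cbind(VALUE, DATE) ~ format(DATE, "%m") +
                   cut(GRADE, breaks = seq(0, 14, by = 2)),
                 data = a, FUN = max)

print(dim(res))    # one data frame, even if print() wraps it into blocks
print(names(res))  # the long column names cause the wrapping

# Short names and a real Date column make it print and plot as one table
names(res)[1:2] <- c("MONTH", "GRADE")
res$DATE <- as.Date(res$DATE, origin = "1970-01-01")
print(res)
```

With the columns renamed, res can be passed to a plotting function like any other data frame.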
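Or using aggregate:

  # cbind() turns DATE into a plain number, so it is converted back to Date afterwards
  res <- transform(with(a,
           aggregate(cbind(VALUE, DATE),
             list(MONTH=format(DATE, "%m"), GRADE=cut(GRADE, breaks=seq(0,14, by=2))), max)),
           DATE=as.Date(DATE, origin="1970-01-01"))
  res[order(res$MONTH),]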
  # MONTH GRADE VALUE       DATE
  #1    09 (0,2]    75 2008-09-30
  #4    09 (2,4]    50 2008-09-03
  #2    10 (0,2]    95 2008-10-01
  #5    11 (2,4]    90 2008-11-01
  #3    12 (0,2]    70 2008-12-29
  #6    12 (2,4]    50 2008-12-31

Data:

 a <-  structure(list(DATE = structure(c(14123, 14124, 14125, 14152, 
   14153, 14184, 14214, 14215, 14241, 14242, 14244), class = "Date"), 
   GRADE = c(1L, 2L, 3L, 2L, 1L, 4L, 1L, 2L, 4L, 1L, 3L), VALUE = c(20L, 
   30L, 50L, 75L, 95L, 90L, 70L, 40L, 30L, 40L, 50L)), .Names = c("DATE", 
  "GRADE", "VALUE"), row.names = c(NA, -11L), class = "data.frame")

Update, grouping by year as well:

   library(dplyr)
   a %>% 
   group_by(MONTH=format(DATE, "%m"), YEAR=format(DATE, "%Y"), GRADE=cut(GRADE, breaks=seq(0,14, by=2)))%>%
  summarise_each(funs(max))
  #   MONTH YEAR GRADE       DATE VALUE
  #1     09 2008 (0,2] 2008-09-30    75
  #2     09 2008 (2,4] 2008-09-03    50
  #3     09 2009 (0,2] 2009-09-30    75
  #4     09 2009 (2,4] 2009-09-03    50
  #5     10 2008 (0,2] 2008-10-01    95
  #6     10 2009 (0,2] 2009-10-01    95
  #7     11 2008 (2,4] 2008-11-01    90
  #8     11 2009 (2,4] 2009-11-01    90
  #9     12 2008 (0,2] 2008-12-29    70
  #10    12 2008 (2,4] 2008-12-31    50
  #11    12 2009 (0,2] 2009-12-29    70
  #12    12 2009 (2,4] 2009-12-31    50

New data (two years):

 a <- structure(list(DATE = structure(c(14123, 14124, 14125, 14152, 
   14153, 14184, 14214, 14215, 14241, 14242, 14244, 14488, 14489, 
  14490, 14517, 14518, 14549, 14579, 14580, 14606, 14607, 14609
  ), class = "Date"), GRADE = c(1L, 2L, 3L, 2L, 1L, 4L, 1L, 2L, 
  4L, 1L, 3L, 1L, 2L, 3L, 2L, 1L, 4L, 1L, 2L, 4L, 1L, 3L), VALUE = c(20L, 
  30L, 50L, 75L, 95L, 90L, 70L, 40L, 30L, 40L, 50L, 20L, 30L, 50L, 
  75L, 95L, 90L, 70L, 40L, 30L, 40L, 50L)), .Names = c("DATE", 
  "GRADE", "VALUE"), row.names = c("1", "2", "3", "4", "5", "6", 
  "7", "8", "9", "10", "11", "12", "21", "31", "41", "51", "61", 
   "71", "81", "91", "101", "111"), class = "data.frame")
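
The problem raised in the comments, where several years collapse into 12 monthly values, comes from grouping on the month alone. Below is a minimal base R sketch that groups on a combined year-month key instead. It assumes the two-year data above, rebuilt here by shifting the 2008 rows by 365 days; the names a2, YM and BIN are illustrative.

```r
# One year of sample data from the answer
a <- data.frame(
  DATE = as.Date(c("2008-09-01", "2008-09-02", "2008-09-03", "2008-09-30",
                   "2008-10-01", "2008-11-01", "2008-12-01", "2008-12-02",
                   "2008-12-28", "2008-12-29", "2008-12-31")),
  GRADE = c(1L, 2L, 3L, 2L, 1L, 4L, 1L, 2L, 4L, 1L, 3L),
  VALUE = c(20L, 30L, 50L, 75L, 95L, 90L, 70L, 40L, 30L, 40L, 50L)
)

# Second year: the same rows shifted by 365 days, as in the new data above
a2 <- rbind(a, transform(a, DATE = DATE + 365))

a2$YM  <- format(a2$DATE, "%Y-%m")                     # year and month together
a2$BIN <- cut(a2$GRADE, breaks = seq(0, 14, by = 2))   # grade groups

# Month-only grouping merges 2008 and 2009 into the same groups
by_month <- aggregate(VALUE ~ format(DATE, "%m") + BIN, data = a2, FUN = max)

# Year-month grouping keeps one row per year, month and grade group
by_ym <- aggregate(VALUE ~ YM + BIN, data = a2, FUN = max)
by_ym <- by_ym[order(by_ym$YM, by_ym$BIN), ]

print(nrow(by_month))
print(by_ym)
```

A single "%Y-%m" key also sorts chronologically as text, which keeps the result in date order.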
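Base R with the formula interface of aggregate:

# month as a two-digit string, taken from the ISO-formatted date
a$month = sapply(strsplit(as.character(a$DATE), '-'), '[', 2)
# grade groups (0,2], (2,4], ...
gradeCats = cut(a$GRADE, breaks = c(0, 2, 4, 6, 8, 10, 12, 14))

aggregate(VALUE~month+gradeCats, data= a, max)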
  month gradeCats VALUE
1    09     (0,2]    75
2    10     (0,2]    95
3    12     (0,2]    70
4    09     (2,4]    50
5    11     (2,4]    90
6    12     (2,4]    50
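
One caveat about the approaches that apply max to both DATE and VALUE: each column is maximised separately, so the reported DATE is the latest date in the group, not necessarily the date on which the largest VALUE occurred. In the sample data, the December (0,2] group has its maximum of 70 on 2008-12-01, while the outputs above show 2008-12-29. If the date of the maximum is what is wanted, a minimal base R sketch using ave() could look like this; the names key and res are illustrative, and tied maxima would return more than one row per group.

```r
# Sample data from the answer, with the dates written out
a <- data.frame(
  DATE = as.Date(c("2008-09-01", "2008-09-02", "2008-09-03", "2008-09-30",
                   "2008-10-01", "2008-11-01", "2008-12-01", "2008-12-02",
                   "2008-12-28", "2008-12-29", "2008-12-31")),
  GRADE = c(1L, 2L, 3L, 2L, 1L, 4L, 1L, 2L, 4L, 1L, 3L),
  VALUE = c(20L, 30L, 50L, 75L, 95L, 90L, 70L, 40L, 30L, 40L, 50L)
)

a$MONTH <- format(a$DATE, "%Y-%m")
a$BIN   <- cut(a$GRADE, breaks = seq(0, 14, by = 2))

# One text key per (month, grade group); ave() returns the group maximum per row
key    <- paste(a$MONTH, a$BIN)
is_max <- a$VALUE == ave(a$VALUE, key, FUN = max)

# Keep the complete rows that hold the maximum, so DATE and VALUE belong together
res <- a[is_max, c("MONTH", "BIN", "DATE", "VALUE")]
res <- res[order(res$MONTH, res$BIN), ]
print(res)
```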