Merging aggregated data in R (again)

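Following on from my earlier work, I have a long time series of regularly sampled data (15-minute readings in the sample below) that I want to aggregate in various ways. I want to aggregate by hour of the day, but also by combinations of groupings, for example day type by hour (i.e. Sunday 1 AM, Sunday 2 AM, and so on). Another example would be weekend or weekday by hour.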

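The example below shows two of the aggregations I have done. That part works, so I end up with two zoo objects. What I want to do next is merge the aggregates back onto the original data so that I can compare the errors of the aggregates. This is where I am stuck.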

Note that I am not using the solution proposed earlier, because I want flexibility in how I aggregate.

Below is a snippet showing what I have tried so far. Any help would be much appreciated.

library(zoo)
Lines <- "Index,light.kw
2013-06-14 13:00:00,3.436
2013-06-14 13:15:00,3.327
2013-06-14 13:30:00,3.319
2013-06-14 13:45:00,3.308
2013-06-14 14:00:00,3.458
2013-06-14 14:15:00,3.452
2013-06-14 14:30:00,3.445
2013-06-14 14:45:00,3.469
2013-06-14 15:00:00,3.468
2013-06-14 15:15:00,3.427
2013-06-14 15:30:00,3.168
2013-06-14 15:45:00,2.383
2013-06-15 13:00:00,0.555
2013-06-15 13:15:00,0.555
2013-06-15 13:30:00,0.555
2013-06-15 13:45:00,0.555
2013-06-15 14:00:00,0.555
2013-06-15 14:15:00,0.555
2013-06-15 14:30:00,0.555
2013-06-15 14:45:00,0.719
2013-06-15 15:00:00,0.976
2013-06-15 15:15:00,0.981
2013-06-15 15:30:00,1.116
2013-06-15 15:45:00,0.59"
con <- textConnection(Lines)
z <- read.zoo(con, header=TRUE, sep=",",
     format="%Y-%m-%d %H:%M:%S", FUN=as.POSIXct)
close(con)

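# Aggregate by hour of day
index.hourly <- format(index(z), "%H")
z.hourly <- aggregate(z, index.hourly, mean)
z.hourly
# Attempted merge: z is indexed by timestamp, z.hourly by hour string
merge(z, z.hourly)

# Aggregate by day of week and hour
index.dayhour <- format(index(z), "%w %H")
z.dayhour <- aggregate(z, index.dayhour, mean)
z.dayhour
# Attempted merge: the two indexes again differ in type
merge(z, z.dayhour)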
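Based on the suggestions in the comments, I found a solution. Note that merging on an intermediate column, as suggested, does not work in zoo, so this solution converts the zoo objects back to data frames and merges them as data frames. Here it is: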

library(zoo)
Lines <- "Index,light.kw
2013-06-14 13:00:00,3.436
2013-06-14 13:15:00,3.327
2013-06-14 13:30:00,3.319
2013-06-14 13:45:00,3.308
2013-06-14 14:00:00,3.458
2013-06-14 14:15:00,3.452
2013-06-14 14:30:00,3.445
2013-06-14 14:45:00,3.469
2013-06-14 15:00:00,3.468
2013-06-14 15:15:00,3.427
2013-06-14 15:30:00,3.168
2013-06-14 15:45:00,2.383
2013-06-15 13:00:00,0.555
2013-06-15 13:15:00,0.555
2013-06-15 13:30:00,0.555
2013-06-15 13:45:00,0.555
2013-06-15 14:00:00,0.555
2013-06-15 14:15:00,0.555
2013-06-15 14:30:00,0.555
2013-06-15 14:45:00,0.719
2013-06-15 15:00:00,0.976
2013-06-15 15:15:00,0.981
2013-06-15 15:30:00,1.116
2013-06-15 15:45:00,0.59"
con <- textConnection(Lines)
z <- read.zoo(con, header=TRUE, sep=",",
     format="%Y-%m-%d %H:%M:%S", FUN=as.POSIXct)
close(con)

# make the index for aggregation
index.hourly <- format(index(z), "%H")
# make the aggregate
z.hourly = aggregate(z, index.hourly, mean, na.rm=TRUE)

# make a data frame from the original zoo,
# but the data frame must include the index.hourly
# so that later we can merge the data frame based
# on this index.
# First, make a zoo object of the index and then
# merge this with the original zoo.
z.index.hourly = zoo(index.hourly,index(z))
z.with.index = merge(z,z.index.hourly)
# make a dataframe of the last zoo
df1 = as.data.frame(z.with.index)
# add the index of the df1 (which is the timestamp) as a column
# as we will need the timestamp to rebuild the zoo object.
df1$Index = row.names(df1)

# make a dataframe of the aggregate zoo
df2 = as.data.frame(z.hourly)
df2$Index = row.names(df2)

# merge the two data frames on the hour key
df3 = merge(df1,df2,by.x="z.index.hourly",by.y="Index",all.x=TRUE)
df3 = df3[order(df3$Index),]
summary(df3)

# make a zoo object containing the original data and the aggregate
z.merged.agg = zoo(df3[,c(2,4)],as.POSIXct(df3$Index, tz="GMT"))
z.merged.agg
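
As a possible alternative to the round trip through data frames, the group means can be expanded back onto the original timestamps without leaving zoo. The following is only a minimal sketch, not part of the solution above: it assumes base R's ave() is acceptable for the aggregation, uses a handful of readings from the sample data, and the names key, z.hourly.full and z.merged are made up for illustration.

```r
library(zoo)

# A few readings taken from the sample data above (tz fixed for reproducibility)
idx <- as.POSIXct(c("2013-06-14 13:00:00", "2013-06-14 13:15:00",
                    "2013-06-14 14:00:00", "2013-06-15 13:00:00",
                    "2013-06-15 14:00:00"), tz = "GMT")
z <- zoo(c(3.436, 3.327, 3.458, 0.555, 0.555), idx)

# Grouping key: hour of day for every observation
key <- format(index(z), "%H")

# ave() returns the group mean repeated for each member of the group,
# so the result is already aligned with the original timestamps
z.hourly.full <- zoo(ave(coredata(z), key, FUN = mean), index(z))

# Side-by-side zoo object: original values and their hour-of-day mean
z.merged <- merge(z, z.hourly.full)
z.merged
```

Because both columns are numeric and share the same index, the merged object stays numeric. Swapping the format string for "%w %H" would give the day-of-week by hour grouping in the same way.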

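How would merging with the original data let you control the aggregation error? You might hope that code like this would work, but I suspect it may need an intermediate column to be constructed:
merge(z, z.hourly, by.x = format(index(z), "%H"))
merge(z, z.dayhour, by.x = format(index(z), "%w %H"))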
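@agstudy I am not trying to "control the error"; I just want a side-by-side comparison of the original and aggregated data, so that I can use the aggregated data as a predictor of the original. @DWin Thanks, I developed a solution based on that suggestion.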
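
The side-by-side comparison described in the last comment could look roughly like the sketch below, here for the weekend or weekday by hour grouping mentioned in the question. It is an illustration under assumptions rather than the poster's code: the choice of RMSE as the error measure and the names daytype, key, pred, resid and rmse are all hypothetical, and only four readings from the sample data are used.

```r
library(zoo)

# Four readings from the sample data: a Friday and a Saturday, 13:00 and 13:15
idx <- as.POSIXct(c("2013-06-14 13:00:00", "2013-06-14 13:15:00",
                    "2013-06-15 13:00:00", "2013-06-15 13:15:00"), tz = "GMT")
z <- zoo(c(3.436, 3.327, 0.555, 0.555), idx)

# Day type from the weekday number ("%w": 0 = Sunday, 6 = Saturday)
daytype <- ifelse(format(index(z), "%w") %in% c("0", "6"), "weekend", "weekday")

# Combined grouping key: day type plus hour of day
key <- paste(daytype, format(index(z), "%H"))

# Group mean expanded onto the original timestamps, used as the prediction
pred <- zoo(ave(coredata(z), key, FUN = mean), index(z))

# Residuals and an overall error measure (RMSE is an assumed choice)
resid <- z - pred
rmse <- sqrt(mean(coredata(resid)^2))

merge(z, pred, resid)
rmse
```

Comparing rmse across different keys (hour only, day of week by hour, day type by hour) would be one way to judge which aggregation predicts the original series best.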