Merging aggregate data in R (again)


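Following on from my previous question, I have a long series of hourly data that I want to aggregate in various ways. I want to aggregate by hour of day, but also by combinations of groupings, for example day type by hour (Sunday 1 am, Sunday 2 am, and so on). Another example is weekend or weekday by hour.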
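A minimal sketch of the weekend-or-weekday-by-hour grouping might look like the following. The names `day.type` and `index.typehour` are made up for illustration, and a small synthetic series stands in for the real data. The key is built as a character label and passed to `aggregate` in the same way as the other groupings:

```r
library(zoo)

# Small synthetic series: Friday 14 and Saturday 15 June 2013,
# two readings per hour (illustrative values only).
times <- as.POSIXct(c("2013-06-14 13:00:00", "2013-06-14 13:30:00",
                      "2013-06-14 14:00:00", "2013-06-14 14:30:00",
                      "2013-06-15 13:00:00", "2013-06-15 13:30:00",
                      "2013-06-15 14:00:00", "2013-06-15 14:30:00"),
                    tz = "GMT")
z <- zoo(c(3, 5, 4, 6, 1, 1, 2, 4), times)

# %w gives the day of the week as a number: 0 = Sunday ... 6 = Saturday.
day.type <- ifelse(format(index(z), "%w") %in% c("0", "6"),
                   "weekend", "weekday")

# Combine day type and hour into one grouping key, e.g. "weekday 13".
index.typehour <- paste(day.type, format(index(z), "%H"))

# One mean per (day type, hour) combination.
z.typehour <- aggregate(z, index.typehour, mean)
z.typehour
```

Any other grouping can be expressed the same way, as long as it produces one label per observation.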

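The example below shows two of the aggregations I have done. That part works, so I end up with two zoo objects. What I want to do next is merge the aggregates back onto the original data so that I can compare the error of each aggregate. This is where I am stuck.

Note that I am not using the previously suggested solution, because I want flexibility in how I aggregate.

Below is a snippet showing what I have tried so far. Any help would be appreciated.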

library(zoo)
Lines <- "Index,light.kw
2013-06-14 13:00:00,3.436
2013-06-14 13:15:00,3.327
2013-06-14 13:30:00,3.319
2013-06-14 13:45:00,3.308
2013-06-14 14:00:00,3.458
2013-06-14 14:15:00,3.452
2013-06-14 14:30:00,3.445
2013-06-14 14:45:00,3.469
2013-06-14 15:00:00,3.468
2013-06-14 15:15:00,3.427
2013-06-14 15:30:00,3.168
2013-06-14 15:45:00,2.383
2013-06-15 13:00:00,0.555
2013-06-15 13:15:00,0.555
2013-06-15 13:30:00,0.555
2013-06-15 13:45:00,0.555
2013-06-15 14:00:00,0.555
2013-06-15 14:15:00,0.555
2013-06-15 14:30:00,0.555
2013-06-15 14:45:00,0.719
2013-06-15 15:00:00,0.976
2013-06-15 15:15:00,0.981
2013-06-15 15:30:00,1.116
2013-06-15 15:45:00,0.59"
con <- textConnection(Lines)
z <- read.zoo(con, header=TRUE, sep=",",
     format="%Y-%m-%d %H:%M:%S", FUN=as.POSIXct)
close(con)

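# Aggregate by hour of day
index.hourly <- format(index(z), "%H")
z.hourly <- aggregate(z, index.hourly, mean)
z.hourly
# Attempted merge: z is indexed by timestamp and z.hourly by hour label,
# so the aggregate does not line up with the original rows
merge(z, z.hourly)

# Aggregate by day of week and hour
index.dayhour <- format(index(z), "%w %H")
z.dayhour <- aggregate(z, index.dayhour, mean)
z.dayhour
merge(z, z.dayhour)  # same problem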
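One way to see why the two `merge` calls above do not produce a side-by-side result is to compare the indexes of the objects involved. The sketch below uses a small synthetic series rather than the full data, so the values are illustrative only:

```r
library(zoo)

times <- as.POSIXct(c("2013-06-14 13:00:00", "2013-06-14 13:30:00",
                      "2013-06-15 13:00:00", "2013-06-15 14:00:00"),
                    tz = "GMT")
z <- zoo(c(3, 5, 1, 2), times)

index.hourly <- format(index(z), "%H")
z.hourly <- aggregate(z, index.hourly, mean)

# The original series is indexed by timestamps ...
class(index(z))
# ... while the aggregate is indexed by the grouping key (the hour labels),
# so the two objects share no index values for merge() to align on.
class(index(z.hourly))
index(z.hourly)
```

To compare the two series row by row, the aggregate first has to be expanded back onto the original timestamps.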

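Based on the suggestion above, I found a solution. Note that merging on an intermediate column, as suggested in the comments, does not work in zoo, so this solution converts the zoo objects back to data frames and merges them as data frames. Here it is: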
library(zoo)
Lines <- "Index,light.kw
2013-06-14 13:00:00,3.436
2013-06-14 13:15:00,3.327
2013-06-14 13:30:00,3.319
2013-06-14 13:45:00,3.308
2013-06-14 14:00:00,3.458
2013-06-14 14:15:00,3.452
2013-06-14 14:30:00,3.445
2013-06-14 14:45:00,3.469
2013-06-14 15:00:00,3.468
2013-06-14 15:15:00,3.427
2013-06-14 15:30:00,3.168
2013-06-14 15:45:00,2.383
2013-06-15 13:00:00,0.555
2013-06-15 13:15:00,0.555
2013-06-15 13:30:00,0.555
2013-06-15 13:45:00,0.555
2013-06-15 14:00:00,0.555
2013-06-15 14:15:00,0.555
2013-06-15 14:30:00,0.555
2013-06-15 14:45:00,0.719
2013-06-15 15:00:00,0.976
2013-06-15 15:15:00,0.981
2013-06-15 15:30:00,1.116
2013-06-15 15:45:00,0.59"
con <- textConnection(Lines)
z <- read.zoo(con, header=TRUE, sep=",",
     format="%Y-%m-%d %H:%M:%S", FUN=as.POSIXct)
close(con)

# Build the grouping key used for the aggregation
index.hourly <- format(index(z), "%H")
# Compute the aggregate
z.hourly <- aggregate(z, index.hourly, mean, na.rm = TRUE)

# Build a data frame from the original zoo object. It must carry the
# grouping key (z.index.hourly) so that the two data frames can be merged
# on it, and the timestamp (Index) so that the zoo object can be rebuilt.
# Constructing it directly keeps z numeric; merging z with a character zoo
# of the key would coerce the values to character, because a zoo object
# holds a single data type.
df1 <- data.frame(z = coredata(z),
                  z.index.hourly = index.hourly,
                  Index = format(index(z), "%Y-%m-%d %H:%M:%S"),
                  stringsAsFactors = FALSE)

# Build a data frame from the aggregate, keyed by the hour label
df2 <- data.frame(Index = as.character(index(z.hourly)),
                  z.hourly = as.numeric(coredata(z.hourly)),
                  stringsAsFactors = FALSE)

# Merge the two data frames on the hour label, then restore time order
df3 <- merge(df1, df2, by.x = "z.index.hourly", by.y = "Index", all.x = TRUE)
df3 <- df3[order(df3$Index), ]
summary(df3)

# Rebuild a zoo object holding the original data and the aggregate
z.merged.agg <- zoo(as.matrix(df3[, c("z", "z.hourly")]),
                    as.POSIXct(df3$Index, tz = "GMT"))
z.merged.agg
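As an alternative to the round trip through data frames, the aggregate can be expanded onto the original timestamps with a lookup on the grouping key, which keeps everything in zoo. This is a sketch of a different technique, not the method used above. The helper `expand_aggregate` is a hypothetical name, and the data is a small synthetic series:

```r
library(zoo)

times <- as.POSIXct(c("2013-06-14 13:00:00", "2013-06-14 13:30:00",
                      "2013-06-14 14:00:00", "2013-06-15 13:00:00",
                      "2013-06-15 14:00:00", "2013-06-15 14:30:00"),
                    tz = "GMT")
z <- zoo(c(3, 5, 4, 1, 2, 6), times)

# Hypothetical helper: aggregate z by an arbitrary key (one label per
# observation) and return the group value repeated at every original
# timestamp.
expand_aggregate <- function(z, key, FUN = mean, ...) {
  agg <- aggregate(z, key, FUN, ...)
  # Find each observation's key among the labels indexing the aggregate.
  pos <- match(as.character(key), as.character(index(agg)))
  zoo(as.numeric(coredata(agg))[pos], index(z))
}

index.hourly <- format(index(z), "%H")
z.hourly.expanded <- expand_aggregate(z, index.hourly)

# Both series now share the timestamp index, so merge() lines them up.
z.merged <- merge(z, z.hourly.expanded)
z.merged
```

Because the key is just a vector of labels, the same helper should work for a day-of-week-and-hour key such as `format(index(z), "%w %H")` without further changes.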

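How will you control the aggregate error by merging with the original data?
You might hope that code like this would work, but I suspect it may require constructing an intermediate column: merge(z, z.hourly, by.x = format(index(z), "%H"))
merge(z, z.dayhour, by.x = format(index(z), "%w %H"))
@agstudy I am not trying to "control the error". I just want a side-by-side comparison of the original data and the aggregated data, so that I can use the aggregate as a predictor of the original.
@DWin Thanks, I developed a solution based on this suggestion.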
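The side-by-side comparison described in the last comments could be sketched as follows. This example assumes the hourly mean is the predictor and uses root mean squared error as the summary measure; neither choice comes from the original post. It uses base R's `ave` to repeat each group mean at every observation, on a small synthetic series:

```r
library(zoo)

times <- as.POSIXct(c("2013-06-14 13:00:00", "2013-06-14 13:30:00",
                      "2013-06-14 14:00:00", "2013-06-15 13:00:00",
                      "2013-06-15 14:00:00", "2013-06-15 14:30:00"),
                    tz = "GMT")
z <- zoo(c(3, 5, 4, 1, 2, 6), times)

# Group mean repeated for every observation; ave() keeps the original order.
index.hourly <- format(index(z), "%H")
z.pred <- zoo(ave(coredata(z), index.hourly, FUN = mean), index(z))

# Residual when the hourly mean is used as a predictor of the raw series.
z.resid <- z - z.pred

# Actual, predicted and residual values side by side.
comparison <- merge(actual = z, predicted = z.pred, residual = z.resid)
comparison

# Root mean squared error as a single summary figure (an assumed metric).
rmse <- sqrt(mean(coredata(z.resid)^2))
rmse
```

Swapping `index.hourly` for another key gives the error of a different aggregation, which makes it straightforward to compare groupings against each other.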