Datetime: Merging aggregate data in R


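Following up on my earlier question about aggregating hourly data into daily data, I now want to (a) aggregate by month and (b) merge the monthly aggregates back into the original data frame.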

My original data frame looks like this:

Lines <- "Date,Outdoor,Indoor
01/01/2000 01:00,30,25
01/01/2000 02:00,31,26
01/01/2000 03:00,33,24
02/01/2000 01:00,29,25
02/01/2000 02:00,27,26
02/01/2000 03:00,39,24
12/01/2000 02:00,27,26
12/01/2000 03:00,39,24
12/31/2000 23:00,28,25"

The monthly aggregated data looks like this:

Lines <- "Date,Month,OutdoorAVE
01/01/2000,Jan,31.33
02/01/2000,Feb,31.67
12/01/2000,Dec,31.33"

I would like to merge the two so that the result looks like this:

Lines <- "Date,Outdoor,Indoor,Month,OutdoorAVE
01/01/2000 01:00,30,25,Jan,31.33
01/01/2000 02:00,31,26,Jan,31.33
01/01/2000 03:00,33,24,Jan,31.33
02/01/2000 01:00,29,25,Feb,31.67
02/01/2000 02:00,27,26,Feb,31.67
02/01/2000 03:00,39,24,Feb,31.67
12/01/2000 02:00,27,26,Dec,31.33
12/01/2000 03:00,39,24,Dec,31.33
12/31/2000 23:00,28,25,Dec,31.33"

Try ave and, e.g., POSIXlt to extract the month:

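# Read the hourly sample (the first Lines string above) into a data frame
zz <- textConnection(Lines)
Data <- read.table(zz, header=TRUE, sep=",", stringsAsFactors=FALSE)
close(zz)

# Abbreviated month name for each timestamp
Data$Month <- strftime(
     as.POSIXlt(Data$Date, format="%m/%d/%Y %H:%M"),
     format='%b')
# ave() returns the group mean repeated for every row of the group
Data$outdoor_ave <- ave(Data$Outdoor, Data$Month, FUN=mean)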
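
The %b format used above produces month abbreviations in the current locale, so the labels can differ between machines. The following is a minimal sketch of a locale-independent variant, assuming the sample data from the question. The names hourly and stamp and the choice of tz = "UTC" are illustrative assumptions, not part of the original answer.

```r
# Sample hourly data from the question (month/day/year timestamps).
hourly <- read.csv(text = "Date,Outdoor,Indoor
01/01/2000 01:00,30,25
01/01/2000 02:00,31,26
01/01/2000 03:00,33,24
02/01/2000 01:00,29,25
02/01/2000 02:00,27,26
02/01/2000 03:00,39,24
12/01/2000 02:00,27,26
12/01/2000 03:00,39,24
12/31/2000 23:00,28,25", stringsAsFactors = FALSE)

# Parse the timestamps; tz = "UTC" is an assumption made to avoid
# daylight-saving surprises.
stamp <- as.POSIXlt(hourly$Date, format = "%m/%d/%Y %H:%M", tz = "UTC")

# The mon field is zero-based (0 = January), so add 1 to index the
# built-in English abbreviations in month.abb.
hourly$Month <- month.abb[stamp$mon + 1]

# ave() returns one value per row: the mean of that row's month group.
hourly$OutdoorAVE <- ave(hourly$Outdoor, hourly$Month, FUN = mean)
```

Because ave() keeps one value per input row, no separate merge step is needed when the monthly mean is computed from the same data frame.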

Edit: Then just compute the month in Data as shown above, and use merge:

zz <- textConnection(Lines2) # Lines2 is the aggregated data
Data2 <- read.table(zz,header=T,sep=",",stringsAsFactors=F)
close(zz)

> merge(Data,Data2[-1],all=T)
  Month             Date Outdoor Indoor OutdoorAVE
1   Dec 12/01/2000 02:00      27     26      31.33
2   Dec 12/01/2000 03:00      39     24      31.33
3   Dec 12/31/2000 23:00      28     25      31.33
4   Feb 02/01/2000 01:00      29     25      31.67
5   Feb 02/01/2000 02:00      27     26      31.67
6   Feb 02/01/2000 03:00      39     24      31.67
7   Jan 01/01/2000 01:00      30     25      31.33
8   Jan 01/01/2000 02:00      31     26      31.33
9   Jan 01/01/2000 03:00      33     24      31.33
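
As the output above shows, merge() sorts the result by the key column, so the rows no longer follow the original time order. If the original order matters, one alternative is a lookup with match(). This is a sketch under stated assumptions: the data frames hourly and monthly are small hypothetical stand-ins for Data and Data2, and tz = "UTC" is an illustrative choice.

```r
# Hypothetical stand-ins for the hourly data and the monthly aggregates.
hourly <- data.frame(
  Date    = c("01/01/2000 01:00", "02/01/2000 01:00",
              "12/01/2000 02:00", "12/31/2000 23:00"),
  Outdoor = c(30, 29, 27, 28),
  stringsAsFactors = FALSE
)
monthly <- data.frame(
  Month      = c("Jan", "Feb", "Dec"),
  OutdoorAVE = c(31.33, 31.67, 31.33),
  stringsAsFactors = FALSE
)

# Derive the join key on the hourly side.
stamp <- as.POSIXlt(hourly$Date, format = "%m/%d/%Y %H:%M", tz = "UTC")
hourly$Month <- month.abb[stamp$mon + 1]

# match() returns, for each hourly row, the position of its month in
# `monthly`; indexing with it copies the average without reordering rows.
hourly$OutdoorAVE <- monthly$OutdoorAVE[match(hourly$Month, monthly$Month)]
```

Months that are missing from the monthly table yield NA, which plays the same role as all=TRUE in the merge() call.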

Here is a zoo/xts solution. Note that Month is numeric here because you cannot mix types in a zoo/xts object.

require(xts) # loads zoo too
Lines1 <- "Date,Outdoor,Indoor
01/01/2000 01:00,30,25
01/01/2000 02:00,31,26
01/01/2000 03:00,33,24
02/01/2000 01:00,29,25
02/01/2000 02:00,27,26
02/01/2000 03:00,39,24
12/01/2000 02:00,27,26
12/01/2000 03:00,39,24
12/31/2000 23:00,28,25"
con <- textConnection(Lines1)
z <- read.zoo(con, header=TRUE, sep=",",
    format="%m/%d/%Y %H:%M", FUN=as.POSIXct)
close(con)

zz <- merge(z, Month=.indexmon(z),
    OutdoorAVE=ave(z[,1], .indexmon(z), FUN=mean))
zz
#                     Outdoor Indoor Month OutdoorAVE
# 2000-01-01 01:00:00      30     25     0   31.33333
# 2000-01-01 02:00:00      31     26     0   31.33333
# 2000-01-01 03:00:00      33     24     0   31.33333
# 2000-02-01 01:00:00      29     25     1   31.66667
# 2000-02-01 02:00:00      27     26     1   31.66667
# 2000-02-01 03:00:00      39     24     1   31.66667
# 2000-12-01 02:00:00      27     26    11   31.33333
# 2000-12-01 03:00:00      39     24    11   31.33333
# 2000-12-31 23:00:00      28     25    11   31.33333
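
The numeric Month column above is zero-based (0 = January, 11 = December), like the mon field of POSIXlt. One caveat, shown here as a base-R sketch with made-up values: grouping on the month number alone pools the same month from different years, so a year-month key may be preferable for multi-year data. The vectors stamp and outdoor are hypothetical and not taken from the question.

```r
# Hypothetical two-year sample: two January readings in each year.
stamp <- as.POSIXct(c("2000-01-01 01:00", "2000-01-01 02:00",
                      "2001-01-01 01:00", "2001-01-01 02:00"), tz = "UTC")
outdoor <- c(30, 32, 10, 12)

# Zero-based month number: all four rows fall into group 0.
mon0 <- as.POSIXlt(stamp)$mon

# Year-month key keeps January 2000 and January 2001 apart.
ym <- format(stamp, "%Y-%m")

by_month      <- ave(outdoor, mon0)  # one mean across both years
by_year_month <- ave(outdoor, ym)    # one mean per calendar month
```

With a single year of data, as in the question, both keys give the same groups.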

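This is not directly related to your question, but you may want to use RSQLite and keep the various aggregate values in separate tables, joining those tables with simple SQL commands. If you use several kinds of aggregation, your data frame can easily become large and ugly.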
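
A minimal sketch of what the RSQLite suggestion could look like, assuming the DBI and RSQLite packages are installed. SQLite has no dedicated date type, so the timestamps are stored here as ISO 8601 text, which SQLite's strftime() can parse. The table names hourly and monthly, the column Stamp, and the in-memory database are illustrative assumptions.

```r
library(DBI)  # the SQLite driver itself comes from the RSQLite package

hourly <- read.csv(text = "Date,Outdoor,Indoor
01/01/2000 01:00,30,25
01/01/2000 02:00,31,26
01/01/2000 03:00,33,24
02/01/2000 01:00,29,25
02/01/2000 02:00,27,26
02/01/2000 03:00,39,24
12/01/2000 02:00,27,26
12/01/2000 03:00,39,24
12/31/2000 23:00,28,25", stringsAsFactors = FALSE)

# Store timestamps as ISO 8601 text so SQLite date functions understand them.
stamp <- as.POSIXct(hourly$Date, format = "%m/%d/%Y %H:%M", tz = "UTC")
hourly$Stamp <- format(stamp, "%Y-%m-%d %H:%M:%S")

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "hourly", hourly)

# Keep the monthly aggregate in its own table.
dbExecute(con, "
  CREATE TABLE monthly AS
  SELECT strftime('%Y-%m', Stamp) AS Month, AVG(Outdoor) AS OutdoorAVE
  FROM hourly
  GROUP BY strftime('%Y-%m', Stamp)")

# Join the aggregate back onto the hourly rows only when it is needed.
joined <- dbGetQuery(con, "
  SELECT h.Date, h.Outdoor, h.Indoor, m.Month, m.OutdoorAVE
  FROM hourly AS h
  JOIN monthly AS m ON strftime('%Y-%m', h.Stamp) = m.Month
  ORDER BY h.Stamp")

dbDisconnect(con)
```

Grouping by day instead of month would only require changing the strftime() pattern to '%Y-%m-%d'.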

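@Joris Meys: My problem is that OutdoorAVE lives in another data frame (say Data.Month) that has only the month and average columns, with just 12 rows per year. OutdoorAVE is not computed the way you wrote above, but from sums of the year's hourly data (first daily, then monthly). So I want to add a column to the original data frame (Data in the example above) that is taken from another data set (e.g. Data.Monthly).

@ery: In your comment you say Data.Monthly has only 2 columns (month and average), but in your original question it has 3.

@Joshua: Yes, it should be 3 columns, although I am only interested in pasting the OutdoorAVE column back into the original data frame.

@Joris: Yes, that works, thanks. My next problem is apparently that my timestamps run 01-24 rather than 00-23, so I get an NA entry every 24 rows. Is there a good solution?

@ery: Just run Data$Date...

@ery: Please see my edit. I don't know why the month is always 11... Maybe your example data and your real data differ?

This is a nice solution, but I don't even know how to import date/time values into SQLite, let alone group them by month or day. Any help?

Your whole problem can be solved with SQL. I am not saying you should use SQL for the aggregation, but it is useful to understand how it works (it is fairly simple). I suggest you read up on it and ask about it in a separate question.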
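
For the 01-24 hour labels raised in the comments, one possible approach is sketched below. It assumes that each label marks the end of the hour, so hour 24 belongs to the same calendar day, and it shifts every timestamp back by one hour to obtain 00-23. Both that interpretation and the sample values are assumptions, and this is not the command that was suggested in the comment above.

```r
# Hypothetical timestamps using hour labels 01-24.
raw <- c("01/01/2000 23:00", "01/01/2000 24:00", "01/02/2000 01:00")

# Parse the date and the hour label separately, which avoids relying on
# how "24:00" is parsed.
day  <- as.POSIXct(substr(raw, 1, 10), format = "%m/%d/%Y", tz = "UTC")
hour <- as.integer(substr(raw, 12, 13))

# Start of each interval: hour labels 1..24 become offsets 0..23 hours.
start <- day + (hour - 1L) * 3600

fixed <- format(start, "%m/%d/%Y %H:%M")
```

With this convention the reading labelled 24:00 stays in its own day and month, which matters when aggregating by day or by month.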
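Edit: If the monthly averages come from a separate data set, read it into a second zoo object and merge it with the hourly series:

Lines2 <- "Date,Month,OutdoorAVE
01/01/2000,Jan,31.33
02/01/2000,Feb,31.67
12/01/2000,Dec,31.33"
con <- textConnection(Lines2)
# colClasses drops the character Month column, since zoo/xts cannot mix types
z2 <- read.zoo(con, header=TRUE, sep=",", format="%m/%d/%Y",
    FUN=as.POSIXct, colClasses=c("character","NULL","numeric"))
close(con)

# z is the hourly zoo object read from Lines1 above;
# na.locf carries each monthly value forward over the hourly rows
zz2 <- na.locf(merge(z, Month=.indexmon(z), OutdoorAVE=z2))[index(z)]
# same output as zz (above)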