Merging aggregated data in R
Following up on my earlier question about aggregating hourly data into daily data, I want to continue with (a) monthly aggregation and (b) merging the monthly aggregates back into the original data frame. My original data frame looks like this:
Lines <- "Date,Outdoor,Indoor
01/01/2000 01:00,30,25
01/01/2000 02:00,31,26
01/01/2000 03:00,33,24
02/01/2000 01:00,29,25
02/01/2000 02:00,27,26
02/01/2000 03:00,39,24
12/01/2000 02:00,27,26
12/01/2000 03:00,39,24
12/31/2000 23:00,28,25"
My monthly aggregates look like this:
Lines2 <- "Date,Month,OutdoorAVE
01/01/2000,Jan,31.33
02/01/2000,Feb,31.67
12/01/2000,Dec,31.33"
I want to merge the two so that I get something like this:
Lines <- "Date,Outdoor,Indoor,Month,OutdoorAVE
01/01/2000 01:00,30,25,Jan,31.33
01/01/2000 02:00,31,26,Jan,31.33
01/01/2000 03:00,33,24,Jan,31.33
02/01/2000 01:00,29,25,Feb,31.67
02/01/2000 02:00,27,26,Feb,31.67
02/01/2000 03:00,39,24,Feb,31.67
12/01/2000 02:00,27,26,Dec,31.33
12/01/2000 03:00,39,24,Dec,31.33
12/31/2000 23:00,28,25,Dec,31.33"
Try ave, and e.g. POSIXlt to extract the month:
zz <- textConnection(Lines)
Data <- read.table(zz,header=T,sep=",",stringsAsFactors=F)
close(zz)
Data$Month <- strftime(
as.POSIXlt(Data$Date,format="%m/%d/%Y %H:%M"),
format='%b')
Data$outdoor_ave <- ave(Data$Outdoor,Data$Month,FUN=mean)
Edit: Then just compute the month in Data as shown above and use merge:
zz <- textConnection(Lines2) # Lines2 is the aggregated data
Data2 <- read.table(zz,header=T,sep=",",stringsAsFactors=F)
close(zz)
> merge(Data,Data2[-1],all=T)
Month Date Outdoor Indoor OutdoorAVE
1 Dec 12/01/2000 02:00 27 26 31.33
2 Dec 12/01/2000 03:00 39 24 31.33
3 Dec 12/31/2000 23:00 28 25 31.33
4 Feb 02/01/2000 01:00 29 25 31.67
5 Feb 02/01/2000 02:00 27 26 31.67
6 Feb 02/01/2000 03:00 39 24 31.67
7 Jan 01/01/2000 01:00 30 25 31.33
8 Jan 01/01/2000 02:00 31 26 31.33
9 Jan 01/01/2000 03:00 33 24 31.33
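In the comments, the asker says the real monthly figure is built from daily totals rather than averaged directly over the hourly rows. The sketch below shows one way to do that in base R, assuming "monthly average" means the mean of the daily sums. The names Day, Daily, Monthly and Merged are illustrative and not from the thread. The values it produces differ from the 31.33 / 31.67 figures in the sample table, which are plain hourly means.

```r
# Hourly sample data from the question
Lines <- "Date,Outdoor,Indoor
01/01/2000 01:00,30,25
01/01/2000 02:00,31,26
01/01/2000 03:00,33,24
02/01/2000 01:00,29,25
02/01/2000 02:00,27,26
02/01/2000 03:00,39,24
12/01/2000 02:00,27,26
12/01/2000 03:00,39,24
12/31/2000 23:00,28,25"
Data <- read.csv(text = Lines, stringsAsFactors = FALSE)

# Parse the timestamps once and derive grouping keys.
# A numeric month string ("01".."12") avoids the locale dependence of "%b".
stamp <- as.POSIXct(Data$Date, format = "%m/%d/%Y %H:%M", tz = "UTC")
Data$Day   <- format(stamp, "%Y-%m-%d")
Data$Month <- format(stamp, "%m")

# Step 1: daily totals (assumption: the daily figure is a sum of hourly values)
Daily <- aggregate(Outdoor ~ Day + Month, data = Data, FUN = sum)

# Step 2: monthly mean of the daily totals
Monthly <- aggregate(Outdoor ~ Month, data = Daily, FUN = mean)
names(Monthly)[names(Monthly) == "Outdoor"] <- "OutdoorAVE"

# Step 3: attach the monthly value to every hourly row
Merged <- merge(Data, Monthly, by = "Month", all.x = TRUE)
```

Using all.x = TRUE keeps every hourly row even when a month has no entry in Monthly; those rows get NA in OutdoorAVE.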
Here is a zoo/xts solution. Note that Month is numeric here, because you cannot mix types in a zoo/xts object.
require(xts) # loads zoo too
Lines1 <- "Date,Outdoor,Indoor
01/01/2000 01:00,30,25
01/01/2000 02:00,31,26
01/01/2000 03:00,33,24
02/01/2000 01:00,29,25
02/01/2000 02:00,27,26
02/01/2000 03:00,39,24
12/01/2000 02:00,27,26
12/01/2000 03:00,39,24
12/31/2000 23:00,28,25"
con <- textConnection(Lines1)
z <- read.zoo(con, header=TRUE, sep=",",
format="%m/%d/%Y %H:%M", FUN=as.POSIXct)
close(con)
zz <- merge(z, Month=.indexmon(z),
OutdoorAVE=ave(z[,1], .indexmon(z), FUN=mean))
zz
# Outdoor Indoor Month OutdoorAVE
# 2000-01-01 01:00:00 30 25 0 31.33333
# 2000-01-01 02:00:00 31 26 0 31.33333
# 2000-01-01 03:00:00 33 24 0 31.33333
# 2000-02-01 01:00:00 29 25 1 31.66667
# 2000-02-01 02:00:00 27 26 1 31.66667
# 2000-02-01 03:00:00 39 24 1 31.66667
# 2000-12-01 02:00:00 27 26 11 31.33333
# 2000-12-01 03:00:00 39 24 11 31.33333
# 2000-12-31 23:00:00 28 25 11 31.33333
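Comments:
This is unrelated to your question, but you might want to use RSQLite with separate tables for the various aggregated values, and join those tables with simple SQL commands. If you use many kinds of aggregation, your data frame can easily become large and ugly.
@Joris Meys: My problem is that OutdoorAVE lives in another data frame (say Data.Month), which has only the month and average columns, and only 12 rows per year. The outdoor average is not computed the way you wrote above, but from sums of the hourly data over the year (daily, then monthly). So I want to add a column to the original data frame (Data in the example above) that is taken from another data set (e.g. Data.Monthly).
@ery: In your comment you say Data.Monthly has only 2 columns (month and average), but in your original question it has 3.
@Joshua: Yes, it should be 3 columns, although I am only interested in pasting the outdoor column back into the original data frame.
@Joris: Yes, that works, thanks. My next problem, apparently, is that my timestamps do not run 00-23 but 01-24, so I get an NA entry every 24 rows. Is there a good solution?
@ery: Just run Data$Date
@ery: Please see my edit. I don't know why your month is always 11... Maybe your example data and your actual data are different?
This is a nice solution, but I don't even know how to import dates/times into SQLite, let alone group them by month or day. Any help?
Your whole problem can be solved with SQL. I am not saying you should use SQL for the aggregation, but it is useful to understand how it works (it is fairly simple). I suggest you read up on it and ask a separate question along those lines.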
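The reply to the 01-24 timestamp question is cut off in this thread, so here is a minimal sketch of one possible approach, not the answer originally given. It assumes the stamps use the same "%m/%d/%Y %H:%M" layout as the sample data and that hour 24 means midnight at the start of the next day. The helper name parse_hour_ending is hypothetical.

```r
# Hypothetical helper: parse "mm/dd/YYYY HH:MM" stamps whose hours run 01-24.
# Hour 24 is rewritten to 00 and the result is rolled forward by one day.
parse_hour_ending <- function(x, tz = "UTC") {
  is24 <- grepl(" 24:", x, fixed = TRUE)
  x[is24] <- sub(" 24:", " 00:", x[is24], fixed = TRUE)
  out <- as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = tz)
  out[is24] <- out[is24] + 24 * 60 * 60  # move to midnight of the following day
  out
}

stamps <- parse_hour_ending(c("01/01/2000 23:00",
                              "01/01/2000 24:00",
                              "12/31/2000 24:00"))
format(stamps, "%Y-%m-%d %H:%M")
# "2000-01-01 23:00" "2000-01-02 00:00" "2001-01-01 00:00"
```

This moves each hour-24 reading into the next day (and possibly the next month or year), which changes which group it falls into when aggregating. If the reading should count toward the day it is labelled with, subtracting one hour from every parsed stamp before grouping may be the better choice.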
# Edit: merge a separately computed monthly aggregate (Lines2) into the hourly series z
Lines2 <- "Date,Month,OutdoorAVE
01/01/2000,Jan,31.33
02/01/2000,Feb,31.67
12/01/2000,Dec,31.33"
con <- textConnection(Lines2)
z2 <- read.zoo(con, header=TRUE, sep=",", format="%m/%d/%Y",
FUN=as.POSIXct, colClasses=c("character","NULL","numeric"))
close(con)
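# z is the hourly series read from Lines1; na.locf() carries each monthly value forward onto the hourly rows
zz2 <- na.locf(merge(z, Month=.indexmon(z), OutdoorAVE=z2))[index(z)]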
# same output as zz (above)