Average of multiple values in R (tags: r, aggregate, average, tapply)


Suppose I generate some data with this code:

month<-c(rep(1,7),rep(2,7),rep(3,7))
date<-rep(c(rep(1,2),rep(2,3),rep(3,2)),3)
value<-rnorm(21)
df<-cbind(month,date,value)

How can I compute the average value for each date within a month?

In this case, I would like my output to look like this:

month date   avgvalue
1      1     -1.27589
1      2     -0.267649
1      3     0.66798947
2      1     0.590321
 ...

Thanks a lot for your help :)

library("plyr")
df

You can use aggregate:
aggregate(df[,3], by=list(month=df[,1], date=df[,2]), mean)
#   month date          x
# 1     1    1  0.5661431
# 2     2    1  0.1843661
# 3     3    1  1.8339898
# 4     1    2  1.2053077
# 5     2    2 -0.2575551
# 6     3    2 -0.4464268
# 7     1    3 -0.7154689
# 8     2    3  0.7895702
# 9     3    3  0.4853081
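
If you want the result to look like the layout in the question (a column called avgvalue, rows ordered by month and then date), one possible sketch is below. It assumes the data is held in a data frame rather than the cbind matrix; the seed 42 and the object names dat and res are my own choices for illustration, not part of the original answer.

```r
set.seed(42)  # arbitrary seed, only so the numbers are repeatable
month <- c(rep(1, 7), rep(2, 7), rep(3, 7))
date <- rep(c(rep(1, 2), rep(2, 3), rep(3, 2)), 3)
value <- rnorm(21)
dat <- data.frame(month, date, value)  # hypothetical name for the data frame

# Mean of value for every (month, date) combination
res <- aggregate(dat["value"],
                 by = list(month = dat$month, date = dat$date),
                 FUN = mean)

# Rename the aggregated column and sort by month, then date
names(res)[names(res) == "value"] <- "avgvalue"
res <- res[order(res$month, res$date), ]
rownames(res) <- NULL
res
```

The sort step is needed because aggregate varies the first grouping variable fastest, which is why the output above lists all months for date 1 before moving on to date 2.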

You tagged your question with tapply, so here is a tapply answer:
tapply(df[, "value"], INDEX=list(df[, "month"], df[, "date"]), FUN=mean)
#             1          2           3
# 1 -0.42965680  0.6943236  0.04505399
# 2  0.55021401 -0.3138895 -0.40966078
# 3  0.05676266  0.5212944  0.12521106

data.frame(as.table(
  tapply(df[, "value"], INDEX=list(df[, "month"], df[, "date"]), FUN=mean)))
#   Var1 Var2        Freq
# 1    1    1 -0.42965680
# 2    2    1  0.55021401
# 3    3    1  0.05676266
# 4    1    2  0.69432363
# 5    2    2 -0.31388954
# 6    3    2  0.52129439
# 7    1    3  0.04505399
# 8    2    3 -0.40966078
# 9    3    3  0.12521106
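
The Var1 / Var2 / Freq column names can be avoided. A minimal sketch, assuming a data frame instead of the matrix: naming the elements of the INDEX list gives the resulting array named dimensions, and the responseName argument of as.data.frame for tables sets the name of the value column. The seed and the names dat, means and long are illustrative only.

```r
set.seed(42)  # arbitrary seed for repeatable numbers
month <- c(rep(1, 7), rep(2, 7), rep(3, 7))
date <- rep(c(rep(1, 2), rep(2, 3), rep(3, 2)), 3)
value <- rnorm(21)
dat <- data.frame(month, date, value)

# Named INDEX entries become the names of the array dimensions
means <- tapply(dat$value,
                INDEX = list(month = dat$month, date = dat$date),
                FUN = mean)

# Long format with meaningful column names; month and date come back as factors
long <- as.data.frame(as.table(means), responseName = "avgvalue")
long
```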

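However, the more common approaches are aggregate (already mentioned), plyr (already mentioned), data.table, and (more recently) dplyr. The data.table and dplyr approaches look like this: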
library(data.table)
DT <- data.table(df)
DT[, mean(value), by = list(month, date)]


library(dplyr)
DF <- data.frame(df)
# %.% was the chaining operator in early dplyr releases; current versions use %>%
DF %>% group_by(month, date) %>% summarise(mean(value))

But they all get you to the same place.
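
For completeness, here is a hedged sketch of the same two approaches with the summary column given an explicit name. It assumes reasonably recent versions of data.table and dplyr are installed (the .groups argument needs dplyr 1.0.0 or later); each part is skipped if the package is missing. The names dat, dt_res, dp_res and avgvalue are illustrative.

```r
set.seed(42)  # arbitrary seed for repeatable numbers
month <- c(rep(1, 7), rep(2, 7), rep(3, 7))
date <- rep(c(rep(1, 2), rep(2, 3), rep(3, 2)), 3)
value <- rnorm(21)
dat <- data.frame(month, date, value)

# Base R reference result for comparison
base_res <- aggregate(value ~ month + date, data = dat, FUN = mean)

if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  DT <- as.data.table(dat)
  # .() is data.table shorthand for list(); groups appear in order of first occurrence
  dt_res <- DT[, .(avgvalue = mean(value)), by = .(month, date)]
  print(dt_res)
}

if (requireNamespace("dplyr", quietly = TRUE)) {
  # Namespace-qualified calls avoid masking data.table functions
  dp_res <- dplyr::summarise(dplyr::group_by(dat, month, date),
                             avgvalue = mean(value), .groups = "drop")
  print(dp_res)
}
```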
The formula method also works well here: aggregate(value ~ month + date, df, mean).
If you want to provide a reproducible sample but still use random numbers, it is best to call set.seed() before creating the sample data.
Thank you so much! I like that this code runs faster than the aggregate code!
Thank you so much for your detailed answer! I like all the different options, especially since I am looking for the fastest method :)
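
To illustrate the two suggestions from the comments together, here is a small sketch: set.seed() is called before rnorm() so that anyone running the code gets the same data, and the formula interface of aggregate does the grouping. The seed value 123 and the object names are arbitrary choices of mine.

```r
month <- c(rep(1, 7), rep(2, 7), rep(3, 7))
date <- rep(c(rep(1, 2), rep(2, 3), rep(3, 2)), 3)

set.seed(123)  # arbitrary seed; any fixed value makes the sample repeatable
dat <- data.frame(month, date, value = rnorm(21))
first_run <- aggregate(value ~ month + date, data = dat, FUN = mean)

set.seed(123)  # resetting the same seed regenerates identical random numbers
dat2 <- data.frame(month, date, value = rnorm(21))
second_run <- aggregate(value ~ month + date, data = dat2, FUN = mean)

identical(first_run, second_run)
```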