R: weighted mean by date


I have the following data frame:

df = data.frame(date = c("26/06/2013", "26/06/2013",  "26/06/2013",  "27/06/2013", "27/06/2013", "27/06/2013", "28/06/2013", "28/06/2013",   "28/06/2013"), return = c(".51", ".32", ".34", ".39", "1.1", "3.2", "2.1", "5.3", "2.1"), cap = c("500", "235", "392", "213", "134", "144", "232", "155", "213"), weight = c("0.443655723", "0.20851819", "0.347826087", "0.433808554", "0.272912424", "0.293279022", "0.386666667", "0.258333333", "0.355"))
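I would like to compute:

1) The last column, "weight". This is the weight of each "cap" value within its day.

2) The cap-weighted mean of "return" for each day. I would like to get the following output: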

result = data.frame(date = c("26/06/2013", "27/06/2013", "28/06/2013"), cap.weight.mean = c("0.411251109", "1.407881874", "2.926666667"))

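If necessary, first convert the factors to numeric:

# Going through as.character() works whether the columns are factors or character strings
df$return=as.numeric(as.character(df$return))
df$cap=as.numeric(as.character(df$cap))
df$weight=as.numeric(as.character(df$weight))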
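
The reason for the conversion step is that as.numeric() applied directly to a factor returns the internal level codes, not the values the labels spell out. A small sketch on a made-up factor `f` (not part of the question's data) shows the two idioms that recover the actual numbers:

```r
# Hypothetical factor, used only to illustrate the conversion
f <- factor(c(".51", ".32", "1.1"))

# Integer level codes: NOT the numbers written in the labels
codes <- as.numeric(f)

# Two ways to recover the real values
values_a <- as.numeric(as.character(f))
values_b <- as.numeric(levels(f))[f]
```

Both `values_a` and `values_b` hold 0.51, 0.32 and 1.1, while `codes` holds small integers that depend on how the levels were sorted.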
Question 1)
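
The code for this part did not survive in the text. As a minimal sketch of one way to do it, assuming `cap` is already numeric, base R's ave() can divide each cap by the total cap of its date. The data frame is rebuilt here from the question's values so the block runs on its own.

```r
# Question's data, rebuilt with numeric columns (assumption: same values as above)
df <- data.frame(
  date   = rep(c("26/06/2013", "27/06/2013", "28/06/2013"), each = 3),
  return = c(0.51, 0.32, 0.34, 0.39, 1.1, 3.2, 2.1, 5.3, 2.1),
  cap    = c(500, 235, 392, 213, 134, 144, 232, 155, 213)
)

# Weight of each row = its cap divided by the summed cap of the same date
df$weight <- ave(df$cap, df$date, FUN = function(x) x / sum(x))
```

For the first row this gives 500 / (500 + 235 + 392), which matches the 0.443655723 listed in the question, and the weights within each date sum to 1.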

Question 2)
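
The code for this part is also missing from the text. One possible base R sketch, not necessarily what the original answer used, splits the data by date and applies weighted.mean() to each piece. Since weighted.mean() normalises its weights, passing `cap` directly gives the same result as passing the precomputed `weight` column.

```r
# Question's data, rebuilt with numeric columns (assumption: same values as above)
df <- data.frame(
  date   = rep(c("26/06/2013", "27/06/2013", "28/06/2013"), each = 3),
  return = c(0.51, 0.32, 0.34, 0.39, 1.1, 3.2, 2.1, 5.3, 2.1),
  cap    = c(500, 235, 392, 213, 134, 144, 232, 155, 213)
)

# One sub-data-frame per date
per_day <- split(df, df$date)

# Cap-weighted mean of return within each date
wm <- sapply(per_day, function(d) weighted.mean(d$return, w = d$cap))

# Assemble a data frame shaped like the desired `result`
result <- data.frame(date = names(wm), cap.weight.mean = unname(wm))
```

As a check on the arithmetic, the first date works out to (0.51*500 + 0.32*235 + 0.34*392) / 1127 = 463.48 / 1127, roughly 0.41125, in line with the desired output.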


Here is another option, using by().

After converting to numeric as described above:

R> by(df, df$date, FUN = function(x) weighted.mean(x$return, w = x$weight) )
df$date: 26/06/2013
[1] 0.4112511
------------------------------------------------------------ 
df$date: 27/06/2013
[1] 1.407882
------------------------------------------------------------ 
df$date: 28/06/2013
[1] 2.926667
This gives the information in your result data.frame, which I guess is what you are looking for.
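
Note that by() returns an object of class "by" and not a data frame. If an actual data frame like `result` is needed, a small sketch such as the following could reshape it; the column name `cap.weight.mean` is taken from the question, and the data frame is rebuilt here so the block is self-contained.

```r
# Question's data, rebuilt with numeric columns (assumption: same values as above)
df <- data.frame(
  date   = rep(c("26/06/2013", "27/06/2013", "28/06/2013"), each = 3),
  return = c(0.51, 0.32, 0.34, 0.39, 1.1, 3.2, 2.1, 5.3, 2.1),
  weight = c(0.443655723, 0.20851819, 0.347826087,
             0.433808554, 0.272912424, 0.293279022,
             0.386666667, 0.258333333, 0.355)
)

# Same call as in the answer above
res <- by(df, df$date, FUN = function(x) weighted.mean(x$return, w = x$weight))

# The group labels are stored in the dimnames; the values are the array contents
result <- data.frame(
  date            = dimnames(res)[[1]],
  cap.weight.mean = as.vector(res)
)
```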

Here is another solution, using memisc:::aggregate.formula:

> library(memisc)
> aggregate(weighted.mean(return, weight) ~ date, data = df)
        date weighted.mean(return, weight)
1 26/06/2013                     0.4112511
4 27/06/2013                     1.4078819
7 28/06/2013                     2.9266667

Another possibility, using plyr functions:

library(plyr)
# Change factor to numeric
> df[,-1]<-sapply(df[,-1],function(x){as.numeric(as.character(x))})
> ddply(df,.(date),summarize,cap.weight.mean=weighted.mean(return,weight))
        date cap.weight.mean
1 26/06/2013       0.4112511
2 27/06/2013       1.4078819
3 28/06/2013       2.9266667

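Hello, and welcome to SO. Could you elaborate on your question? Specifically, what do you mean by "the last column weight"? weight is not the last column of df. Also, what do you mean by the "weighted cap mean of return"?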