R: Computing cummean() and cumsd() while ignoring NA values and filling the NAs


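My goal is to get the cumulative mean (and the cumulative sd) of a data frame while ignoring NAs and filling them with the previous cumulative mean: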

df:

var1  var2  var3
x1    y1    z1
x2    y2    z2
NA    NA    NA
x3    y3    z3

Cumulative mean:

var1           var2         var3   
 x1/1          y1/1          z1/1    
(x1+x2)/2     (y1+y2)/2     (z1+z2)/2
(x1+x2)/2     (y1+y2)/2     (z1+z2)/2
(x1+x2+x3)/3  (y1+y2+y3)/3  (z1+z2+z3)/3 
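So for row 3, where df has NA, I want the new matrix to contain the cumulative mean from the row above, and neither the running sum nor the count should increase.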

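So far I am using this to compute the cumulative mean (I know a baby seal gets killed somewhere because I used a for loop and not something from the apply family):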

for (i in names(df)) {
  # cumulative mean over the non-NA entries of column i; NA rows are not touched
  df[i][!is.na(df[i])] <- GMCM:::cummean(df[i][!is.na(df[i])])
}
I have also tried:

library(data.table)
setDT(posRegimeReturns)
cols <- colnames(posRegimeReturns)
# apply cummean() to every column in place
posRegimeReturns[, (cols) := lapply(.SD, cummean), .SDcols = cols]
But both approaches leave the NAs unfilled.

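Note: this question is similar to this post, but unlike the solution there, I do not want to leave the NAs in place. I want to fill each NA with the value from the last non-NA row above it.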
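To make the symbolic table concrete, here is a minimal base R sketch with a hypothetical column x = c(1, 2, NA, 3). The values are an assumption for illustration, not the question's data. It shows what a cumulative mean over only the non-NA entries gives, next to the result the question asks for.

```r
# Hypothetical column; row 3 is NA, as in the question's layout.
x <- c(1, 2, NA, 3)

# Cumulative mean over the non-NA entries only, written back in place.
# This mirrors the for-loop approach: the NA row is never touched.
ok <- !is.na(x)
current <- x
current[ok] <- cumsum(x[ok]) / seq_len(sum(ok))

# What the question asks for: the NA row repeats the previous cumulative mean.
desired <- c(1, 1.5, 1.5, 2)

print(current)  # 1, 1.5, NA, 2
print(desired)  # 1, 1.5, 1.5, 2
```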

You might want to use the definition of variance to calculate this:

library(data.table)
dt <- data.table(V1=c(1,2,NA,3), V2=c(1,2,NA,3), V3=c(1,2,NA,3))

cols <- copy(names(dt))

#means
dt[ , paste0("mean_",cols) := lapply(.SD, function(x) {
    #get the num of non-NA observations
    lens <- cumsum(!is.na(x))

    #set NA to 0 before doing cumulative sum
    x[is.na(x)] <- 0
    cumsum(x) / lens
}), .SDcols=cols]

#sd
dt[ , paste0("sd_",cols) := lapply(.SD, function(x) {
    lens <- cumsum(!is.na(x))
    x[is.na(x)] <- 0

    #sample variance = n/(n-1) * (mean of squares - square of the mean);
    #the n/(n-1) factor turns the denominator n into n-1
    sqrt(lens/(lens-1) * (cumsum(x^2)/lens - (cumsum(x) / lens)^2))
}), .SDcols=cols]
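As a sanity check on the formulas above, the following base R sketch compares the closed-form cumulative mean and sd against a brute-force recomputation over each prefix with na.rm = TRUE. The helper names cum_mean_na and cum_sd_na and the test vector are assumptions made for this illustration; they are not part of the answer.

```r
# Hypothetical helpers (names are not from the answer) wrapping the same formulas.
cum_mean_na <- function(x) {
  n <- cumsum(!is.na(x))      # running count of non-NA observations
  x[is.na(x)] <- 0            # NAs contribute nothing to the running sum
  cumsum(x) / n
}

cum_sd_na <- function(x) {
  n <- cumsum(!is.na(x))
  x[is.na(x)] <- 0
  # sample variance = n/(n-1) * (mean of squares - square of the mean)
  sqrt(n / (n - 1) * (cumsum(x^2) / n - (cumsum(x) / n)^2))
}

x <- c(1, 2, NA, 3, 7, NA, 4)   # illustrative data only

# Brute force: recompute over every prefix, dropping NAs.
brute_mean <- sapply(seq_along(x), function(i) mean(x[1:i], na.rm = TRUE))
brute_sd   <- sapply(seq_along(x), function(i) sd(x[1:i], na.rm = TRUE))

# The first sd is undefined in both versions, so compare from the second row on.
print(all.equal(cum_mean_na(x), brute_mean))
print(all.equal(cum_sd_na(x)[-1], brute_sd[-1]))
```

The one-pass formula subtracts two quantities of similar size, so it can lose precision when the mean is large relative to the spread. For such data the brute-force version is slower but safer.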
Using data.table. In particular:

 library(data.table)
 N <- 10  # matches the 10 rows shown below
 DT <- data.table(z = sample(N), idx = 1:N, key = "idx")

     z  idx
 1:  4   1
 2: 10   2
 3:  9   3
 4:  6   4
 5:  1   5
 6:  8   6
 7:  3   7
 8:  7   8
 9:  5   9  
10:  2  10
Then compute the mean and sd over each prefix z[1:i], dropping NAs:

DT[, cummean := sapply(seq_len(nrow(DT)), function(iii) mean(DT$z[1:iii], na.rm = TRUE))]
DT[, cumsd := sapply(seq_len(nrow(DT)), function(iii) sd(DT$z[1:iii], na.rm = TRUE))]
Resulting in:

             z idx  cummean    cumsd
         1:  4   1 4.000000       NA
         2: 10   2 7.000000 4.242641
         3:  9   3 7.666667 3.214550
         4:  6   4 7.250000 2.753785
         5:  1   5 6.000000 3.674235
         6:  8   6 6.333333 3.386247
         7:  3   7 5.857143 3.338092
         8:  7   8 6.000000 3.116775
         9:  5   9 5.888889 2.934469
        10:  2  10 5.500000 3.027650
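The example above contains no missing values, so here is a small base R sketch of the same prefix idea applied to every column of a data frame that does have an NA row. The data frame df and the helper prefix_stat are hypothetical, introduced only for illustration.

```r
# Hypothetical input: three columns, row 3 entirely NA.
df <- data.frame(var1 = c(1, 2, NA, 3),
                 var2 = c(2, 4, NA, 6),
                 var3 = c(10, 20, NA, 30))

# Apply `stat` to each prefix x[1:i] of a vector, dropping NAs.
# Quadratic in the vector length, so intended for modest sizes only.
prefix_stat <- function(x, stat) {
  vapply(seq_along(x), function(i) stat(x[seq_len(i)], na.rm = TRUE), numeric(1))
}

cum_means <- as.data.frame(lapply(df, prefix_stat, stat = mean))
cum_sds   <- as.data.frame(lapply(df, prefix_stat, stat = sd))

print(cum_means)  # row 3 repeats row 2 in every column
print(cum_sds)    # row 1 is NA: the sd of a single value is undefined
```

Because na.rm = TRUE drops the NA from each prefix, the NA row carries forward the previous cumulative value without any separate fill step.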