R: How to calculate daily means over a 31-day window across 15 years, taking missing values into account


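This question was marked as a duplicate. I don't think it is one, because averaging over a time span measured in days across several years, together with the handling of missing data, was not addressed elsewhere. I have written an answer that I am not allowed to post on the original question, so I am posting it here.

Based on 15 years of daily data from 1993 to 2008: how can I compute, for each day of the year, the daily mean of the variable Open in the file, using a 31-day window centred on the day of interest? Thus 15 × 31 = 465 dates contribute to the statistic for a single day. The output for the 15 years is only 365 values.

The file can be downloaded from the following location:

Load packages and data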

library(lubridate)
library(dplyr)
dtf <- read.csv("http://chart.yahoo.com/table.csv?s=sbux&a=2&b=01&c=1993&d=2&e=01&f=2008&g=d&q=q&y=0&z=sbux&x=.csv", stringsAsFactors = FALSE)
# I prefer lower case column names
names(dtf) <- tolower(names(dtf))

Add the minus15 and plus15 dates to the dataset. These are the time bounds used to compute the mean for a given day in a given year.

dtf <- dtf %>% 
    mutate(date = ymd(date),
           minus15 = date - ddays(15),
           plus15 = date + ddays(15),
           monthday = substr(as.character(date),6,10),
           year = year(date),
           plotdate = ymd(paste(2008,monthday,sep="-"))) 

calendardays <- dtf %>% 
    select(monthday) %>% 
    distinct() %>%
    arrange(monthday)
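To make the window bounds concrete, here is a minimal sketch using base R Date arithmetic. It assumes that subtracting 15 from a Date gives the same result as `date - ddays(15)` in the code above; the two example dates are chosen only for illustration.

```r
# Same calendar day (1 March) in a non-leap year and a leap year
centres <- as.Date(c("2007-03-01", "2008-03-01"))

# Window bounds: 15 days before and 15 days after each centre
lowerbound <- centres - 15
upperbound <- centres + 15
format(lowerbound)  # "2007-02-14" "2008-02-15"

# The full 31-day window for 2008: lower bound plus 0 to 30 days
window2008 <- seq(lowerbound[2], by = "day", length.out = 31)
```

Because 2008 has a 29 February, the lower bound for 1 March falls on a different calendar day than in 2007, which is why the bounds are computed per year rather than per month-day label.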

Isn't it strange to see periodicity showing up here?

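Does the final plot show the mean of all the moving averages? For example, for a date such as 2008-02-07 you average over 31 days (15 days before, 15 days after). You do the same for 2007-02-07 and 2006-02-07, and so on for that day of the year (7 February) in every year. Then you average all of those values to get the final value for 02-07? Is that what you did in your answer?

In a way yes, except that it is not a mean of means but a mean taken directly over the 15 * 31 = 465 days. There are 15 time windows of 31 days: 15 days before and 15 days after 2008-02-07, likewise for 2007-02-07 and 2006-02-07, and so on. meanday("02-07") returns the mean of the open variable within those 15 time windows.

I guess these are the final values:

> head(meandays)
   01-02    01-03    01-04    01-05    01-06    01-07
27.78185 29.25146 32.05867 34.79089 33.76463 32.12979

Is it possible to split this into two columns named date and value? One last thing: I need to subtract this mean from all the original values, e.g. every 7 February minus the 02-07 value, every 8 February minus the 02-08 value, and so on.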
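The difference between a mean of means and a mean taken directly over all values only matters when the windows hold different numbers of observations, which is exactly what happens with missing days. A small sketch with made-up numbers (the variable names are hypothetical):

```r
# Two toy windows with unequal numbers of observations,
# as happens when some days are missing (values are made up)
window1 <- c(10, 12, 14)   # 3 observations
window2 <- c(20)           # 1 observation

# Mean of the per-window means: each window gets equal weight
meanofmeans <- mean(c(mean(window1), mean(window2)))  # 16

# Pooled mean: each observation gets equal weight
pooledmean <- mean(c(window1, window2))               # 14
```

The answer's meanday function corresponds to the pooled version, so a year with many gaps contributes less to the result than a complete year.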

meanday <- function(givenday, dtf){
    # Lower bound (given day minus 15 days) in every year where the day is present.
    # The lower bound differs between years, for example for 1 March
    # in years that have a 29 February.
    lowerbound <- dtf$minus15[dtf$monthday == givenday]
    # Build the 31 days around the given day in each year:
    # the lower bound plus 0 to 30 days
    filterdates <- lapply(lowerbound, function(x) x + ddays(0:30))
    filterdates <- Reduce(c, filterdates)
    # Keep only the rows whose date falls in one of these windows;
    # dates absent from the data simply do not contribute
    dtfgivenday <- dtf %>%
        filter(date %in% filterdates)
    # na.rm = TRUE also skips rows where open itself is NA
    return(mean(dtfgivenday$open, na.rm = TRUE))
}

meandays <- sapply(calendardays$monthday, meanday, dtf)
calendardays <- calendardays %>% 
    mutate(mean = meandays,
           plotdate = ymd(paste(2008,monthday,sep="-")))
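With gaps in the data, fewer than 31 observations per year actually enter each mean, and a year contributes nothing for a calendar day on which it has no row, since the lower bounds are taken only from rows that exist. The following sketch counts the contributing observations on synthetic data. The weekday-only series and the helper `windowcount` are assumptions made for illustration, not part of the original answer.

```r
# Synthetic daily dates over three years with weekends removed,
# standing in for the gaps in the real data (an assumption)
dates <- seq(as.Date("2005-01-01"), as.Date("2007-12-31"), by = "day")
dates <- dates[!(format(dates, "%u") %in% c("6", "7"))]
toy <- data.frame(date = dates, monthday = format(dates, "%m-%d"))

# Hypothetical helper: number of rows falling in the pooled
# 31-day windows around `givenday`
windowcount <- function(givenday, d) {
    centres <- d$date[d$monthday == givenday]
    windowdates <- unlist(lapply(centres, function(x) as.numeric(x) + (-15:15)))
    sum(as.numeric(d$date) %in% windowdates)
}

n15 <- windowcount("06-15", toy)  # 15 June is a weekday in all three years
n16 <- windowcount("06-16", toy)  # 16 June 2007 is a Saturday: only two windows
```

Checking these counts alongside the means gives a sense of how much data stands behind each of the 365 values.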

plot(dtf$date,dtf$open,type="l")
library(ggplot2)
ggplot(dtf, aes(x=date,y=open, color = as.factor(year))) + geom_line()
ggplot(dtf, aes(x=plotdate,y=open, color = as.factor(year))) + geom_line()
ggplot(calendardays, aes(x=plotdate, y=mean)) + geom_line()
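The follow-up asked in the comments (two columns named date and value, then subtracting the calendar-day mean from every original value) could be sketched as follows. The toy `meandays` and `dtf` below use made-up numbers and only mimic the shape of the objects in the answer; `meantable` and `anomaly` are hypothetical names.

```r
# Toy stand-ins for the objects in the answer (values are made up)
meandays <- c("02-07" = 30, "02-08" = 32)
dtf <- data.frame(
    date = as.Date(c("2007-02-07", "2008-02-07", "2007-02-08", "2008-02-08")),
    open = c(28, 33, 31, 35)
)
dtf$monthday <- format(dtf$date, "%m-%d")

# Named vector -> two-column data frame
meantable <- data.frame(date = names(meandays),
                        value = unname(meandays),
                        stringsAsFactors = FALSE)

# Look up each row's calendar-day mean by name and subtract it
dtf$anomaly <- dtf$open - unname(meandays[dtf$monthday])
```

Indexing the named vector by `monthday` relies on `sapply` having kept the month-day strings as names, which matches the `head(meandays)` output quoted in the comments.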