Converting a 10-minute moving average to a 1-hour moving average in R


I have a set of weather data that is a 10-minute moving average, reported at 1-minute intervals. I want to convert it to a 1-hour average.

               Date   Direction   Speed
1  2017-07-06 00:01:00        93   7.3
2  2017-07-06 00:02:00        92   7.4
3  2017-07-06 00:03:00        92   7.3
4  2017-07-06 00:04:00        91   7.4
5  2017-07-06 00:05:00        91   7.3
6  2017-07-06 00:06:00        91   7.3
7  2017-07-06 00:07:00        91   7.2
8  2017-07-06 00:08:00        90   7.1
9  2017-07-06 00:09:00        90   6.9
10 2017-07-06 00:10:00        91   6.7
...
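(thousands of rows of data at 1-minute intervals)

*The Direction and Speed values above are 10-minute moving averages.

The usual built-in moving-average functions use every neighbouring value, for example: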

rollmean(timeLine$Speed, 60, fill=FALSE, align = "right")

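which uses the values at n, n-1, n-2, n-3, ..., n-59.

However, since my raw data are already 10-minute averages, I only need the values at n, n-10, n-20, n-30, n-40 and n-50 to convert them to an hourly average.

For example, if I want the hourly value for 2001-07-06 10:00:00, I only need to average the following: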

2001-07-06 10:00:00
2001-07-06 09:50:00
2001-07-06 09:40:00
2001-07-06 09:30:00
2001-07-06 09:20:00
2001-07-06 09:10:00
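
The reasoning above can be checked numerically. The sketch below uses made-up 1-minute values (raw is an assumption, not the asker's data) and a hypothetical helper, trailing_mean(), built on base R's stats::filter. It compares the mean of the six 10-minute averages at n, n-10, ..., n-50 with the 60-minute trailing mean of the raw values.

```r
set.seed(1)
raw <- runif(180, 5, 10)  # hypothetical raw 1-minute wind speeds

# Trailing (right-aligned) moving average over the last k values;
# the first k - 1 results are NA.
trailing_mean <- function(x, k) {
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))
}

ma10 <- trailing_mean(raw, 10)  # what the data set already contains
ma60 <- trailing_mean(raw, 60)  # what the asker wants

n <- 120                                  # any row with at least 59 predecessors
picked <- ma10[n - seq(0, 50, by = 10)]   # rows n, n-10, ..., n-50
hourly_from_ma10 <- mean(picked)

# The six 10-minute windows do not overlap and together cover rows n-59 ... n,
# so their mean equals the 60-minute mean up to floating-point error.
isTRUE(all.equal(hourly_from_ma10, ma60[n]))
```

This only holds when the 10-minute averages are trailing averages of equally spaced 1-minute values, which is what the question describes.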

Is there any way to compute this smoothly in R?

Thanks in advance for your help.

Update 1: here is dput(head(timeLine, 10))

Well, there is almost certainly a more elegant way, but I think this works. I used the lubridate package to make the conversion to a datetime format easy:

library(tidyverse)
library(lubridate)

df = read.csv(text="
              Date,Time,Direction,Speed
              2001-07-04,09:01:00,310,4.0
              2001-07-04,09:02:00,310,3.9
              2001-07-04,09:03:00,310,3.9
              2001-07-04,09:04:00,310,3.9
              2001-07-04,09:05:00,300,3.9
              2001-07-04,09:06:00,300,4.0
              2001-07-04,09:07:00,300,3.9
              2001-07-04,09:08:00,300,4.0
              2001-07-04,09:09:00,300,4.0
              2001-07-04,09:10:00,300,4.0
              2001-07-04,09:11:00,290,4.0
              2001-07-04,09:12:00,290,4.0
              2001-07-04,09:13:00,290,4.0
              2001-07-04,09:14:00,290,4.0
              2001-07-04,09:15:00,290,4.0", sep=",", header = TRUE, row.names = NULL)

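# Average the current value with the values 10, 20, 30, 40 and 50 rows earlier.
# Rows are 1 minute apart, so these are the six 10-minute means that make up one hour.
lagged_avg = function(col) {
  lag_positions = c(0, 10, 20, 30, 40, 50)
  total = 0  # named `total` to avoid masking base::sum
  for (n in lag_positions) {
    total = total + dplyr::lag(col, n)
  }
  return(total / length(lag_positions))
}

# The first 50 rows come out as NA because not every lagged value exists yet,
# so the 15-row sample above yields only NA; use at least 51 rows to see values.
df = df %>%
  mutate(datetime = ymd_hms(paste0(Date, " ", Time))) %>%
  mutate(lag = lagged_avg(Speed)) %>%
  select(-Date, -Time)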

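I would check out the tibbletime package; specifically, its collapse_by() function is useful. The following should work (it is easier to test with more data):

Note: depending on how you define an hour, you may want to change the collapse_by line to collapse_by("hour", clean = TRUE, side = "start"). By default it uses side = "end".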
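
To make the start/end distinction concrete, here is a small base-R illustration of the two labelling conventions. It is not tibbletime's implementation, and how collapse_by() treats a timestamp that falls exactly on the hour should be checked against its documentation. The helper to_time() and the UTC time zone are assumptions made for the sketch.

```r
# Six 10-minute marks forming the window 09:10 ... 10:00 (UTC chosen for the sketch)
marks <- as.POSIXct("2001-07-06 09:10:00", tz = "UTC") + 600 * (0:5)
secs <- as.numeric(marks)

# Hypothetical helper: seconds since the epoch back to POSIXct
to_time <- function(s) as.POSIXct(s, origin = "1970-01-01", tz = "UTC")

# Label each timestamp by the start of its hour (round down) ...
start_label <- to_time(floor(secs / 3600) * 3600)
# ... or by the end of its hour (round up)
end_label <- to_time(ceiling(secs / 3600) * 3600)

data.frame(marks, start_label, end_label)
```

With end labels all six marks share the label 10:00, which matches the window the question asks for; with start labels the 10:00 mark lands in a different group from 09:10 to 09:50.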

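The solution is to first filter the data down to minutes 0, 10, 20, 30, 40 and 50. To do so, divide the minute of the date/time by 10 and check whether the remainder is 0. Then apply zoo::rollmean over every 6 observations, so that each hourly average is computed from the values at minutes 10, 20, 30, 40, 50 and 0. Finally, filter for minute == 0 (the top of the hour).

Data: the OP provided only 10 minutes of data, which is not enough to compute an hourly average, so I extended the data to 3 hours: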
timeLine <- structure(list(Date = structure(c(1499270460, 1499270520, 1499270580, 
1499270640, 1499270700, 1499270760, 1499270820, 1499270880, 1499270940, 1499271000), 
class = c("POSIXct", "POSIXt"), tzone = "Asia/Hong_Kong"), 
Direction = c(93L, 92L, 92L, 91L, 91L, 91L, 91L, 90L, 90L, 91L), 
Speed = c(7.3, 7.4, 7.3, 7.4, 7.3, 7.3, 7.2, 7.1, 6.9, 6.7)), 
.Names = c("Date", "Direction", "Speed"), row.names = c(NA, 10L), 
class = "data.frame")

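library(tidyverse)  # provides %>% and tidyr::complete()

# Extend the data to cover 3 hours at 1-minute steps
timeLine_mod <- timeLine %>% complete(Date = seq(min(Date),
         min(Date)+60*60*3-60,by="1 min"))

# Repeat the 10 original Direction and Speed values to fill all rows
timeLine_mod$Direction <- rep(timeLine$Direction, length.out = nrow(timeLine_mod))
timeLine_mod$Speed <- rep(timeLine$Speed, length.out = nrow(timeLine_mod))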

rollapply in zoo allows the offsets to be specified with width = list(vector of offsets), like this:
transform(timeLine, avg = rollapplyr(Speed, list(seq(-50, 0, 10)), mean, fill = NA))
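
For comparison, the same six offsets can also be written as a weight vector for base R's stats::filter. This is a sketch of an alternative, not the answerer's method; x is a hypothetical stand-in for timeLine$Speed, and w is a name introduced here.

```r
x <- as.numeric(1:120)  # hypothetical stand-in for timeLine$Speed

# Weights for lags 0..50: 1/6 at lags 0, 10, 20, 30, 40, 50 and 0 elsewhere.
# With sides = 1, w[1] applies to the current value and w[51] to the value 50 rows back.
w <- numeric(51)
w[seq(1, 51, by = 10)] <- 1 / 6

avg <- as.numeric(stats::filter(x, w, sides = 1))

# The first 50 entries are NA because the lag-50 value does not exist yet
head(avg, 52)
```

Each non-NA entry of avg is the mean of the values at rows n, n-10, ..., n-50, the same quantity the offset list seq(-50, 0, 10) selects.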

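You should post the output of dput(timeLine), since it is a data.table object; at least it prints like one. Posting the printed representation makes parsing it correctly a PITA. Even the very flexible fread function gives 5 columns when you clearly have only three. It is unfortunate that the default print output of POSIXt columns contains spaces.

Thanks for the suggestion. The output of dput(head(timeLine, 10)) is structure(list(Date = structure(c(1499270460, 1499270520, 1499270580, 1499270640, 1499270700, 1499270760, 1499270820, 1499270880, 1499270940, 1499271000), class = c("POSIXct", "POSIXt"), tzone = "Asia/Hong_Kong"), Direction = c(93L, 92L, 92L, 91L, 91L, 91L, 91L, 90L, 90L, 91L), Speed = c(7.3, 7.4, 7.3, 7.4, 7.3, 7.3, 7.2, 7.1, 6.9, 6.7)), .Names = c("Date", "Direction", "Speed"), row.names = c(NA, 10L), class = "data.frame")

It will become a rolling hourly average at hourly intervals, but not at 1-minute intervals.

@TLee I think you mentioned that the data you receive are already 10-minute moving averages, reported once per minute. Perhaps that is what you wrote in the first line of the question. If they are not, first compute the 10-minute moving average and then apply the solution I described.

Maybe you can do it in a single pipe. Nested and simple!

Has align = "right" become a dummy argument in your statement?

Thanks. rollapplyr, with the r at the end, uses align = "right".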
library(zoo)
library(lubridate)
library(tidyverse)

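# Keep the 10-minute marks, take a right-aligned mean over 6 of them,
# then keep only the value at the top of each hour
timeLine_mod %>% filter(minute(Date) %% 10 == 0) %>%
mutate(meanSpeed = rollmean(Speed, 6, fill = NA, align = "right")) %>%
filter(minute(Date) == 0)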

#                  Date Direction Speed meanSpeed
# 1 2017-07-06 01:00:00        91   6.7       6.7
# 2 2017-07-06 02:00:00        91   6.7       6.7
# 3 2017-07-06 03:00:00        91   6.7       6.7
