R - Get hourly averages from non-constant frequency data acquisition

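I have collected a large amount of meteorological data from a HOBO station. The station has its own software, but post-processing is difficult there. So I simply put all the information into a decent data.frame, and now I have been reading up on how to get hourly results. I have already tried the plyr and lubridate packages, without success. I am a novice programmer in the R world and usually build my code from snippets found around the internet.

So far, I have the following: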

    data<-read.csv("file.txt",header=TRUE,sep=";",dec=".",stringsAsFactors=FALSE)
    data<-data[-1,]

    data$TIMESTAMP <- strptime(data$TIMESTAMP, format="%d-%m-%y %H:%M", tz=Sys.timezone(location=TRUE))

    data$Vel_VIENTO<-as.numeric(as.character(data$Vel_VIENTO))
    data$Vel_RAFAGAS <-as.numeric(as.character(data$Vel_RAFAGAS))
    data$Temp_Amb <-as.numeric(as.character(data$Temp_Amb))

    data$HR <-as.numeric(as.character(data$HR))
    data$Temp_Agua <-as.numeric(as.character(data$Temp_Agua))
    data$Presion <-as.numeric(as.character(data$Presion))

    data$Radiacion <-as.numeric(as.character(data$Radiacion))
    data$Dir_VIENTO <-as.numeric(as.character(data$Dir_VIENTO))

      REGISTRO               FECHA Vel_VIENTO Vel_RAFAGAS Temp_Amb   HR Temp_Agua
    2        1 2015-01-08 15:03:00       6.30        7.55   20.579 58.5    23.472
    3        2 2015-01-08 15:18:00       6.55        9.07   20.412 57.5    22.609
    4        3 2015-01-08 15:33:00       6.80        8.56   21.413 54.7    23.761
    5        4 2015-01-08 15:48:00       6.30        8.31   20.222 59.5    22.705
    6        5 2015-01-08 16:03:00       6.55        8.31   20.246 58.6    22.298
    7        6 2015-01-08 16:18:00       7.30        9.57   19.008 63.5    21.366
      Presion Radiacion Dir_VIENTO
    2  906.55        NA         NA
    3  906.15        NA         NA
    4  905.95        NA         NA
    5  906.05        NA      202.2
    6  906.05     966.9      210.6
    7  905.75     919.4      227.4

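So each column holds one parameter, with a timestamp in data$TIMESTAMP. The readings in the data.frame arrive at intervals ranging from every 15 minutes to every 30 minutes. I would like to get the same table, containing the hourly average of every parameter in the data.frame.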

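First of all, it would be great if you could (in the future) include a code snippet that lets us give you a reproducible solution.

As a possible solution, I suggest looking at the dplyr package: set up separate columns for date, time, and hour, then group everything by date and hour to compute the hourly averages: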

library(stringr)
library(dplyr)

# Sample timestamps and water temperatures taken from the question
FECHA <- c("2015-01-08 15:03:00", "2015-01-08 15:18:00", "2015-01-08 15:33:00",
           "2015-01-08 15:48:00", "2015-01-08 16:03:00", "2015-01-08 16:18:00")
Temp_Aqua <- c("23.472", "22.609", "23.761", "22.705", "22.298", "21.366")

# Split each timestamp into a date part and a time part
date_time <- matrix(unlist(str_split(FECHA, " ")), ncol = 2, byrow = TRUE)
x <- as.data.frame(cbind(date_time, Temp_Aqua), stringsAsFactors = FALSE)
names(x) <- c("date", "time", "temp_aqua")
x$temp_aqua <- as.numeric(x$temp_aqua)

# Keep the two leading digits of the time as the hour
x$hour <- str_extract(x$time, "^[0-9]{2}")

# Average within each date/hour group
x %>% group_by(date, hour) %>% summarise(hourly_temp_aqua = mean(temp_aqua))

Source: local data frame [2 x 3]
Groups: date [?]

        date  hour hourly_temp_aqua
       <chr> <chr>            <dbl>
1 2015-01-08    15         23.13675
2 2015-01-08    16         21.83200
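
The question asks for the hourly mean of every parameter, not just one. A minimal sketch of how the same date/hour grouping could be extended to all numeric columns follows. It assumes dplyr 1.0.0 or later (for `across()` and `.groups`), and the `weather` data frame is a hypothetical subset built from the rows printed in the question. `na.rm = TRUE` is also an assumption about how missing readings should be treated.

```r
library(dplyr)

# Hypothetical sample: a few columns copied from the rows shown in the question
weather <- data.frame(
  FECHA = as.POSIXct(c("2015-01-08 15:03:00", "2015-01-08 15:18:00",
                       "2015-01-08 15:33:00", "2015-01-08 15:48:00",
                       "2015-01-08 16:03:00", "2015-01-08 16:18:00"),
                     tz = "UTC"),
  Vel_VIENTO = c(6.30, 6.55, 6.80, 6.30, 6.55, 7.30),
  Temp_Agua  = c(23.472, 22.609, 23.761, 22.705, 22.298, 21.366),
  Radiacion  = c(NA, NA, NA, NA, 966.9, 919.4)
)

hourly <- weather %>%
  # Derive the grouping keys directly from the parsed timestamp
  mutate(date = format(FECHA, "%Y-%m-%d"),
         hour = format(FECHA, "%H")) %>%
  group_by(date, hour) %>%
  # Average every numeric column; NA readings are ignored within each hour
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)),
            .groups = "drop")

hourly
```

An hour in which a column holds only `NA` values (here `Radiacion` at 15h) comes out as `NaN`, so such cells may need separate handling.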

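Here is a function I have used in a similar application. The main things to note are that you should use trunc rather than round, and that the datetime has to be converted to POSIXct for dplyr, because trunc returns POSIXlt.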

library(lubridate)
library(dplyr)

hourly_ave <- function(timeseries_data) {

  # Convert the "FECHA" column into datetime
  # (dmy_hm matches the "%d-%m-%y %H:%M" format used in the question)
  timeseries_data$FECHA <- dmy_hm(timeseries_data$FECHA)

  # Add an Hourly column (use trunc instead of round)
  # Remember the as.POSIXct(), since trunc() returns POSIXlt, which dplyr does not support
  timeseries_data$Hourly <- trunc(timeseries_data$FECHA, "hours") %>% as.POSIXct()

  # Then group the data and summarize using dplyr
  # I did not include all the variables, but you should get the idea
  data_hr <- timeseries_data %>%
    group_by(Hourly) %>%
    summarize(Vel_RAFAGAS = mean(Vel_RAFAGAS), Temp_Amb = mean(Temp_Amb),
              HR = mean(HR), Temp_Agua = mean(Temp_Agua))

  data_hr
}
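
To illustrate why trunc is preferred over round here, the following base R sketch compares the two on a pair of timestamps from the question. The variable names (`stamps`, `truncated`, `rounded`) and the UTC time zone are chosen only for this illustration.

```r
# Two readings from the question, both taken during the 15:00 hour
stamps <- as.POSIXct(c("2015-01-08 15:18:00", "2015-01-08 15:48:00"), tz = "UTC")

# trunc() drops the minutes, so both readings stay in the 15:00 bin
truncated <- as.POSIXct(trunc(stamps, "hours"))

# round() goes to the nearest hour, which pushes 15:48 into the 16:00 bin
rounded <- as.POSIXct(round(stamps, "hours"))

format(truncated, "%H:%M")  # "15:00" "15:00"
format(rounded, "%H:%M")    # "15:00" "16:00"

# Both functions return POSIXlt, hence the as.POSIXct() conversion
class(trunc(stamps, "hours"))  # "POSIXlt" "POSIXt"
```

With round, readings from the second half of each hour would be averaged into the following hour, which is usually not what an hourly mean is meant to represent.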