How to summarize the annual average temperature from all given daily observation time series using dplyr? (R, dplyr, data manipulation)


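I wonder whether dplyr provides any useful utilities for quick data aggregation of surface temperature time series. I extracted gridded data for Germany from the E-OBS dataset and exported the extracted raster grid as tabular data in Excel format. In the exported data, each pair of geographic coordinates has 15 years of temperature observations (1012 rows, 15x365/366 columns). Please take a look at the data on the fly.

Here is what I want to do: for the data on the fly, I want to aggregate by year, because the original observations are daily. In particular, for each pair of geographic coordinates I intend to compute the average temperature of each year, for all 15 years. More specifically, once the aggregation is done, I want to put the result in a new data.frame that keeps the original geographic coordinate pairs but adds new columns such as 1980_avg_temp, 1981_avg_temp, 1982_avg_temp, etc. So I want to reduce the data dimension column-wise by introducing new aggregated columns that hold the annual average temperature.

How can I do this with dplyr or data.table for Excel data? Is there an easier way to perform this aggregation on the data attached on the fly? Any ideas?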

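I tried:

library(tidyverse)
library(readxl)
df <- read_excel("YOUR_XLSX_FILE")

df %>% 
  # wide to long: one row per coordinate pair and daily column
  gather(date, temp, -x, -y) %>% 
  # split the column name into its year, month and day parts
  separate(date, c("year", "month", "day")) %>% 
  # drop the leading "X" from the year part
  separate(year, c("trash", "year"), sep = "X") %>% 
  select(-trash) %>% 
  # annual mean per coordinate pair
  group_by(year, x, y) %>% 
  summarise(avg_temp=mean(temp)) %>% 
  # long to wide: one column per year
  spread(year, avg_temp)

The output is: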

# A tibble: 19 x 17
# Groups: x [11]
       x     y `1980` `1981` `1982` `1983` `1984` `1985` `1986` `1987` `1988` `1989` `1990` `1991`
 * <dbl> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
 1  8.88  54.4   7.79   8.02   8.76   9.20   8.32   7.51   7.88   7.43   9.20   9.63   9.76   8.55
 2  8.88  54.9   7.54   7.61   8.41   8.84   8.15   7.15   7.53   7.15   8.97   9.51   9.55   8.42
 3  9.12  54.4   7.65   7.86   8.62   9.05   8.17   7.34   7.70   7.28   9.01   9.46   9.60   8.37
 4  9.12  54.6   7.44   7.59   8.38   8.81   8.02   7.11   7.50   7.13   8.88   9.36   9.47   8.31
 5  9.12  54.9   7.33   7.36   8.25   8.67   8.02   7.05   7.49   7.10   8.91   9.48   9.55   8.41
 6  9.38  54.4   7.69   7.91   8.61   9.02   8.15   7.31   7.69   7.24   8.98   9.49   9.64   8.35
 7  9.38  54.6   7.45   7.62   8.46   8.85   8.05   7.16   7.59   7.18   8.92   9.48   9.61   8.41
 8  9.38  54.9   7.24   7.29   8.21   8.62   7.95   7.04   7.56   7.15   8.94   9.57   9.66   8.53
 9  9.62  54.4   7.65   7.90   8.60   9.01   8.14   7.24   7.64   7.16   8.93   9.52   9.65   8.33
10  9.62  54.6   7.39   7.60   8.45   8.82   8.01   7.10   7.56   7.12   8.86   9.46   9.55   8.34
11  9.62  54.9   7.28   7.38   8.28   8.69   7.98   7.07   7.61   7.18   8.96   9.60   9.68   8.54
12  9.88  54.4   7.70   8.00   8.69   9.14   8.23   7.36   7.76   7.23   9.03   9.63   9.73   8.41
13  9.88  54.6   7.40   7.65   8.46   8.87   8.05   7.11   7.58   7.12   8.87   9.47   9.50   8.30
14 10.1   54.4   7.76   8.12   8.78   9.21   8.30   7.49   7.90   7.34   9.08   9.69   9.79   8.52
15 10.4   54.4   7.66   8.09   8.70   9.17   8.23   7.41   7.87   7.29   9.03   9.70   9.82   8.60
16 11.1   54.9   7.61   8.14   8.74   9.14   8.33   7.32   7.92   7.22   9.17   9.93  10.1    8.86
17 11.4   54.9   7.59   8.17   8.74   9.14   8.32   7.29   7.92   7.20   9.17   9.95  10.1    8.87
18 11.9   54.9   7.54   8.15   8.71   9.10   8.28   7.19   7.85   7.15   9.10   9.92  10.1    8.84
19 12.1   54.9   7.52   8.12   8.69   9.08   8.27   7.12   7.80   7.11   9.05   9.91  10.0    8.82
# ... with 3 more variables: `1992` <dbl>, `1993` <dbl>, `1994` <dbl>

#       x      y     1980     1981     1982     1983     1984     1985     1986     1987     1988
# 1 8.875 54.375 7.792978 8.021342 8.762274 9.203424 8.317131 7.505370 7.879068 7.427260 9.197431
# 2 8.875 54.875 7.536229 7.607507 8.414877 8.841260 8.154945 7.151890 7.532164 7.147945 8.969781
# 3 9.125 54.375 7.651393 7.862466 8.620904 9.052630 8.169262 7.337589 7.701205 7.282657 9.014590
# 4 9.125 54.625 7.435983 7.590548 8.381753 8.808904 8.019399 7.109096 7.499589 7.127370 8.875656
# 5 9.125 54.875 7.332978 7.363370 8.247205 8.669370 8.024645 7.045425 7.487424 7.098849 8.911776
# 6 9.375 54.375 7.693907 7.914630 8.612438 9.022055 8.150164 7.305068 7.688164 7.242274 8.984207
#       1989     1990     1991     1992     1993     1994
# 1 9.625781 9.760931 8.550356 9.678907 8.208109 9.390904
# 2 9.513863 9.552767 8.420109 9.425328 8.010082 9.134466
# 3 9.462959 9.602876 8.374575 9.465164 8.052794 9.207041
# 4 9.358986 9.473178 8.305863 9.353743 7.935507 9.050109
# 5 9.478192 9.545781 8.412329 9.403005 7.998877 9.074740
# 6 9.493205 9.635561 8.352740 9.385819 8.017260 9.184959
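The tibble printout above shows x and y as 8.88 and 54.4, while the plain data-frame printout shows 8.875 and 54.375. As far as I can tell this is only a display difference: tibbles print about three significant digits by default, and the stored values are unchanged. A minimal sketch, assuming a made-up two-row tibble named res (the name is hypothetical, the values are copied from the first two rows above):

```r
library(tibble)

# Hypothetical two-row result; values copied from the output above
res <- tibble(
  x = c(8.875, 8.875),
  y = c(54.375, 54.875),
  `1980` = c(7.792978, 7.536229)
)

# The stored coordinates are exact, whatever the tibble printout shows
res$x

# Either convert to a plain data.frame for printing ...
print(as.data.frame(res))

# ... or raise the number of significant digits used when printing tibbles
options(pillar.sigfig = 7)
print(res)
```

Converting with as.data.frame() only changes how the table is printed here; the underlying numbers are the same in both objects.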

This works for the data you provided:

library(tidyverse)
library(lubridate)
demo_data %>%
  gather(date, temp, -x, -y) %>%
  mutate(date = ymd(str_remove(date, "X"))) %>%
  mutate(year = year(date)) %>%
  group_by(x, y, year) %>%
  summarise_at(vars(temp), mean, na.rm = TRUE) %>%
  spread(year, temp)

# # A tibble: 19 x 17
# # Groups:   x, y [19]
#        x     y `1980` `1981` `1982` `1983` `1984` `1985` `1986` `1987` `1988`
#    <dbl> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#  1  8.88  54.4   7.79   8.02   8.76   9.20   8.32   7.51   7.88   7.43   9.20
#  2  8.88  54.9   7.54   7.61   8.41   8.84   8.15   7.15   7.53   7.15   8.97
#  3  9.12  54.4   7.65   7.86   8.62   9.05   8.17   7.34   7.70   7.28   9.01
#  4  9.12  54.6   7.44   7.59   8.38   8.81   8.02   7.11   7.50   7.13   8.88
#  5  9.12  54.9   7.33   7.36   8.25   8.67   8.02   7.05   7.49   7.10   8.91
#  6  9.38  54.4   7.69   7.91   8.61   9.02   8.15   7.31   7.69   7.24   8.98
#  7  9.38  54.6   7.45   7.62   8.46   8.85   8.05   7.16   7.59   7.18   8.92
#  8  9.38  54.9   7.24   7.29   8.21   8.62   7.95   7.04   7.56   7.15   8.94
#  9  9.62  54.4   7.65   7.90   8.60   9.01   8.14   7.24   7.64   7.16   8.93
# 10  9.62  54.6   7.39   7.60   8.45   8.82   8.01   7.10   7.56   7.12   8.86
# 11  9.62  54.9   7.28   7.38   8.28   8.69   7.98   7.07   7.61   7.18   8.96
# 12  9.88  54.4   7.70   8.00   8.69   9.14   8.23   7.36   7.76   7.23   9.03
# 13  9.88  54.6   7.40   7.65   8.46   8.87   8.05   7.11   7.58   7.12   8.87
# 14 10.1   54.4   7.76   8.12   8.78   9.21   8.30   7.49   7.90   7.34   9.08
# 15 10.4   54.4   7.66   8.09   8.70   9.17   8.23   7.41   7.87   7.29   9.03
# 16 11.1   54.9   7.61   8.14   8.74   9.14   8.33   7.32   7.92   7.22   9.17
# 17 11.4   54.9   7.59   8.17   8.74   9.14   8.32   7.29   7.92   7.20   9.17
# 18 11.9   54.9   7.54   8.15   8.71   9.10   8.28   7.19   7.85   7.15   9.10
# 19 12.1   54.9   7.52   8.12   8.69   9.08   8.27   7.12   7.80   7.11   9.05
# # ... with 6 more variables: `1989` <dbl>, `1990` <dbl>, `1991` <dbl>,
# #   `1992` <dbl>, `1993` <dbl>, `1994` <dbl>
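The question asked for columns named like 1980_avg_temp rather than bare years. The following is a sketch of the same reshape, aggregate, reshape idea using pivot_longer() and pivot_wider(), which tidyr now recommends over gather() and spread(). It is not the answerer's code. The miniature demo_data below is invented for illustration, and the column-name pattern "X1980.01.01" is an assumption inferred from the str_remove(date, "X") and ymd() calls above.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical miniature of the wide table: two grid cells, four daily columns.
# Column names are assumed to look like "X1980.01.01".
demo_data <- tibble::tibble(
  x = c(8.875, 9.125),
  y = c(54.375, 54.375),
  X1980.01.01 = c(1, 2),
  X1980.01.02 = c(3, 4),
  X1981.01.01 = c(5, NA),
  X1981.01.02 = c(7, 8)
)

annual <- demo_data %>%
  # wide -> long: one row per grid cell and day; strip the "X" prefix
  pivot_longer(-c(x, y), names_to = "date", values_to = "temp",
               names_prefix = "X") %>%
  # parse the date string and keep only its year
  mutate(year = year(ymd(date))) %>%
  group_by(x, y, year) %>%
  # annual mean per grid cell, ignoring missing days
  summarise(avg_temp = mean(temp, na.rm = TRUE), .groups = "drop") %>%
  # long -> wide: one column per year, named like "1980_avg_temp"
  pivot_wider(names_from = year, values_from = avg_temp,
              names_glue = "{year}_avg_temp")

annual
```

Because the year is derived from the parsed date, leap years with 366 columns need no special handling: each year's mean simply uses however many daily columns that year has.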

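What have you tried so far?

@hpesoj626 I looked at dplyr::summary, but the number of days varies slightly from year to year, and I did not get clean output. Any ideas?

@hpesoj626 Why are the geographic coordinate pairs x, y rounded? I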