Problems with R spatio-temporal packages

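I want to run a monthly spatio-temporal analysis of PM10 for the German counties and plot the results as maps. Later I want to analyse different regression models. However, I cannot create a spatio-temporal object, which I need for the further analysis and other investigations. I started by reading up on the methods and packages, and as far as I can tell I am stuck at creating a proper spatio-temporal object.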

I am using the following reproducible code as a guide (source: ):

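My df has no geographic reference at county level, only station codes, so I added that information to the data set. The county ID in my sp file is CC_2, a five-digit code that starts with 0 when the ID itself has only four digits. For example: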

de$CC_2
  [1] "08425" "08211" "08426" "08115" "12065" "12066" "12067"
I guess the first problem is that, after adding the geographic information to my df via the station codes, CC_2 ends up in my df like this:

> de
class       : SpatialPolygonsDataFrame 
features    : 403 
extent      : 5.866251, 15.04181, 47.27012, 55.05653  (xmin, xmax, ymin, ymax)
crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
variables   : 13
names       : GID_0,  NAME_0,   GID_1,            NAME_1, NL_NAME_1,     GID_2,    NAME_2, VARNAME_2, NL_NAME_2,     TYPE_2,  ENGTYPE_2,  CC_2,   HASC_2 
min values  :   DEU, Germany, DEU.1_1, Baden-Württemberg,        NA, DEU.1.1_1, Ahrweiler,        NA,        NA,      Kreis,   District, 01001, DE.BB.BH 
max values  :   DEU, Germany, DEU.9_1,         Thüringen,        NA, DEU.9.9_1,   Zwickau,        NA,        NA, Water body, Water body, 16077, DE.TH.WR
> PM10_m[sample(nrow(PM10_m),3),]
      Station Komponente      Datum         TYPEOFAREA TYPEOFSTATION       TMW TMW_R TypeOfData Lieferung  CC_2
11448 DEBW081       PM10 2020-06-07 städtisches Gebiet   Hintergrund  6.775362     7          T         M  8212
1566  DEBB066       PM10 2020-04-19  ländlich regional   Hintergrund 11.162500    11          S         M 12061
7174  DEBW027       PM10 2020-03-20 städtisches Gebiet   Hintergrund 34.791667    35          S         M  8415
As you can see, the leading 0 of the four-digit IDs is missing, so I checked the structure of both variables:

str(PM10_m$CC_2)
 chr [1:47350] "12062" "12062" "12062" "12062" "12062" "12062" "12062" "12062" "12062" "12062" "12062" "12062" "12062" ...


str(de$CC_2)
 chr [1:403] "08425" "08211" "08426" "08115" NA "08435" "08315" "08235" "08316" "08236" "08116" "08311" "08237" "08117" ...
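Before changing any types, it can help to quantify the mismatch. The following is only a minimal sketch with made-up vectors: `station_ids` and `polygon_ids` are hypothetical stand-ins for `PM10_m$CC_2` and `de$CC_2`. It tabulates the ID lengths and lists the IDs that have no counterpart in the polygon table.

```r
# Hypothetical stand-ins for PM10_m$CC_2 and de$CC_2 (not the real data)
station_ids <- c("8212", "12061", "8415", "12061")
polygon_ids <- c("08212", "12061", "08415", NA)

# How many IDs have 4 characters and how many have 5?
id_lengths <- table(nchar(station_ids))

# Which station-side IDs do not occur in the polygon table at all?
unmatched <- unique(station_ids[!station_ids %in% polygon_ids])

print(id_lengths)
print(unmatched)
```

If every unmatched ID has four characters, the lost leading zero is the likely cause of the failed matches.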
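So both are chr, but when I match them, every four-digit ID fails to match! I worked around this by converting both variables to numeric. At this point I am not sure whether that is the right way to do it.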

> PM10_m$CC_2<-as.numeric(PM10_m$CC_2)
> de$CC_2.2<-as.numeric(de$CC_2)
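An alternative to converting both columns to numeric is to restore the leading zero, so that the IDs stay character codes. This is a minimal sketch, assuming all IDs are purely numeric strings of at most five digits; the two vectors are hypothetical examples, not the real data.

```r
# Hypothetical IDs: the station side has lost its leading zero
station_ids <- c("8212", "12061", "8415")
polygon_ids <- c("08425", "08212", "12061", "08415")

# Left-pad with zeros to a fixed width of five characters
pad_id <- function(x) sprintf("%05d", as.integer(x))
padded <- pad_id(station_ids)

# Position of each station ID in the polygon table (NA would mean no match)
idx <- match(padded, polygon_ids)

print(padded)
print(idx)
```

Keeping the codes as character avoids a second, numeric copy of the key such as `CC_2.2`.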
This differs from the guide, but as far as I understand, that step only creates and sorts the time object. So I continued with the guide:

library(spacetime)

pm10.st = STFDF(de, time, PM10_f[order(PM10_f[4], PM10_f[1]),])
Error in validityMethod(object) : 
  nrow(object@data) == length(object@sp) * nrow(object@time) is not TRUE
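The error message states the condition that failed: the data must have exactly one row per combination of spatial feature and time step. As a rough sketch, the same check can be done by hand before calling STFDF. The objects `region_ids`, `time_steps` and `obs` below are hypothetical toy data.

```r
# Toy example: 4 regions, 3 time steps, but observations for only 2 regions
region_ids <- c("01001", "01002", "01003", "01004")
time_steps <- 1:3
obs <- data.frame(
  CID   = rep(c("01001", "01003"), each = 3),
  month = rep(time_steps, times = 2),
  MMW10 = c(13.3, 11.1, 14.2, 17.8, 16.0, 15.5)
)

# A full space-time grid needs one row per (region, time) combination
expected_rows <- length(region_ids) * length(time_steps)
actual_rows   <- nrow(obs)

# Regions that never appear in the data
missing_regions <- setdiff(region_ids, obs$CID)

print(c(expected = expected_rows, actual = actual_rows))
print(missing_regions)
```

When `actual_rows` is smaller than `expected_rows`, the full-grid requirement from the error message cannot be met without first adding rows for the missing combinations.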
I read that STFDF cannot handle missing geographic points and that I have to use STIDF instead.

This is what I get:

pm10.st = STIDF(de, time, PM10_f[order(PM10_f[4], PM10_f[1]),])

> pm10.st
An object of class "STIDF"
Slot "data":
          date  KRS    TMW10 month month1
1   2020-01-01 1002 33.34608    01      1
183 2020-01-01 1003 81.06596    01      1
365 2020-01-01 1051 53.14400    01      1
547 2020-01-01 1053 34.36517    01      1
729 2020-01-01 1054      NaN    01      1
911 2020-01-01 1057 32.04604    01      1

Slot "sp":
class       : SpatialPolygonsDataFrame 
features    : 6 
extent      : 8.108812, 10.24141, 47.5024, 48.86768  (xmin, xmax, ymin, ymax)
crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
variables   : 14
names       : GID_0,  NAME_0,   GID_1,            NAME_1, NL_NAME_1,     GID_2,          NAME_2, VARNAME_2, NL_NAME_2,     TYPE_2,  ENGTYPE_2,  CC_2,   HASC_2, CC_2.2 
min values  :   DEU, Germany, DEU.1_1, Baden-Württemberg,        NA, DEU.1.1_1, Alb-Donau-Kreis,        NA,        NA,  Landkreis,   District, 08115, DE.BW.AD,   8115 
max values  :   DEU, Germany, DEU.1_1, Baden-Württemberg,        NA, DEU.1.6_1,   Bodenseekreis,        NA,        NA, Water body, Water body, 08435, DE.BW.BR,   8435 

Slot "time":
           timeIndex
0001-01-01         1
0002-01-01         2
0003-01-01         3
0004-01-01         4
0005-01-01         5
0006-01-01         6

Slot "endTime":
[1] "0001-01-01 GMT" "0002-01-01 GMT" "0003-01-01 GMT" "0004-01-01 GMT" "0005-01-01 GMT" "0006-01-01 GMT"
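The time slot above shows dates such as 0001-01-01, which suggests that the `time` argument was not built from the actual 2020 dates. How `time` was created is not shown in the post, so the following is only a sketch of one possible way to build a monthly index, assuming six months from January to June 2020.

```r
# Assumed period: January to June 2020, one time step per month
months <- sprintf("%02d", 1:6)

# Use the first day of each month as the time stamp
time_index <- as.Date(paste0("2020-", months, "-01"))

print(time_index)
```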
I was really surprised to see that this command took only 6 rows from the df and matched them with only 6 features of the polygon df. I can plot this STIDF:

But as you can see, it does not work properly. So I thought I could aggregate by month and county ID:

pm10.f<-aggregate(PM10_f$TMW10, by = list(PM10_f$month, PM10_f$KRS),FUN="mean", na.rm=T)

> str(pm10.f)
'data.frame':   1092 obs. of  3 variables:
 $ month: chr  "01" "02" "03" "04" ...
 $ CID  : num  1002 1002 1002 1002 1002 ...
 $ MMW10: num  13.3 11.1 14.2 16.1 12.4 ...

### CID is the County ID ###

> pm10.f[sample(nrow(pm10.f),5),]
     month   CID     MMW10
234     06  5158 16.637490
704     02  9775 11.083747
1030    04 16055 18.934881
842     02 13054  8.594628
513     03  8121 16.9119
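For illustration, the monthly aggregation can also be written with the formula interface of `aggregate`, which keeps the column names. This is a minimal sketch on invented values; the data frame `daily` is hypothetical and only mimics the columns KRS, month and TMW10. Note that the formula interface drops rows with NA by default.

```r
# Invented daily values for two counties (not the real measurements)
daily <- data.frame(
  KRS   = c(1002, 1002, 1002, 1003, 1003),
  month = c("01", "01", "02", "01", "01"),
  TMW10 = c(10, 20, 30, 5, NA)
)

# Monthly mean per county; rows with NA in TMW10 are omitted by default
monthly <- aggregate(TMW10 ~ month + KRS, data = daily, FUN = mean)
names(monthly) <- c("month", "CID", "MMW10")

print(monthly)
```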
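I ran into the same problem: again only 6 rows were matched with 6 counties. Even when I drop the order command (the second call below), the problem stays the same, with only 6 rows taken from the df and 6 features taken from the polygon df: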

pm10.stf = STIDF(de, time, pm10.f[order(pm10.f[1], pm10.f[1]),])

> pm10.stf
An object of class "STIDF"
Slot "data":
   month  CID    MMW10
1     01 1002 13.31264
7     01 1003 17.81540
13    01 1051 17.67919
19    01 1053 12.99228
25    01 1054      NaN
31    01 1057 14.71878

Slot "sp":
class       : SpatialPolygonsDataFrame 
features    : 6 
extent      : 8.108812, 10.24141, 47.5024, 48.86768  (xmin, xmax, ymin, ymax)
crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
variables   : 14
names       : GID_0,  NAME_0,   GID_1,            NAME_1, NL_NAME_1,     GID_2,          NAME_2, VARNAME_2, NL_NAME_2,     TYPE_2,  ENGTYPE_2,  CC_2,   HASC_2, CC_2.2 
min values  :   DEU, Germany, DEU.1_1, Baden-Württemberg,        NA, DEU.1.1_1, Alb-Donau-Kreis,        NA,        NA,  Landkreis,   District, 08115, DE.BW.AD,   8115 
max values  :   DEU, Germany, DEU.1_1, Baden-Württemberg,        NA, DEU.1.6_1,   Bodenseekreis,        NA,        NA, Water body, Water body, 08435, DE.BW.BR,   8435 

Slot "time":
           timeIndex
0001-01-01         1
0002-01-01         2
0003-01-01         3
0004-01-01         4
0005-01-01         5
0006-01-01         6

Slot "endTime":
[1] "0001-01-01 GMT" "0002-01-01 GMT" "0003-01-01 GMT" "0004-01-01 GMT" "0005-01-01 GMT" "0006-01-01 GMT"
pm10.stf = STIDF(de, time, pm10.f)

> pm10.stf
An object of class "STIDF"
Slot "data":
  month  CID    MMW10
1    01 1002 13.31264
2    02 1002 11.10590
3    03 1002 14.19649
4    04 1002 16.10512
5    05 1002 12.38511
6    06 1002 13.10104

Slot "sp":
class       : SpatialPolygonsDataFrame 
features    : 6 
extent      : 8.108812, 10.24141, 47.5024, 48.86768  (xmin, xmax, ymin, ymax)
crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
variables   : 14
names       : GID_0,  NAME_0,   GID_1,            NAME_1, NL_NAME_1,     GID_2,          NAME_2, VARNAME_2, NL_NAME_2,     TYPE_2,  ENGTYPE_2,  CC_2,   HASC_2, CC_2.2 
min values  :   DEU, Germany, DEU.1_1, Baden-Württemberg,        NA, DEU.1.1_1, Alb-Donau-Kreis,        NA,        NA,  Landkreis,   District, 08115, DE.BW.AD,   8115 
max values  :   DEU, Germany, DEU.1_1, Baden-Württemberg,        NA, DEU.1.6_1,   Bodenseekreis,        NA,        NA, Water body, Water body, 08435, DE.BW.BR,   8435 

Slot "time":
           timeIndex
0001-01-01         1
0002-01-01         2
0003-01-01         3
0004-01-01         4
0005-01-01         5
0006-01-01         6

Slot "endTime":
[1] "0001-01-01 GMT" "0002-01-01 GMT" "0003-01-01 GMT" "0004-01-01 GMT" "0005-01-01 GMT" "0006-01-01 GMT"

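Here I get the 6 rows of a single county from the df, but 6 different polygon features. It seems that the STIDF command simply takes the first 6 polygons from the polygon df.

First, I noticed that my shapefile contains more elements than there are actual districts. This is because the shapefile contains "doublegeom". So I aggregated the shapefile as follows: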

de2 <- raster::aggregate(de, by="AGS")
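Then I suddenly realised that I had a logical error in my thinking. I have 401 districts and 6 measurement times (6 months), so my data frame has to have 401*6=2406 rows. That means I had to adjust my data frame. So I selected the 401 districts and expanded them: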

df<-tidyr::expand_grid(KRS=df$KRS,1:6)
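The expansion step can be sketched in base R as well: build the full grid of districts and months, then left-join the observed values so that missing combinations become NA. The names `regions`, `months` and `obs` are hypothetical toy data, and the final ordering assumes that the data for STFDF must have the spatial index cycling fastest.

```r
# Toy data: 3 districts, 2 months, observations for only some combinations
regions <- c("01001", "01002", "01003")
months  <- c("01", "02")
obs <- data.frame(
  CID   = c("01001", "01001", "01003"),
  month = c("01", "02", "01"),
  MMW10 = c(13.3, 11.1, 17.8),
  stringsAsFactors = FALSE
)

# Every (district, month) combination, whether observed or not
grid <- expand.grid(CID = regions, month = months, stringsAsFactors = FALSE)

# Left join: unobserved combinations get NA in MMW10
full <- merge(grid, obs, by = c("CID", "month"), all.x = TRUE)

# Order by time first, then by district in polygon order (space cycles fastest)
full <- full[order(full$month, match(full$CID, regions)), ]
rownames(full) <- NULL

print(full)
```

With 401 districts and 6 months, the same pattern yields the 2406 rows mentioned above.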

Good description of what you did, but it is hard to grasp the core of the problem. Could you briefly state the problem/desired outcome at the top of the post, before describing what you did?

@AndreWildberg I edited my question and added my goal; I hope it covers what I was trying to do. Thanks for the suggestion!
df.stf <- STFDF(de2, time, df[order(df[2], df[1]),])
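To make the shape requirement of this call concrete, here is a small self-contained sketch. It uses three invented points instead of the county polygons and two time steps, and it assumes the sp and spacetime packages are installed (the code is skipped otherwise). The six data rows are ordered with the spatial index cycling fastest.

```r
stf_rows <- NA  # stays NA if the packages are not available

if (requireNamespace("sp", quietly = TRUE) &&
    requireNamespace("spacetime", quietly = TRUE)) {
  # Three invented locations standing in for the county polygons
  pts <- sp::SpatialPoints(cbind(x = c(9, 10, 11), y = c(48, 49, 50)))

  # Two monthly time steps
  times <- as.POSIXct(c("2020-01-01", "2020-02-01"), tz = "UTC")

  # 3 locations * 2 times = 6 rows: all locations for time 1, then time 2
  values <- data.frame(MMW10 = c(13.3, 17.8, 17.7, 11.1, 12.0, 14.2))

  stf <- spacetime::STFDF(pts, times, values)
  stf_rows <- nrow(stf@data)
  print(dim(stf))
}
```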