R: NA values and extra rows when spreading repeated measures of multiple variables into wide format?

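With the data below (included via dput), I have differing numbers of repeated Lat and Long locations for three individuals, and I want to spread them into wide format using dplyr.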

The data look like this:

> head(Dat)
  IndIDII      IndYear  WintLat  WintLong
1 BHS_265 BHS_265-2015 47.61025 -112.7210
2 BHS_265 BHS_265-2016 47.59884 -112.7089
3 BHS_770 BHS_770-2016 42.97379 -109.0400
4 BHS_770 BHS_770-2017 42.97129 -109.0367
5 BHS_770 BHS_770-2018 42.97244 -109.0509
6 BHS_377 BHS_377-2015 43.34744 -109.4821
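A related post provided a neat solution, which was a big help. Even so, I still cannot get the result I want. After modifying that code, I have the following: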

Dat %>%  
  group_by(IndIDII) %>%
  # Make YearNum (a sequential integer, not the calendar year) for each IndIDII
  mutate(YearNum = row_number()) %>% 
  gather(Group, LatLong, c(WintLat,  WintLong)) %>% 
  unite(GroupNew, YearNum, Group, sep = "-") %>% 
  spread(GroupNew, LatLong) %>% 
  as.data.frame()
This produces an almost correct result, but each IndIDII still has multiple rows, each holding the lat and long for a single year:

  IndIDII      IndYear 1-WintLat 1-WintLong 2-WintLat 2-WintLong 3-WintLat 3-WintLong 4-WintLat 4-WintLong
1 BHS_265 BHS_265-2015  47.61025  -112.7210        NA         NA        NA         NA        NA         NA
2 BHS_265 BHS_265-2016        NA         NA  47.59884  -112.7089        NA         NA        NA         NA
3 BHS_377 BHS_377-2015  43.34744  -109.4821        NA         NA        NA         NA        NA         NA
4 BHS_377 BHS_377-2016        NA         NA  43.35559  -109.4445        NA         NA        NA         NA
5 BHS_377 BHS_377-2017        NA         NA        NA         NA  43.35195  -109.4566        NA         NA
6 BHS_377 BHS_377-2018        NA         NA        NA         NA        NA         NA  43.34765  -109.4892
7 BHS_770 BHS_770-2016  42.97379  -109.0400        NA         NA        NA         NA        NA         NA
8 BHS_770 BHS_770-2017        NA         NA  42.97129  -109.0367        NA         NA        NA         NA
9 BHS_770 BHS_770-2018        NA         NA        NA         NA  42.97244  -109.0509        NA         NA
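I am trying to get all the lats and longs for each IndIDII onto a single row (i.e., wide format). NA values should appear wherever an individual has fewer years than the maximum. I suspect the problem lies with the GroupNew variable and have tried different options, but to no avail.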


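You are almost there. The lat and long values end up in different rows because their IndYear values differ: spread() treats every column other than the key and value as an identifier, so each distinct IndYear gets its own row. Since you only keep the first value of IndYear for each IndIDII in the final data frame, adding IndYear = first(IndYear) gives the desired result: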

Dat %>%  
    group_by(IndIDII) %>%
    mutate(YearNum = row_number(), IndYear = first(IndYear)) %>% 
    gather(Group, LatLong, c(WintLat,  WintLong)) %>% 
    unite(GroupNew, YearNum, Group, sep = "-") %>% 
    spread(GroupNew, LatLong) %>% 
    as.data.frame()

#   IndIDII      IndYear 1-WintLat 1-WintLong 2-WintLat 2-WintLong 3-WintLat 3-WintLong 4-WintLat 4-WintLong
# 1 BHS_265 BHS_265-2015  47.61025  -112.7210  47.59884  -112.7089        NA         NA        NA         NA
# 2 BHS_377 BHS_377-2015  43.34744  -109.4821  43.35559  -109.4445  43.35195  -109.4566  43.34765  -109.4892
# 3 BHS_770 BHS_770-2016  42.97379  -109.0400  42.97129  -109.0367  42.97244  -109.0509        NA         NA
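
As a side note, gather() and spread() have since been superseded in tidyr by pivot_longer() and pivot_wider(). Below is a minimal sketch of the same idea using pivot_wider(), assuming tidyr 1.0.0 or later. The Dat data frame is re-typed from the printed output in the question rather than taken from the original dput, and the name Wide is only illustrative. Collapsing IndYear to its first value per individual is still what keeps each IndIDII on one row. Note that the columns come out grouped by variable (all WintLat columns, then all WintLong columns) rather than interleaved by year.

```r
library(dplyr)
library(tidyr)

# Data re-typed from the printed output in the question (assumption: this
# matches the original dput, which is not reproduced here).
Dat <- data.frame(
  IndIDII  = c("BHS_265", "BHS_265", "BHS_770", "BHS_770", "BHS_770",
               "BHS_377", "BHS_377", "BHS_377", "BHS_377"),
  IndYear  = c("BHS_265-2015", "BHS_265-2016", "BHS_770-2016",
               "BHS_770-2017", "BHS_770-2018", "BHS_377-2015",
               "BHS_377-2016", "BHS_377-2017", "BHS_377-2018"),
  WintLat  = c(47.61025, 47.59884, 42.97379, 42.97129, 42.97244,
               43.34744, 43.35559, 43.35195, 43.34765),
  WintLong = c(-112.7210, -112.7089, -109.0400, -109.0367, -109.0509,
               -109.4821, -109.4445, -109.4566, -109.4892),
  stringsAsFactors = FALSE
)

Wide <- Dat %>%
  group_by(IndIDII) %>%
  # Sequential year index per individual; collapse IndYear to its first
  # value so it no longer splits an individual across several rows.
  mutate(YearNum = row_number(), IndYear = first(IndYear)) %>%
  ungroup() %>%
  # Pivot both measurement columns at once; names_glue reproduces the
  # "1-WintLat", "1-WintLong", ... naming used in the answer above.
  pivot_wider(
    names_from  = YearNum,
    values_from = c(WintLat, WintLong),
    names_glue  = "{YearNum}-{.value}"
  ) %>%
  as.data.frame()

Wide
```

Because pivot_wider() handles several value columns directly, the intermediate gather() and unite() steps are not needed here.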