dplyr mutate() returns data in the wrong order: is this a bug?

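I have run into a problem where mutate() in dplyr returns results in the wrong order. My call to mutate uses data from an existing column as input, but the values come back arranged as if the data had been sorted before mutate ran.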

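My specific case uses the dataRetrieval package to fetch USGS/NWIS data from the web. In this example, I look up station names from station IDs. In the dataRetrieval package, station IDs are numeric codes stored as character.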

library(dataRetrieval)
library(dplyr)

Gauges <- tibble( Name = c("Twisp", "Chewuch", "Andrews" ,"Met@Winthrop", "Met@Twisp", "Met@Pateros", "Met@Goat"),
                  ID = c("12448998" , "12448000","12447390", "12448500" ,"12449500","12449950" , "12447383")
)

## This works correctly with each of the station numbers
readNWISsite(Gauges$ID[1])$station_nm
# [1] "TWISP RIVER NEAR TWISP, WA"

## This does not work correctly
## Order is not right! Station does not correspond with ID  !!
Gauges%>%
      mutate(Station = readNWISsite(ID)$station_nm)

# # A tibble: 7 x 3
# Name         ID       Station                                      
# <chr>        <chr>    <chr>                                        
# 1 Twisp        12448998 METHOW RIVER ABOVE GOAT CREEK NEAR MAZAMA, WA
# 2 Chewuch      12448000 ANDREWS CREEK NEAR MAZAMA, WA                
# 3 Andrews      12447390 CHEWUCH RIVER AT WINTHROP, WA                
# 4 Met@Winthrop 12448500 METHOW RIVER AT WINTHROP, WA                 
# 5 Met@Twisp    12449500 TWISP RIVER NEAR TWISP, WA                   
# 6 Met@Pateros  12449950 METHOW RIVER AT TWISP, WA                    
# 7 Met@Goat     12447383 METHOW RIVER NEAR PATEROS, WA    

## This works, returning the correct site associated with the gauge number
Gauges%>%
      arrange(ID) %>%
      mutate(Station = readNWISsite(ID)$station_nm)
# # A tibble: 7 x 3
# Name         ID       Station                                      
# <chr>        <chr>    <chr>                                        
# 1 Met@Goat     12447383 METHOW RIVER ABOVE GOAT CREEK NEAR MAZAMA, WA
# 2 Andrews      12447390 ANDREWS CREEK NEAR MAZAMA, WA                
# 3 Chewuch      12448000 CHEWUCH RIVER AT WINTHROP, WA                
# 4 Met@Winthrop 12448500 METHOW RIVER AT WINTHROP, WA                 
# 5 Twisp        12448998 TWISP RIVER NEAR TWISP, WA                   
# 6 Met@Twisp    12449500 METHOW RIVER AT TWISP, WA                    
# 7 Met@Pateros  12449950 METHOW RIVER NEAR PATEROS, WA  

Why does mutate rearrange the data in the middle of the process? Or, what is actually going on here?

To see what is happening, don't extract only station_nm; pull site_no along with it:

library(dplyr)
library(dataRetrieval)
readNWISsite(Gauges$ID)[c('site_no', 'station_nm')]
#site_no                                    station_nm
#1 12447383 METHOW RIVER ABOVE GOAT CREEK NEAR MAZAMA, WA
#2 12447390                 ANDREWS CREEK NEAR MAZAMA, WA
#3 12448000                 CHEWUCH RIVER AT WINTHROP, WA
#4 12448500                  METHOW RIVER AT WINTHROP, WA
#5 12448998                    TWISP RIVER NEAR TWISP, WA
#6 12449500                     METHOW RIVER AT TWISP, WA
#7 12449950                 METHOW RIVER NEAR PATEROS, WA
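Here, the rows returned by readNWISsite are ordered by site_no, that is, by the integer value of ID, not by the order in which the IDs were passed in. mutate then assigns station_nm to the rows by position, so the names no longer line up with their IDs. To correct this, we can use rowwise, which calls readNWISsite once per row: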

Gauges %>% 
    rowwise() %>% 
    mutate(Station = readNWISsite(ID)$station_nm)

Or use map from purrr:

library(purrr)
Gauges %>%
    mutate(Station = map_chr(ID, ~ readNWISsite(.x)$station_nm))
# A tibble: 7 x 3
#  Name         ID       Station                                      
#  <chr>        <chr>    <chr>                                        
#1 Twisp        12448998 TWISP RIVER NEAR TWISP, WA                   
#2 Chewuch      12448000 CHEWUCH RIVER AT WINTHROP, WA                
#3 Andrews      12447390 ANDREWS CREEK NEAR MAZAMA, WA                
#4 Met@Winthrop 12448500 METHOW RIVER AT WINTHROP, WA                 
#5 Met@Twisp    12449500 METHOW RIVER AT TWISP, WA                    
#6 Met@Pateros  12449950 METHOW RIVER NEAR PATEROS, WA                
#7 Met@Goat     12447383 METHOW RIVER ABOVE GOAT CREEK NEAR MAZAMA, WA
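
The mechanism can be reproduced offline with a small sketch. The function fake_lookup below is a hypothetical stand-in, not part of dataRetrieval; it only assumes what the output above shows, namely that the lookup returns its rows sorted by site number rather than in request order.

```r
# Hypothetical stand-in for a lookup such as readNWISsite():
# returns one row per requested ID, sorted by site_no instead of request order.
fake_lookup <- function(ids) {
  sites <- data.frame(
    site_no    = c("12448998", "12448000", "12447390"),
    station_nm = c("TWISP RIVER NEAR TWISP, WA",
                   "CHEWUCH RIVER AT WINTHROP, WA",
                   "ANDREWS CREEK NEAR MAZAMA, WA"),
    stringsAsFactors = FALSE
  )
  hits <- sites[sites$site_no %in% ids, ]
  hits[order(hits$site_no), ]
}

ids <- c("12448998", "12448000", "12447390")

# One vectorised call: values come back in sorted-ID order,
# so assigning them by position misaligns them with ids.
by_position <- fake_lookup(ids)$station_nm

# One call per element: each result stays tied to the ID that produced it.
one_at_a_time <- vapply(ids, function(id) fake_lookup(id)$station_nm,
                        character(1), USE.NAMES = FALSE)

print(data.frame(ids, by_position, one_at_a_time))
```

Calling the function once per element is what rowwise() and map_chr() do in the answer above; the cost is one request per row instead of a single batched request.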

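Comments:

Thanks @akrun. Does mutate() sort its output by default, or is this a special case that occurs when the input column holds numbers stored as character or in some other form?

@BrianFisher No, mutate returns the rows in the same order. Here, though, the output of readNWISsite is a data.frame with many columns, and the sorting happens at that level.

Thank you @akrun, I think that is the part I was missing.
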
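Alternatively, make a single call to readNWISsite and realign its result to the original row order with match:

Gauges %>% 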
          mutate(Station = {
           tmp <- readNWISsite(ID)[c('site_no', 'station_nm')]
              tmp$station_nm[match(ID, tmp$site_no)]})
# A tibble: 7 x 3
#  Name         ID       Station                                      
#  <chr>        <chr>    <chr>                                        
#1 Twisp        12448998 TWISP RIVER NEAR TWISP, WA                   
#2 Chewuch      12448000 CHEWUCH RIVER AT WINTHROP, WA                
#3 Andrews      12447390 ANDREWS CREEK NEAR MAZAMA, WA                
#4 Met@Winthrop 12448500 METHOW RIVER AT WINTHROP, WA                 
#5 Met@Twisp    12449500 METHOW RIVER AT TWISP, WA                    
#6 Met@Pateros  12449950 METHOW RIVER NEAR PATEROS, WA                
#7 Met@Goat     12447383 METHOW RIVER ABOVE GOAT CREEK NEAR MAZAMA, WA
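
The same idea of matching on the key rather than on row position can also be written as a join. The sketch below is one possible variant, not something from the original answers. To stay runnable offline it uses a hand-typed site_info table (an assumption standing in for readNWISsite(Gauges$ID)[c('site_no', 'station_nm')]) and a shortened three-row Gauges.

```r
library(dplyr)

# Shortened version of the Gauges table from the question.
Gauges <- tibble(
  Name = c("Twisp", "Chewuch", "Andrews"),
  ID   = c("12448998", "12448000", "12447390")
)

# Assumed stand-in for readNWISsite(Gauges$ID)[c("site_no", "station_nm")],
# typed from the output shown earlier; note it is sorted by site_no.
site_info <- tibble(
  site_no    = c("12447390", "12448000", "12448998"),
  station_nm = c("ANDREWS CREEK NEAR MAZAMA, WA",
                 "CHEWUCH RIVER AT WINTHROP, WA",
                 "TWISP RIVER NEAR TWISP, WA")
)

# Join on the key instead of relying on row position;
# left_join keeps the row order of the left-hand table (Gauges).
result <- Gauges %>%
  left_join(site_info, by = c("ID" = "site_no")) %>%
  rename(Station = station_nm)

print(result)
```

Like the match version, this needs only one request for all IDs, and an ID with no matching site would get NA in Station instead of shifting the other rows.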