R: Form new data frames from a list by combining similar entries across all data frames

Tags: r, list, lapply, cbind

I have a list of many data frames. A sample is given below:

    G100=structure(list(Return.Period = structure(c(4L, 6L, 2L, 3L, 5L, 
        1L), .Label = c("100yrs", "10yrs", "20yrs", "2yrs", "50yrs", 
        "5yrs"), class = "factor"), X95..lower.CI = c(54.3488053692529, 
        73.33363378538, 84.0868168935697, 91.6191228597281, 96.3360349026068, 
        95.4278817251266), Estimate = c(61.6857930414643, 84.8210149260708, 
        101.483909733627, 118.735593472652, 143.33257990536, 163.806035490329
        ), X95..upper.CI = c(69.0227807136758, 96.3083960667617, 118.881002573685, 
        145.852064085577, 190.329124908114, 232.18418925553)), .Names = c("Return.Period", 
        "X95..lower.CI", "Estimate", "X95..upper.CI"), row.names = c(NA, 
        -6L), class = "data.frame")

    G101<-G100 # just for illustration

    mylist=list(G100,G101) # there are 100 of these with different codes

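For Estimate, I want a new data frame with one row per SITE and one column per return period (X2yrs, X5yrs, X10yrs, X20yrs, X50yrs, X100yrs). Note that SITE is the same as the name of the data frame in mylist.

Do the same for X95..lower.CI and X95..upper.CI.

So I will end up with 3 data frames, Estimate, X95..lower.CI and X95..upper.CI, each with the layout described above.

    # lapply, rbindlist, cbind and others can do this, but how?

Please advise.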

Just use a for loop to add the names. There is probably a fancy apply-style way, but for is easy to use, remember, and understand.
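For comparison, the apply-style alternative hinted at above could look roughly like this sketch. It uses base R `Map()` to pair each data frame with its list name. The two tiny data frames and the site codes `G100`/`G101` are stand-ins for the real list, not the asker's actual data.

```r
# Toy stand-ins for the real data frames (values are illustrative only)
G100 <- data.frame(Return.Period = c("2yrs", "5yrs"),
                   Estimate = c(61.7, 84.8),
                   stringsAsFactors = FALSE)
G101 <- G100

# A named list: the names double as site codes
mylist <- list(G100 = G100, G101 = G101)

# Pair each data frame with its name and store the name in a SITE column
mylist <- Map(function(df, site) {
  df$SITE <- site
  df
}, mylist, names(mylist))

str(mylist$G101)
```

`Map()` keeps the names of its first list argument, so the result is still indexed by site code. Whether this reads better than the for loop is a matter of taste.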

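The steps are: first add names to the list, then add the SITE column to each data frame as described, and finally combine the data frames. Because you have many data frames, or they may be fairly large, use dplyr::bind_rows for speed. In base R, do.call(rbind, mylist) also works, but it is slower.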

    library(dplyr)
    combined = bind_rows(mylist)

Older versions of dplyr offered rbind_all in place of bind_rows, but rbind_all has since been deprecated, so bind_rows is the one to use.
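If dplyr is not available, the base R route mentioned above can be sketched like this. The two small data frames are made-up stand-ins that already carry a SITE column.

```r
# Toy stand-ins that already have a SITE column (values are illustrative only)
a <- data.frame(SITE = "G100", Return.Period = c("2yrs", "5yrs"),
                Estimate = c(61.7, 84.8), stringsAsFactors = FALSE)
b <- data.frame(SITE = "G101", Return.Period = c("2yrs", "5yrs"),
                Estimate = c(61.7, 84.8), stringsAsFactors = FALSE)
mylist <- list(G100 = a, G101 = b)

# Stack all data frames in the list row-wise
combined_base <- do.call(rbind, mylist)

# rbind on a named list produces row names like "G100.1"; reset them
rownames(combined_base) <- NULL

combined_base
```

This gives the same long table as the dplyr version, only more slowly on long lists.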

Convert the estimate and CI columns from long to wide. This is easy with tidyr, but reshape2::dcast works similarly:

    library(tidyr)
    Estimate = combined %>% select(SITE, Return.Period, Estimate) %>%
        spread(key = Return.Period, value = Estimate)
    head(Estimate)
    # Source: local data frame [2 x 7]
    #
    #   SITE  100yrs    10yrs    20yrs     2yrs    50yrs     5yrs
    # 1 G100 163.806 101.4839 118.7356 61.68579 143.3326 84.82101
    # 2 G101 163.806 101.4839 118.7356 61.68579 143.3326 84.82101

    Lower95 = combined %>% select(SITE, Return.Period, X95..lower.CI) %>%
        spread(key = Return.Period, value = X95..lower.CI)
    head(Lower95)
    # Source: local data frame [2 x 7]
    #
    #   SITE   100yrs    10yrs    20yrs     2yrs    50yrs     5yrs
    # 1 G100 95.42788 84.08682 91.61912 54.34881 96.33603 73.33363
    # 2 G101 95.42788 84.08682 91.61912 54.34881 96.33603 73.33363
You will probably want to reorder the columns so that they are not in alphabetical order.
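One way to do that reordering, sketched in base R: strip the "yrs" suffix, sort the remaining numbers, and index the columns in that order. The wide table below is rebuilt by hand from the Estimate output shown above, and the variable names `period_cols` and `ordered_cols` are just illustrative.

```r
# Wide table with columns in alphabetical order, as in the output above
Estimate <- data.frame(
  SITE = c("G100", "G101"),
  "100yrs" = c(163.806, 163.806),
  "10yrs"  = c(101.4839, 101.4839),
  "20yrs"  = c(118.7356, 118.7356),
  "2yrs"   = c(61.68579, 61.68579),
  "50yrs"  = c(143.3326, 143.3326),
  "5yrs"   = c(84.82101, 84.82101),
  check.names = FALSE,       # keep names that start with a digit as they are
  stringsAsFactors = FALSE
)

# Every column except SITE is a return period
period_cols <- setdiff(names(Estimate), "SITE")

# Drop the "yrs" suffix and order the periods numerically
ordered_cols <- period_cols[order(as.numeric(sub("yrs$", "", period_cols)))]

Estimate <- Estimate[, c("SITE", ordered_cols)]
names(Estimate)
```

The same two lines would work for the lower and upper CI tables, since they share the column names.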

Doing the same for `X95..upper.CI` is left as an exercise for the reader.
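For readers who would rather not repeat the block three times, here is one possible sketch that builds all three wide tables in a single pass. It swaps `spread()` for `tidyr::pivot_wider()`, which supersedes it, and assumes tidyr 1.0.0 or later and dplyr 1.0.0 or later. The `combined` table is a small made-up stand-in, and `widen` and `wide_tables` are hypothetical names.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the combined long table (values are illustrative only)
combined <- data.frame(
  SITE = rep(c("G100", "G101"), each = 3),
  Return.Period = rep(c("2yrs", "5yrs", "10yrs"), times = 2),
  X95..lower.CI = c(54.3, 73.3, 84.1, 54.3, 73.3, 84.1),
  Estimate      = c(61.7, 84.8, 101.5, 61.7, 84.8, 101.5),
  X95..upper.CI = c(69.0, 96.3, 118.9, 69.0, 96.3, 118.9),
  stringsAsFactors = FALSE
)

value_cols <- c("Estimate", "X95..lower.CI", "X95..upper.CI")

# Turn one value column into a wide table: one row per site,
# one column per return period
widen <- function(col) {
  combined %>%
    select(SITE, Return.Period, all_of(col)) %>%
    pivot_wider(names_from = Return.Period, values_from = all_of(col))
}

# Named list holding the three wide tables
wide_tables <- lapply(setNames(value_cols, value_cols), widen)

wide_tables$X95..upper.CI
```

Unlike `spread()`, `pivot_wider()` keeps the return periods in order of first appearance, so if the long table is already sorted by period no reordering is needed afterwards.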

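If I use the loop above on mylist, the data frames are labeled [[1]] and [[2]], but I need [[G100]] and [[G101]]. Keep in mind that I have more than 100 of these files. Secondly, rather than just combining the data frames as you suggest, the final data frames for Estimate, X95..lower.CI and X95..upper.CI should be arranged as described above. Basically, I want to go to each data frame and, for Estimate, extract X2yrs X5yrs X10yrs X20yrs X50yrs X100yrs and form a table as shown above. Do the same for X95..lower.CI and X95..upper.CI. My result will be 3 data frames.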

For reference, this is the loop that adds the SITE column. It relies on mylist having names:

    for (i in seq_along(mylist)) {
        mylist[[i]]$SITE = names(mylist)[i]
    }
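
If the data frames come from files, one way to get [[G100]]-style labels is to name the list after the files before running the loop. The sketch below assumes, purely for illustration, that each site lives in a CSV file named after its code (for example `G100.csv`). The directory and file contents are made up so that the example runs on its own.

```r
# Hypothetical setup: write two small CSV files named after their sites
site_dir <- file.path(tempdir(), "sites")
dir.create(site_dir, showWarnings = FALSE)
toy <- data.frame(Return.Period = c("2yrs", "5yrs"), Estimate = c(61.7, 84.8))
write.csv(toy, file.path(site_dir, "G100.csv"), row.names = FALSE)
write.csv(toy, file.path(site_dir, "G101.csv"), row.names = FALSE)

# Read every CSV file into a list of data frames
files <- list.files(site_dir, pattern = "\\.csv$", full.names = TRUE)
mylist <- lapply(files, read.csv)

# Name each list element after its file, without directory or extension
names(mylist) <- tools::file_path_sans_ext(basename(files))

# With names in place, the loop fills SITE with the site code
for (i in seq_along(mylist)) {
  mylist[[i]]$SITE <- names(mylist)[i]
}

names(mylist)
```

For data frames that already exist as separate objects, writing `list(G100 = G100, G101 = G101)` by hand achieves the same thing, though that gets tedious with 100 sites.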