R: Putting the correlation output of a function into a data frame

r function loops dataframe correlation


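I am running a large number of correlations for a population over time. I have split the data accordingly and am passing the pieces to a function with lapply. I want to put the output of each correlation into a data frame (i.e. each row holds the information for one correlation, with columns for the name of the correlation, p-value, t statistic, df, CIs, and corcoeff).

I have two issues:

I don't know how to extract the name of each correlation from within the split.

I can get my function to run the correlations on the split (600+ correlations), but I can't get it to write the results into the data frame. To clarify: when I run the function without the loop, it performs all 600 correlations, one per group. However, when I add the loop, it produces NULL for every group in the split.

Here is what I have done so far: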

> head(Birds) #Shortened for this Post
Location      Species   Year Longitude Latitude Section Total Percent  Family
1 Chiswell A  Kittiwake 1976 -149.5847 59.59559 Central   310 16.78397 Gull

BigSplit<-split(Birds,list(Birds$Family, Birds$Location, 
Birds$Section,Birds$Species), drop=T) #A list of Dataframes

#Make empty data frame
resultcor <- data.frame(Name = character(),
                        tvalue = character(),
                        degreeF = character(),
                        pvalue = character(),
                        CIs = character(),
                        corcoeff = character(),stringsAsFactors = F)

WorkFunc <- function(dataset) {
     data.name = substitute(dataset) #Use "dataset" as substitute for actual dataset name

     #Correlation between Year and population Percent
     try({
          correlation <- cor.test(dataset$Year, dataset$Percent, method = "pearson")    
     }, silent = TRUE)

     for (i in 1:nrow(resultcor)) {
          resultcor$Name[i] <- ??? #These ??? are not in the code, just highlighting Issue 1
          resultcor$tvalue[i] <- correlation$dataset$statistic
          resultcor$degreeF[i] <- correlation$dataset$parameter
          resultcor$pvalue[i] <- correlation$dataset$p.value
          resultcor$CIs[i] <- correlation$dataset$conf.int
          resultcor$corcoeff[i] <- correlation$dataset$estimate
     }
}

lapply(BigSplit, WorkFunc)

Consider using Map (a variant of mapply) to pass both the elements and the names of BigSplit to the function. Map returns a list of data frames, which you can then row-bind at the end with do.call(). The code below assumes BigSplit is a named list.

WorkFunc <- function(dataset, dataname) {
    # Correlation between Year and population Percent
    tryCatch({ 
        correlation <- cor.test(dataset$Year, dataset$Percent, method = "pearson")
        CIs <- correlation$conf.int

        return(data.frame(
                  Name = dataname,
                  tvalue = correlation$statistic,
                  degreeF = correlation$parameter,
                  pvalue = correlation$p.value,
                  CI_lower = ifelse(is.null(CIs), NA, CIs[[1]]),
                  CI_higher = ifelse(is.null(CIs), NA, CIs[[2]]),
                  corcoeff = correlation$estimate
             )
         ) 
     }, error = function(e) 
             return(data.frame(
                        Name = character(0),
                        tvalue = numeric(0),
                        degreeF = numeric(0),
                        pvalue = numeric(0),
                        CI_lower = numeric(0),
                        CI_higher = numeric(0),
                        corcoeff = numeric(0)
                    )
              )
      )
}    

# BUILD CORRELATION DATAFRAMES INTO LIST
cor_df_list <- Map(WorkFunc, BigSplit, names(BigSplit))
cor_df_list <- mapply(WorkFunc, BigSplit, names(BigSplit), SIMPLIFY=FALSE)   # EQUIVALENT

# ROW BIND ALL DATAFRAMES TO FINAL LARGE DATAFRAME
finaldf <- do.call(rbind, cor_df_list)
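
To see the same pattern run end to end, here is a minimal sketch on made-up data. The `toy` data frame, its `Group` column, and the helper name `cor_row` are assumptions for illustration only and are not part of the original question. Group "C" deliberately has just two rows, so cor.test() fails for it and the helper returns a zero-row data frame that adds nothing to the final result.

```r
# Toy data (hypothetical): three groups, one of them too small for cor.test()
toy <- data.frame(
  Group   = c(rep("A", 5), rep("B", 5), rep("C", 2)),
  Year    = c(2001:2005, 2001:2005, 2001:2002),
  Percent = c(10, 12, 15, 14, 18, 30, 28, 25, 26, 20, 5, 6)
)
toy_split <- split(toy, toy$Group)   # named list of data frames

# Hypothetical helper: one row per successful correlation, zero rows on error
cor_row <- function(dataset, dataname) {
  tryCatch({
    ct  <- cor.test(dataset$Year, dataset$Percent, method = "pearson")
    cis <- ct$conf.int               # NULL when there are only 3 observations
    data.frame(
      Name      = dataname,
      tvalue    = unname(ct$statistic),
      degreeF   = unname(ct$parameter),
      pvalue    = ct$p.value,
      CI_lower  = if (is.null(cis)) NA_real_ else cis[1],
      CI_higher = if (is.null(cis)) NA_real_ else cis[2],
      corcoeff  = unname(ct$estimate),
      stringsAsFactors = FALSE
    )
  }, error = function(e) {
    data.frame(Name = character(0), tvalue = numeric(0), degreeF = numeric(0),
               pvalue = numeric(0), CI_lower = numeric(0),
               CI_higher = numeric(0), corcoeff = numeric(0),
               stringsAsFactors = FALSE)
  })
}

# Pass each element together with its name, then stack the one-row results
toy_list  <- Map(cor_row, toy_split, names(toy_split))
toy_final <- do.call(rbind, toy_list)
print(toy_final)
```

Because the name is passed in as a second argument, there is no need for substitute() inside the function, and because each call returns its own data frame, nothing has to be written into a shared object from inside the loop.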

Check out the broom package. It does all of this for you.

Where is the split done? Please show the code. "I can't get it to print into the data frame": please explain what happens. What is BigSplit, a list of data frames?

@Parfait I edited it for clarity. Yes, the split is a list of data frames. Thanks.

@sinQueso The broom package is exactly what I was looking for. Is there a way to add the "data name" as one of the columns and then combine them all with one function?

@LearningTheMacros Check out this book chapter, which covers what you are trying to do.

I followed this code but I seem to get this error: Error in data.frame(Name = dataname, tvalue = correlation$statistic, degreeF = correlation$parameter, : arguments imply differing number of rows: 1, 0. In addition: Warning messages: 1: In data.frame(Name = dataname, tvalue = correlation$statistic, degreeF = correlation$parameter, : row names were found from a short variable and have been discarded 2: In data.frame(Name = dataname, tvalue = correlation$statistic, degreeF = correlation$parameter, : row names were found from a short variable and have been discarded.

I forgot to add the commas separating the values in data.frame().

Oh no, I ran into that error too, but I understood it and added the commas. This error appeared after adding the commas.

See the update. CIs actually holds two values, a lower and an upper bound, so I added a column for each. Another cause may be that the correlation itself fails. I noticed you used try. I added tryCatch() to return an empty data frame on error. Some of the split data frames may have only one row!
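
The failure cases raised in the comments can be checked directly. The sketch below uses made-up vectors rather than the Birds data, and the object names are hypothetical. It suggests why the answer guards against both an error and a missing confidence interval: with Pearson's method, cor.test() stops when there are fewer than three paired observations, and with exactly three it runs but returns no conf.int.

```r
# Two paired observations: cor.test() stops with an error
two_obs <- try(cor.test(c(1, 2), c(3, 5), method = "pearson"), silent = TRUE)
print(inherits(two_obs, "try-error"))

# Three paired observations: the test runs, but conf.int is absent
three_obs <- cor.test(c(1, 2, 3), c(2, 1, 4), method = "pearson")
print(is.null(three_obs$conf.int))

# Four paired observations: conf.int holds a lower and an upper bound
four_obs <- cor.test(c(1, 2, 3, 4), c(2, 1, 4, 3), method = "pearson")
print(length(four_obs$conf.int))
```

A split over several factors, as in the question, can easily produce groups this small, so it is worth counting the rows per group before interpreting missing entries in the final data frame.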