Using apply functions instead of loops in R

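I have searched the documentation and forums for a long time, but I still struggle to understand how to use the apply functions instead of loops in R for more complex functions. (Simple cases such as apply(data, 1, sum) are fine.)

For example, I have the following function: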

AOV_GxT = function(Trial_group, trait, df) {
    sub_table = df[which(df$Trial.group == Trial_group), ]
    aov_GxT = anova(aov(sub_table[, trait] ~ Genotype + Treatment + Treatment/Rep.number + Genotype*Treatment, data = sub_table, na.action = "na.omit"))
    pvalue = aov_GxT$"Pr(>F)"[2]
    return(c(Trial_group, trait, pvalue))
}

I want to run it on the data frame df. I would normally do this with loops, which works very well.

I want to limit my use of loops and learn how to use the apply family of functions, so I created lists and tried to use mapply:

trait_lst = list(colnames(df_vars_clean)[7:ncol(df_vars_clean)])
Tgrp_lst = list(unique(df_vars_clean[,'Trial.group']))

aov_table<-mapply(AOV_GxT(a,b,c),a=Tgrp_lst,b=trait_lst, c=dataset )


As pointed out in the comments, ddply is a good option, but the problem is also easy to solve with lapply:
do.call(rbind, lapply(split(dataset, dataset$Trial.group), function(tgDf) {
  do.call(rbind, lapply(c("Trait1", "Trait2", "Trait3"), function(trait) {
    ## AOV_gtx is a two-argument version of AOV_GxT: it does not need the
    ## trial group, because tgDf is already subsetted.
    AOV_gtx(trait, tgDf)
  }))
}))
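
To make the split/lapply pattern concrete, here is a minimal, self-contained sketch. The toy data, the trait names Trait1 and Trait2, and the helper aov_one are all made up for illustration (aov_one stands in for a two-argument version of the asker's function); only the column names and the model formula come from the question.

```r
set.seed(1)

# Hypothetical toy data using the column names from the question
dataset <- expand.grid(
  Trial.group = c("A", "B"),
  Genotype    = c("G1", "G2", "G3"),
  Treatment   = c("T1", "T2"),
  Rep.number  = 1:3,
  stringsAsFactors = FALSE
)
dataset$Trait1 <- rnorm(nrow(dataset))
dataset$Trait2 <- rnorm(nrow(dataset))

# Hypothetical two-argument helper: df is assumed to hold one trial group only
aov_one <- function(trait, df) {
  fit <- anova(aov(df[[trait]] ~ Genotype + Treatment + Treatment/Rep.number +
                     Genotype*Treatment, data = df, na.action = na.omit))
  # Return a one-row data frame so the pieces can be stacked with rbind
  data.frame(Trial.group = df$Trial.group[1],
             trait       = trait,
             pvalue      = fit$"Pr(>F)"[2],
             stringsAsFactors = FALSE)
}

traits <- c("Trait1", "Trait2")

# Outer lapply: one data frame per trial group; inner lapply: one row per trait
res <- do.call(rbind, lapply(split(dataset, dataset$Trial.group), function(tgDf) {
  do.call(rbind, lapply(traits, function(trait) aov_one(trait, tgDf)))
}))
rownames(res) <- NULL
print(res)
```

Returning a one-row data frame instead of a character vector keeps pvalue numeric after the rows are bound together.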

Using ddply lets you drop the outer lapply/split code:
library(plyr)  # ddply comes from the plyr package

ddply(dataset, "Trial.group", function(tgDf) {
   ## the code in here would be the same, because you are iterating over
   ## the response cols.
})

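The key to all of these functions, and to R in general, is not to pre-allocate a data structure to store the results. R is functional, so you build up the results and then return them.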
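
As a small illustration of that point, the sketch below computes the same thing twice: once with a loop that fills a pre-allocated vector, and once with vapply, where each call simply returns its value. The list x and the variable names are made up for the example.

```r
# Hypothetical input: a named list of numeric vectors
x <- list(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# Loop style: allocate a container first, then fill it by index
out_loop <- numeric(length(x))
for (i in seq_along(x)) {
  out_loop[i] <- mean(x[[i]])
}
names(out_loop) <- names(x)

# Functional style: the function returns a value and vapply collects the results
out_apply <- vapply(x, mean, numeric(1))

print(out_loop)
print(out_apply)
```

Both versions give the same named vector; the second has no index bookkeeping and no container to size in advance.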
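Maybe you need to write list(a=Tgrp_lst, b=trait_list, c=dataset) in the mapply call.
@Charles You would do better to try ddply in your case. Given that your data.frame (df) has the colnames "Trial.group", "trait", "Genotype", "Treatment", "Rep.number" my.func
@xiaobei & jimmyb, thanks for your help. I did not know about the split and ddply functions. It works great!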
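
For completeness, here is one way the mapply call itself could be made to work. This is only a sketch under assumptions: the toy data and the trait names Trait1 and Trait2 are invented, and the function is the asker's AOV_GxT with na.action given as the function na.omit. mapply takes the function itself rather than a call to it, pairs the vector arguments element by element, and takes constant arguments through MoreArgs.

```r
set.seed(2)

# Hypothetical toy data using the column names from the question
dataset <- expand.grid(
  Trial.group = c("A", "B"),
  Genotype    = c("G1", "G2", "G3"),
  Treatment   = c("T1", "T2"),
  Rep.number  = 1:3,
  stringsAsFactors = FALSE
)
dataset$Trait1 <- rnorm(nrow(dataset))
dataset$Trait2 <- rnorm(nrow(dataset))

# The function from the question
AOV_GxT <- function(Trial_group, trait, df) {
  sub_table <- df[which(df$Trial.group == Trial_group), ]
  aov_GxT <- anova(aov(sub_table[, trait] ~ Genotype + Treatment +
                         Treatment/Rep.number + Genotype*Treatment,
                       data = sub_table, na.action = na.omit))
  pvalue <- aov_GxT$"Pr(>F)"[2]
  c(Trial_group, trait, pvalue)
}

# mapply pairs its arguments element-wise, so list every combination explicitly
combos <- expand.grid(Trial_group = unique(dataset$Trial.group),
                      trait       = c("Trait1", "Trait2"),
                      stringsAsFactors = FALSE)

# Pass the function itself; the constant data frame goes in MoreArgs
aov_list <- mapply(AOV_GxT,
                   Trial_group = combos$Trial_group,
                   trait       = combos$trait,
                   MoreArgs    = list(df = dataset),
                   SIMPLIFY    = FALSE)

# One row per (trial group, trait) combination
aov_table <- do.call(rbind, aov_list)
print(aov_table)
```

Because AOV_GxT returns a character vector, aov_table is a character matrix; convert the third column with as.numeric if you need the p-values as numbers.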