R: Why does some code run faster than other code?

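I wrote three functions that all do the same thing, and when I timed them with microbenchmark, the one I expected to be fastest was not. My code is below. Can anyone help explain why mymean is the fastest?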

#First remove all non numeric columns
numeric_clean_usnews <- cleaned_us_news[,-c((1:3),36)] #Removing any non numeric columns

#using apply and mean()
Average_1 <-apply(numeric_clean_usnews,2,mean,na.rm=T)
Average_1

#using apply and user created mean function
mymean <- function(cleaned_us){
  column_total = sum(cleaned_us,na.rm=T)
  column_length = sum(!is.na(cleaned_us))
  return (column_total/column_length)
}

Average_2 <- apply(numeric_clean_usnews,2,mymean)
Average_2


#using 2 for loops
mymean2 <- function (cleaned_usnews){
  #work on the argument rather than on an object from the global environment
  column_averages = numeric(ncol(cleaned_usnews))
  for (column in 1:ncol(cleaned_usnews)){
    column_total = 0
    column_length = 0

    for (row in 1:nrow(cleaned_usnews)){

      if(!is.na(cleaned_usnews[row,column])){
        column_total = column_total + cleaned_usnews[row,column]
        column_length = column_length + 1
      }
    }
    column_averages[column] = column_total/column_length
  }
  return (column_averages)
}
Average_3 <- mymean2(numeric_clean_usnews)
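
The benchmark call itself is not shown in the question. The following is a minimal sketch of how such a comparison might be run. It assumes a small synthetic data frame (df) in place of numeric_clean_usnews, which is not available here, and it only runs the timing step if the microbenchmark package happens to be installed. The timings it prints will vary by machine, so none are quoted.

```r
# Synthetic stand-in for numeric_clean_usnews (assumption: the real data is not shown).
set.seed(1)
m <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)
m[sample(length(m), 100)] <- NA        # sprinkle in some missing values
df <- as.data.frame(m)

# Hand-written mean: one sum() and one count of the non-missing values.
mymean <- function(x) sum(x, na.rm = TRUE) / sum(!is.na(x))

# Double-loop version that reads only from its argument.
mymean2 <- function(d) {
  out <- numeric(ncol(d))
  for (j in seq_len(ncol(d))) {
    total <- 0
    n <- 0
    for (i in seq_len(nrow(d))) {
      v <- d[i, j]
      if (!is.na(v)) {
        total <- total + v
        n <- n + 1
      }
    }
    out[j] <- total / n
  }
  out
}

a1 <- apply(df, 2, mean, na.rm = TRUE)
a2 <- apply(df, 2, mymean)
a3 <- mymean2(df)

# All three approaches should agree up to floating-point rounding.
stopifnot(isTRUE(all.equal(unname(a1), unname(a2))),
          isTRUE(all.equal(unname(a1), a3)))

# microbenchmark is an add-on package; skip the timing if it is not installed.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    apply_mean   = apply(df, 2, mean, na.rm = TRUE),
    apply_mymean = apply(df, 2, mymean),
    double_loop  = mymean2(df),
    times = 10
  ))
}
```

Checking that the three results agree before timing them is worth the extra lines: a benchmark only means something if the functions being compared really compute the same thing.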

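A lazy answer: mymean is faster than mean because sum has less overhead than mean. Compared with mean, it performs fewer checks on its input. Both sum and mean end up running compiled C loops; functions like these are called vectorized functions. Use vectorized functions whenever you can, because they tend to be the most optimized.

for loops get a bad rap in R. They used to be slow, but they have become significantly faster over the past 10 years. Even so, they are still much slower than vectorized functions.

Which checks make sum faster than mean? You can look at mean.default: just type its name in the console without parentheses. The .default suffix points to another source of slowdown called "method dispatch", which is the work of finding the right function (a method, in object-oriented programming terms) for a particular object based on its class. Hadley's Advanced R book and its accompanying website discuss method dispatch in the chapter on object-oriented programming.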
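
The points about checks and dispatch can be illustrated with a small sketch. The vector x and the matrix mat are made-up illustration data. colMeans is a base R function that the answer does not mention; it is included here only as one possible example of pushing the whole column loop into compiled code, on the assumption that the input is numeric.

```r
x <- c(1.5, NA, 3, 4.5, NA, 6)

# Printing a function without parentheses shows its source, including the
# argument checks that run before the compiled code is reached.
print(mean.default)

# mean() is an S3 generic: it looks at class(x) and dispatches to a method.
# For a plain numeric vector there is no more specific method, so
# mean.default is the one that runs.
print(class(x))

m_generic <- mean(x, na.rm = TRUE)           # goes through method dispatch
m_default <- mean.default(x, na.rm = TRUE)   # calls the method directly
m_manual  <- sum(x, na.rm = TRUE) / sum(!is.na(x))

# All three give the same value; only the overhead differs.
stopifnot(isTRUE(all.equal(m_generic, m_default)),
          isTRUE(all.equal(m_generic, m_manual)))

# One vectorized alternative to apply(): colMeans() loops over the
# columns in compiled code instead of calling an R function per column.
mat <- cbind(a = x, b = rev(x))
col_means <- colMeans(mat, na.rm = TRUE)
print(col_means)
```

The difference between mean and the hand-written version is overhead per call, so it matters most when the function is called many times on short vectors, as apply does once per column.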