How do I apply a correlation to specified vectors across a list of files in R?


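I need to take a list of data frames and apply cor() to the same two columns in each data frame, returning a list of correlation values. Here is my function so far: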

    corr <- function(directory, threshold = 0){

        # reads directory of files
        file_list <- list.files(path = "C:/Users/jonah/Documents/R work/R Coursera Course/specdata")

        # takes file_list and makes each file into dataframe
        dflist <- lapply(file_list, read.csv)

        # returns list of files, na rows stripped
        nolist <- lapply(dflist, na.omit)

        # removes all with nrows < threshold
        abovelist <- nolist[sapply(nolist, function(x) nrow(x) > threshold)]

        # runs correlation of nitrate, sulfate on remaining
        correlations <- lapply(abovelist, cor(abovelist$sulfate, abovelist$nitrate))

    }

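You can refer to each object through an anonymous function in lapply, the same way you already do in sapply. lapply expects a function as its second argument, but cor(abovelist$sulfate, abovelist$nitrate) is evaluated once, up front, on the list itself rather than on each data frame.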
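
A small illustration of the difference, using two made-up data frames (the values are hypothetical, not taken from specdata). `$` on the list looks for a list element named sulfate and finds none, while the anonymous function receives one data frame at a time:

```r
# Two hypothetical data frames standing in for the monitor files
dfs <- list(
  data.frame(sulfate = c(1, 2, 3, 4), nitrate = c(2, 4, 6, 8)),
  data.frame(sulfate = c(1, 2, 3, 4), nitrate = c(8, 6, 4, 2))
)

# The list itself has no element called "sulfate", so this is NULL
is.null(dfs$sulfate)

# The anonymous function is called once per data frame;
# the data are perfectly linear, so the correlations are 1 and -1
cors <- lapply(dfs, function(x) cor(x$sulfate, x$nitrate))
```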

Try this:

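corr <- function(directory, threshold = 0){
  # full.names = TRUE returns paths that include the directory, so
  # read.csv() can find the files from any working directory
  file_list <- list.files(path = directory, full.names = TRUE)
  # read each file and drop rows containing NA
  dflist <- lapply(file_list, function(x) na.omit(read.csv(x)))
  # keep only data frames with more than `threshold` complete rows
  abovelist <- dflist[sapply(dflist, nrow) > threshold]
  # correlate sulfate with nitrate within each remaining data frame
  correlations <- lapply(abovelist, function(x) cor(x$sulfate, x$nitrate))
  return(correlations)
}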
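
If a plain numeric vector is more convenient than a list, vapply can stand in for the last lapply. The following is only a sketch: corr_vec is a hypothetical name, the pattern "\\.csv$" assumes every data file ends in .csv, and the demo files are made-up data written to a temporary directory.

```r
# Hypothetical variant of corr() that returns a numeric vector
corr_vec <- function(directory, threshold = 0) {
  # assumption: the data files are the .csv files directly in `directory`
  files <- list.files(path = directory, pattern = "\\.csv$", full.names = TRUE)
  dflist <- lapply(files, function(f) na.omit(read.csv(f)))
  above <- dflist[vapply(dflist, nrow, integer(1)) > threshold]
  vapply(above, function(x) cor(x$sulfate, x$nitrate), numeric(1))
}

# Demo with made-up data written to a temporary directory
demo_dir <- file.path(tempdir(), "corr_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, 2, 3, NA), nitrate = c(2, 4, 7, 1)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(NA, 5), nitrate = c(3, NA)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

# 001.csv keeps 3 complete rows; 002.csv keeps none and is dropped
res <- corr_vec(demo_dir, threshold = 2)
```

vapply checks that each result is a single number, so a file with an unexpected shape fails loudly instead of silently changing the return type.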

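Just a tip: you can find plenty of earlier discussion of these Coursera questions; I think "pollutantmean" is the keyword to search for. That said, I think this question is different from the "pollutantmean" ones.

Thank you, this works. I really appreciate your help.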
corr("C:/Users/jonah/Documents/R work/R Coursera Course/specdata")