R: How do I apply a correlation to specified vectors while iterating over a list of files?
I need to take a list of data frames and apply cor() to the same two columns in each data frame, returning a list of correlation values. Here is my function so far:
corr <- function(directory, threshold = 0){
#reads directory of files
file_list <- list.files(path = "C:/Users/jonah/Documents/R work/R Coursera Course/specdata")
# takes file_list and makes each file into dataframe
dflist <- lapply(file_list, read.csv)
# returns list of files, na rows stripped
nolist <- lapply(dflist, na.omit)
# removes all with nrows < threshold
abovelist <- nolist[sapply(nolist, function(x) nrow(x) > threshold)]
# runs correlation of nitrate, sulfate on remaining
correlations <- lapply(abovelist, cor(abovelist$sulfate, abovelist$nitrate))
}
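You can use an anonymous function inside lapply to refer to each object, the same way you did with sapply. In your version, cor(abovelist$sulfate, abovelist$nitrate) is evaluated once on the list itself rather than on each data frame, and a list has no sulfate or nitrate element.
Try this: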
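corr <- function(directory, threshold = 0){
  # list the files with their full paths so read.csv can find them
  file_list <- list.files(path = directory, full.names = TRUE)
  # read each file and drop rows that contain NA
  dflist <- lapply(file_list, function(x) na.omit(read.csv(x)))
  # keep only data frames with more than `threshold` complete rows
  abovelist <- dflist[sapply(dflist, nrow) > threshold]
  # correlate sulfate and nitrate within each remaining data frame
  correlations <- lapply(abovelist, function(x) cor(x$sulfate, x$nitrate))
  return(correlations)
}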
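To see the anonymous-function pattern without touching the file system, here is a minimal sketch on three small in-memory data frames. The data frames and the names dfs, complete, keep and above are made up for illustration and are not part of the specdata set; vapply is used in place of lapply only so that the result is a plain numeric vector rather than a list.

```r
# Hypothetical data frames standing in for the CSV files
dfs <- list(
  data.frame(sulfate = c(1, 2, 3, 4),  nitrate = c(2, 4, 6, 8)),
  data.frame(sulfate = c(1, 2, 3, NA), nitrate = c(3, 2, 1, 5)),
  data.frame(sulfate = c(1, NA),       nitrate = c(NA, 2))
)
threshold <- 2

# Drop incomplete rows in every data frame
complete <- lapply(dfs, na.omit)

# Keep only data frames with more than `threshold` complete rows
keep  <- vapply(complete, nrow, integer(1)) > threshold
above <- complete[keep]

# The anonymous function receives one data frame at a time as `df`
correlations <- vapply(above, function(df) cor(df$sulfate, df$nitrate), numeric(1))
print(correlations)
```

The first data frame is perfectly positively correlated by construction, the second is perfectly negatively correlated once its NA row is dropped, and the third has no complete rows, so it is filtered out before cor is called.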
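Just a hint: there is a lot of earlier discussion of these Coursera problems, and I think "pollutantmean" is the keyword to search for. That said, I think this question is different from the "pollutantmean" one.
Thank you, this works. I really appreciate your help.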
corr("C:/Users/jonah/Documents/R work/R Coursera Course/specdata")