R: reading specific data files by year inside a function


I am trying to write a function that reads specific data files.

I have the following:

files <- list.files("folder/destinationfolder", recursive = TRUE, pattern = "\\.csv$")
files <- file.path("C:/", "folder", files)

"C:/folder/folder/destinationfolder/2005/file_2005.csv"
"C:/folder/folder/destinationfolder/2006/file_2006.csv"
"C:/folder/folder/destinationfolder/2007/file_2007.csv"
"C:/folder/folder/destinationfolder/2008/file_2008.csv"
"C:/folder/folder/destinationfolder/2009/file_2009.csv"

Next, I can read these files by doing the following:

library(data.table)  # provides fread()

readdata <- function(fn){
  dt_temp <- fread(fn, sep=",")
  return(dt_temp)
}

mylist <- lapply(files, readdata)

df <- plyr::ldply(mylist, data.frame)

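Extracting the years from the file paths with my extract_years function (which uses stringr) produces the following output: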

"2005" "2006" "2007" "2008" "2009"

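So I want to read in the file for one year and the file for the year before it, process that data, then read in the next pair of files and process those, and so on.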
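The pairing described above can be sketched in base R. This is only an illustration: the `years` vector is typed in by hand, and `year_pairs` and `yrs_int` are hypothetical names that do not appear in the original code.

```r
# Years available on disk (assumed here; in practice taken from the file names).
years <- c("2005", "2006", "2007", "2008", "2009")

# Sort numerically, then pair each year with the one before it.
yrs_int <- sort(as.integer(years))
year_pairs <- data.frame(
  year      = yrs_int[-1],               # every year except the first
  prev_year = yrs_int[-length(yrs_int)]  # every year except the last
)

# Keep only pairs that are truly consecutive, in case a year is missing.
year_pairs <- year_pairs[year_pairs$year - year_pairs$prev_year == 1L, ]
print(year_pairs)
```

Each row of `year_pairs` then names the two files that need to be in memory at the same time.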

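Edit:

I think what I need to do is add a line to the readdata function that greps the file path "C:/folder/folder/destinationfolder/2009/file_2009.csv" and replaces the year in it with the year passed to the function, and then does the same again with year-1. So the readdata function might look something like this: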
readdata <- function(fn){
# Grep the file path and replace the year with the year in the function
# Grep the file path again and replace the year with `t-1`
  dt_temp <- fread(fn, sep=",") # read in these two data files
  return(dt_temp)
}
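One way to fill in the two commented steps is a plain string substitution. The sketch below assumes that every four-digit number in the path is the year, which holds for the example paths but may not hold elsewhere; `path_for_year` and `template` are hypothetical names.

```r
# Hypothetical helper: swap the four-digit year in a template path for `year`.
# Assumes every four-digit run of digits in the path is the year.
path_for_year <- function(template, year) {
  gsub("[0-9]{4}", as.character(year), template)
}

template <- "C:/folder/folder/destinationfolder/2009/file_2009.csv"
year <- 2007L

path_for_year(template, year)       # path for the requested year
path_for_year(template, year - 1L)  # path for the year before it
```

The two resulting paths could then be passed to fread inside readdata.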

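If I understand the question correctly, the function fucn below loads two years and returns the two data frames in a named list. The list members are named after their respective years.

I have also simplified the function extract_years so that it no longer needs the stringr package, only base R.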
library(data.table)  # provides fread()

# Pull the digits between the last "_" and the file extension.
extract_years <- function(ex_years){
  sub("^.*_([[:digit:]]+)\\..*$", "\\1", ex_years)
}

# Read the files for `years` and the following year from the global `files`.
fucn <- function(years){
  year1 <- as.integer(years)
  year2 <- year1 + 1L
  file1 <- grep(year1, files, value = TRUE)
  file2 <- grep(year2, files, value = TRUE)
  dt_temp1 <- fread(file1, sep = ",")
  dt_temp2 <- fread(file2, sep = ",")
  res <- list(dt_temp1, dt_temp2)  # one table per year
  names(res) <- c(year1, year2)
  res
}

yrs <- extract_years(files)
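Note that fucn passes whatever grep returns straight to fread, so it would likely fail for a year whose neighbour has no file (2010 in this example). A stricter lookup could be sketched as follows; `find_year_file` is a hypothetical helper, and the `files` vector is retyped here only so the snippet runs on its own.

```r
files <- c(
  "C:/folder/folder/destinationfolder/2005/file_2005.csv",
  "C:/folder/folder/destinationfolder/2006/file_2006.csv",
  "C:/folder/folder/destinationfolder/2007/file_2007.csv",
  "C:/folder/folder/destinationfolder/2008/file_2008.csv",
  "C:/folder/folder/destinationfolder/2009/file_2009.csv"
)

# Hypothetical helper: return the single path ending in _<year>.csv,
# or NULL when no file (or more than one file) matches.
find_year_file <- function(year, files) {
  hit <- grep(paste0("_", year, "\\.csv$"), files, value = TRUE)
  if (length(hit) == 1L) hit else NULL
}

find_year_file(2009, files)  # the 2009 path
find_year_file(2010, files)  # NULL, because there is no file for 2010
```

A caller can check for NULL and skip that year instead of letting the read fail.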

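Why not lapply(files, data.table::fread, sep = ",")? Does that cause memory problems? Do you need the full contents of the csv files? If you must keep the data as is, have a look at Sys.sleep() to help with the memory issues.

I have 10 years of data to process and I do not need all of those years. My original code loads every year, and later on I run into memory problems because this list of data takes up quite a lot of space. What I need is to load two years at a time, say 2005 and 2004, process those years, and then move on to the next years.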
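A rough sketch of that two-years-at-a-time loop is shown below. Everything in it is illustrative: the CSV files are small made-up stand-ins written to a temporary directory, the "processing" is a placeholder, and read.csv is used only so the snippet runs without extra packages (data.table::fread could be swapped in).

```r
# Create small stand-in CSV files in a temporary directory (made-up data).
tmp <- file.path(tempdir(), "year_demo")
dir.create(tmp, showWarnings = FALSE)
years <- 2005:2008
for (y in years) {
  write.csv(data.frame(year = y, value = 1:3 * (y - 2000)),
            file.path(tmp, paste0("file_", y, ".csv")), row.names = FALSE)
}

# Process one (year, year - 1) pair at a time and keep only a small summary.
results <- list()
for (y in years[-1]) {
  cur  <- read.csv(file.path(tmp, paste0("file_", y, ".csv")))
  prev <- read.csv(file.path(tmp, paste0("file_", y - 1L, ".csv")))
  # Placeholder processing step: change in the mean of `value`.
  results[[as.character(y)]] <- mean(cur$value) - mean(prev$value)
  rm(cur, prev)  # drop both tables before loading the next pair
}
unlist(results)
```

Because only the summary is stored in `results`, at most two tables are held at once, rather than the full list of all years.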