How do I create a large number of objects in R and store them in separate CSVs?
I have been working on this all day. Everything I have found so far explains how to loop over existing CSVs in a directory and do something with them, but I can't find how to solve my specific problem.

What I want to do is:
1. Split a data frame into several parts.
2. For each part, write it to a CSV named data_i, where i is the loop index.
3. For each part, run a matching model from MatchIt.
4. For each part and matching model, get the matched data.
5. Save the matched data to a file named matched_data_i.csv, where i is again the loop index.
6. Finish the loop by removing data_i and the matching model.

Here is some code that doesn't work, but shows where I am trying to get to:
library(tidyverse)
library(MatchIt)
data("mtcars")
View(mtcars)
n <- 10
nr <- nrow(mtcars)
splitter <- split(mtcars, rep(1:ceiling(nr/n), each=n, length.out=nr))
for(i in splitter){
write.csv(splitter[i], file = paste0(data_i)) ## this is a part I need help on, how do i name each CSV according to its loop index?
### how do i name each object mod_match_[i] where i is the index of the loop?
mod_match_[i] = matchit(am ~ mpg + wt, method = "nearest", data = as.data.frame(splitter[i])) ##I think it is a data frame anyway but doesn't hurt to be sure since matchit falls over when exposed to tibbles (from experience)
matched_data_[i] = match.data(mod_match_[i]) ### again i don't know how to make the name of this object change depending on which "i" we're up to
write.csv(matched_data_[i], file = "matched_data_[i].csv") ## how can i save each one as a separate CSV with a name referring to the index?
## i want to remove the objects before repeating the loop
rm(mod_match_[i])
rm(matched_data_[i])
}
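Consider encapsulating the process in a defined function, so that the objects live in the function's scope and never need to be named per index or removed from the environment. Also, build the file names with paste, or its no-separator wrapper paste0, applied to the column used for splitting. Two equivalent alternatives follow, one using lapply and one using by, both calling a user-defined function proc_match.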
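The body of `proc_match` is not reproduced in this copy of the answer. A minimal sketch of what such a helper could look like, assuming a `grp` column that holds the chunk index and the `am ~ mpg + wt` model from the question; the `out_dir` argument is hypothetical and added here only for illustration:

```r
library(MatchIt)

# Hypothetical helper: handles one chunk and keeps every intermediate
# object local, so nothing has to be named per index or removed with rm().
proc_match <- function(sub, out_dir = ".") {
  i <- sub$grp[1]  # chunk index, read from the grouping column

  # Save the raw chunk as data_<i>.csv
  write.csv(sub, file.path(out_dir, paste0("data_", i, ".csv")),
            row.names = FALSE)

  # Fit the matching model and extract the matched rows
  mod_match <- matchit(am ~ mpg + wt, method = "nearest", data = sub)
  matched_data <- match.data(mod_match)

  # Save the matched rows as matched_data_<i>.csv
  write.csv(matched_data,
            file.path(out_dir, paste0("matched_data_", i, ".csv")),
            row.names = FALSE)

  invisible(NULL)  # return nothing; locals are discarded on exit
}
```

Because `mod_match` and `matched_data` exist only inside the function, they disappear after each call, which covers the "remove the objects before repeating" step. On `mtcars` with `n <- 10` the last chunk holds only two rows, both with `am == 1`, so the matching step may stop with an error for that chunk.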
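Many thanks to Parfait for pointing me in the right direction, especially the use of paste. What I really needed to work around was that split() seemed to rename the columns of each data frame in the list, which broke the code and made it return NULL. Here is another solution that is closer to the code I need in the wild, i.e. not on a small dataset like lalonde. Hopefully it is useful to someone in the future.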
## packages
library(tidyverse)
library(MatchIt)
## data
data("lalonde")
## shuffle the rows: lalonde is sorted by treatment, so the matching would fail for some subsets
lalonde2 <- lalonde[sample(nrow(lalonde)), ]
## set the size of each subset
n <- 30
nr <- nrow(lalonde2)
## make the subsets
splitter <- split(lalonde2, rep(1:ceiling(nr/n), each = n, length.out = nr))
## write each subset to file; splitter[[i]] returns the data frame with its
## original column names, so the columns do not need to be renamed first
for (i in seq_along(splitter)) {
  write.csv(splitter[[i]], file = paste0("data_", i, ".csv"))
}
## remove the big one
rm(splitter)
## read each saved file back, run the matching model, and write the matched data to file
for (i in 1:ceiling(nr/n)) {
  chunk <- read.csv(paste0("data_", i, ".csv"))
  matched <- match.data(matchit(treat ~ age + educ, method = "nearest", data = chunk, ratio = 1))
  write.csv(matched, file = paste0("matched_data_", i, ".csv"))
  ## remove the data after each iteration
  rm(chunk, matched)
}
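A small check, using `mtcars` as a stand-in, suggests that the renamed columns come from single-bracket indexing (as in `as.data.frame(splitter[i])` in the question) rather than from `split()` itself. The variable names here are illustrative only:

```r
chunks <- split(mtcars, rep(1:4, each = 10, length.out = nrow(mtcars)))

# Double brackets return the data frame itself, with its names untouched
names_double <- names(chunks[[1]])

# Single brackets return a one-element list; converting that list to a
# data frame prefixes every column with the list element's name
names_single <- names(as.data.frame(chunks[1]))

head(names_double, 3)
head(names_single, 3)
```

With prefixed names, a formula such as `am ~ mpg + wt` no longer finds its columns in the chunk, which would explain the failures described above.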
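Thank you. I actually need the solution to let me delete each split of the data once it has been saved. That is why I tried to do it in a loop.
Then call lapply(split(...)) without assigning the result to any object. No split list is created, and the return line can be removed. See the edit.
This works quite well. Interestingly, the matching model seems to fail on every subset of any dataset, but this code is fine.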
# n AND nr ARE THE CHUNK SIZE AND ROW COUNT DEFINED IN THE QUESTION
# OPTION 1: lapply OVER split()
# ADD NEW GROUPING COLUMN
mtcars$grp <- rep(1:ceiling(nr/n), each=n, length.out=nr)
# RUN PROCESS TO RETURN NOTHING
lapply(split(mtcars, mtcars$grp), proc_match)
# OPTION 2: by()
# ADD NEW GROUPING COLUMN
mtcars$grp <- rep(1:ceiling(nr/n), each=n, length.out=nr)
# RUN PROCESS TO RETURN NOTHING
by(mtcars, mtcars$grp, proc_match)
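Since the thread mentions that matching fails on some subsets, one option is to wrap the per-chunk call in `tryCatch` so that a single bad chunk does not stop the run. This is only a sketch: `run_safely` and `check_chunk` are hypothetical names, and `check_chunk` stands in for the real per-chunk step so the example runs without writing any files.

```r
# Hypothetical wrapper: apply `fun` to every chunk and record the outcome
# instead of letting the first error abort the whole run.
run_safely <- function(chunks, fun) {
  vapply(names(chunks), function(nm) {
    tryCatch({
      fun(chunks[[nm]])
      "ok"
    }, error = function(e) paste("failed:", conditionMessage(e)))
  }, character(1))
}

# Stand-in for the real per-chunk step: it fails when the treatment column
# has a single distinct value, one way a matching step can break.
check_chunk <- function(sub) {
  if (length(unique(sub$am)) < 2) stop("only one treatment group in chunk")
  invisible(NULL)
}

grp <- rep(1:ceiling(nrow(mtcars) / 10), each = 10, length.out = nrow(mtcars))
status <- run_safely(split(mtcars, grp), check_chunk)
status
```

Passing the real per-chunk function in place of `check_chunk` would give a named status vector showing which chunk indices need a closer look.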