Merging .csv files with R

I have 3 files, each containing 3 variables: date, ID, and price. I want to merge them by date, so if one of my current files is:

date      ID  Price
01/01/10   A   1
01/02/10   A   1.02
01/02/10   A   0.99
...
...

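I would like to get a merged file with one row per date and one price column for each of the IDs A, B and C (Pr stands for price).

Note that for some dates there is no price, so the value is NA in that case.

My current approach works, but I find it a bit clumsy: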

setwd('~where you put the files')
library(plyr)
listnames = list.files(pattern='\\.csv$') # pattern is a regex: match names ending in .csv
pp1 = ldply(listnames,read.csv,header=TRUE) #put all the files in a data.frame

names(pp1)=c('date','ID','price')
pp1$date = as.Date(pp1$date,format='%m/%d/%Y')

# Reshape data frame so it gets organized by date
pp1=reshape(pp1,timevar='ID',idvar='date',direction='wide')

Can you think of a better way?

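This looks like a job for Reduce():

# Read the files into a single list, removing the unwanted second column (ID) from each.
dataDir <- "example"
fNames <- dir(dataDir)
dataList <- lapply(file.path(dataDir, fNames),
                   function(X) {read.csv(X, header=TRUE)[-2]})

# Merge them pairwise on the first column (date), keeping unmatched rows
out <- Reduce(function(x,y) merge(x,y, by=1, all=TRUE), dataList)

# Construct column names
names(out)[-1] <- paste("Pr.", toupper(sub("1.csv", "", fNames)), sep="")
out
#       date Pr.A Pr.B Pr.C
# 1 1/1/2010 1.00   NA   NA
# 2 1/2/2010 1.02 1.20   NA
# 3 1/3/2010 0.99 1.30    1
# 4 1/4/2010   NA 1.23    2
# 5 1/5/2010   NA   NA    3

One note: the linked file "a1.csv" contains several comma-separated rows with no data. I deleted them by hand rather than clutter the R code in the answer.

Actually, I think what you did with reshape is a good choice. How does Reduce perform in terms of speed?

@PaulHiemstra: Not very well, I'd guess (since it probably creates a new data.frame for each merge operation). I don't really know, but if speed had been a concern in the question, I would not have suggested Reduce.

Interesting choice, using Reduce. Doesn't R have this kind of functional programming approach built in now?

I can't access those files; I'm behind a corporate firewall. Once the data.frame is built, I would use the cast method:

    library(reshape)  # provides cast()
    res = cast(pp1,date~ID,value="price",mean)  # 'price' as renamed in the question's code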
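
For readers without the original files, the two approaches discussed above (pairwise merge via Reduce, and stacking then reshaping once) can be compared on a small in-memory example. This is only a sketch. The three data frames are made-up stand-ins for the CSV files, with values chosen to mirror the output shown in the Reduce answer, and the names `a`, `b`, `c3`, `files`, `prepped`, `merged`, `long` and `wide` are hypothetical. It uses base R only.

```r
# Made-up stand-ins for the three CSV files (not the original data).
a  <- data.frame(date = c("1/1/2010", "1/2/2010", "1/3/2010"),
                 ID = "A", Price = c(1.00, 1.02, 0.99))
b  <- data.frame(date = c("1/2/2010", "1/3/2010", "1/4/2010"),
                 ID = "B", Price = c(1.20, 1.30, 1.23))
c3 <- data.frame(date = c("1/3/2010", "1/4/2010", "1/5/2010"),
                 ID = "C", Price = c(1, 2, 3))
files <- list(A = a, B = b, C = c3)

# Approach 1: pairwise full outer joins on date, one merge per file.
# Each data frame is cut down to (date, Pr.<ID>) first so column names stay unique.
prepped <- lapply(names(files), function(id) {
  d <- files[[id]][, c("date", "Price")]
  names(d)[2] <- paste0("Pr.", id)
  d
})
merged <- Reduce(function(x, y) merge(x, y, by = "date", all = TRUE), prepped)

# Approach 2: stack all rows once, then reshape to wide in a single pass.
long <- do.call(rbind, files)
wide <- reshape(long, timevar = "ID", idvar = "date", direction = "wide")
rownames(wide) <- NULL
names(wide) <- sub("^Price\\.", "Pr.", names(wide))

print(merged)
print(wide)
```

On this toy input both approaches produce the same five-row table, with NA wherever an ID has no price for a date. The second approach handles the data in one pass, which speaks to the concern raised in the comments about Reduce building a new data.frame for every merge. No timing claims are made here.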