Improving RODBC-Postgres write performance

r performance

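I recently started using RODBC to connect to PostgreSQL. I've found that read performance is similar between RODBC and RPostgreSQL, but write performance is not. For example, using RODBC (where z is a ~6.1M-row data frame):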

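library(RODBC)
con <- ...

There seems to be no immediate answer to this, so I'll post a kludgy workaround in case it helps anyone.

Sharpie is correct: COPY FROM is by far the fastest way to get data into Postgres. Based on his suggestion, I hacked together a function that gives a significant performance boost over RODBC::sqlSave(). For example, writing a 1.1 million row (24 column) data frame took 960 seconds (elapsed) via sqlSave vs 69 seconds using the function below. I wouldn't have expected this, since the data are written once to disk and then again to the database.
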
library(RODBC)
con <- odbcConnect("PostgreSQL90")

#create the table
createTab <- function(dat, datname) {

  #make an empty table, saving the trouble of making it by hand
  res <- sqlSave(con, dat[1, ], datname)
  res <- sqlQuery(con, paste("TRUNCATE TABLE",datname))

  #write the dataframe
  outfile = paste(datname, ".csv", sep = "")
  write.csv(dat, outfile)
  gc()   # don't know why, but memory is 
         # not released after writing large csv?

  # now copy the data into the table.  If this doesn't work,
  # be sure that postgres has read permissions for the path
  sqlQuery(con,  
  paste("COPY ", datname, " FROM '", 
    getwd(), "/", datname, 
    ".csv' WITH NULL AS 'NA' DELIMITER ',' CSV HEADER;", 
    sep=""))

  unlink(outfile)
}

odbcClose(con)
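
The COPY statement in the function above is assembled with paste() from the table name and the working directory. As a minimal sketch (the helper name build_copy_sql and its arguments are hypothetical, not part of RODBC), the same string can be built from an explicit file path, with any single quotes in the path doubled so the SQL string literal stays valid:

```r
# Hypothetical helper: builds the same COPY statement as createTab(),
# but from an explicit CSV path instead of getwd().
build_copy_sql <- function(datname, csv_path) {
  # Double any single quotes so the path is a valid SQL string literal
  safe_path <- gsub("'", "''", csv_path, fixed = TRUE)
  paste0("COPY ", datname, " FROM '", safe_path,
         "' WITH NULL AS 'NA' DELIMITER ',' CSV HEADER;")
}

# Example with made-up table and file names
sql <- build_copy_sql("mytable", "/tmp/mytable.csv")
print(sql)
```

The resulting string could then be passed to sqlQuery(con, sql). Because this form of COPY is executed by the server, the file has to be at a path the Postgres server process can read, as the comment in the original function points out.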

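In my limited experience with stuffing large amounts of data into Postgres, switching from INSERT to COPY was necessary for acceptable performance. Not sure whether it's possible to open a connection and then write the CSV file to it, to avoid the extra write...

Timing the write of z with RPostgreSQL: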
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname="gisdb", user="postgres", password="...")
system.time(dbWriteTable(con, "ERASE222", z))

user  system elapsed 
467.57   56.62  668.29 

dbDisconnect(con)
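
To put the timings quoted in this thread on a common scale, a quick sketch converts them to rows per second. The row counts are the approximate figures given in the text (~6.1M and 1.1 million), so the results are rough, and the first row refers to a different data frame than the other two. The variable names are illustrative.

```r
# Approximate figures quoted in the thread (row counts are rounded)
timings <- data.frame(
  method  = c("RPostgreSQL dbWriteTable", "RODBC sqlSave", "CSV + COPY"),
  rows    = c(6.1e6, 1.1e6, 1.1e6),
  elapsed = c(668.29, 960, 69)   # elapsed seconds
)

# Throughput in rows per second
timings$rows_per_sec <- round(timings$rows / timings$elapsed)
print(timings)

# Speedup of the COPY workaround over sqlSave on the same 1.1M-row data frame
speedup <- 960 / 69
print(round(speedup, 1))
```

On these numbers the CSV-plus-COPY route is roughly 14 times faster than sqlSave for the same data frame.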
