Node.js: Inserting a large number of rows into a Postgres database using NodeJS


I am trying to insert more than 1 million rows into a Postgres table using NodeJS. The problem is that when I start the script, memory usage keeps growing until it reaches about 1.5 GB of RAM, and then I get the error:

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory

The result is always the same: roughly 7,000 rows inserted instead of 1 million.

Here is the code:

var pg = require('pg');
var fs = require('fs');
var config = require('./config.js');



var PgClient = new pg.Client(config.pg);
PgClient.connect();

var lineReader = require('readline').createInterface({
      input: require('fs').createReadStream('resources/database.csv') //file contains over 1 million lines
    });
var n=0;




lineReader.on('line', function(line) {
      n++;
      var insert={"firstname":"John","lastname":"Conor"};

      //No matter what data we insert, the point is that the number of inserted rows much less than it should be 
      PgClient.query('INSERT INTO HUMANS (firstname,lastname) values ($1,$2)', [insert.firstname,insert.lastname]);

});

lineReader.on('close',function() {
     console.log('end '+n); 
});

So I solved the problem. There is a PgClient.queryQueue that is processed much more slowly than the file is read, so when a big file is read the queue overflows. The fix is to change the lineReader.on('line', cb) part: whenever there are too many elements in the queue, we pause the lineReader.

lineReader.on('line', function(line) {
      n++;
      var insert={"firstname":"John","lastname":"Conor"};
      PgClient.query('INSERT INTO HUMANS (firstname,lastname) values ($1,$2)', [insert.firstname,insert.lastname],function (err,result){
          if (err) console.log(err);
          if (PgClient.queryQueue.length>15000) {
              lineReader.pause(); 
          }
          else lineReader.resume(); 
      });
});
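Note that queryQueue is an internal, undocumented property of the pg Client, so it may not exist or may behave differently in other versions of node-postgres. A minimal sketch of the same backpressure idea without touching internals, assuming we simply count in-flight queries ourselves (the pending counter and both thresholds are illustrative, not from the original answer):

var pending = 0; // queries sent to Postgres but not yet acknowledged

lineReader.on('line', function(line) {
      n++;
      var insert = {"firstname":"John","lastname":"Conor"};
      pending++;
      PgClient.query('INSERT INTO HUMANS (firstname,lastname) values ($1,$2)',
          [insert.firstname, insert.lastname],
          function (err, result) {
              if (err) console.log(err);
              pending--;
              // resume reading once the backlog has drained far enough
              if (pending < 1000) lineReader.resume();
          });
      // stop reading while too many queries are still waiting on the server
      if (pending > 15000) lineReader.pause();
});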

I then used pg-promise, as vitaly-t suggested. This code runs really fast:

const fs = require('fs');
const pgp = require('pg-promise')();
const config = require('./config.js');

// Db connection
const db = pgp(config.pg);

// Transform a lot of inserts into one
function Inserts(template, data) {
    if (!(this instanceof Inserts)) {
        return new Inserts(template, data);
    }
    this._rawType = true;
    this.toPostgres = () => {
        return data.map(d => '(' + pgp.as.format(template, d) + ')').join();
    };
}

// insert Template
function Insert() {
      return {
          firstname:   null,
          lastname:    null,
          birthdate:     null,
          phone:    null,
          email:   null,
          city: null,
          district:    null,
          location: null,
          street: null
      };
}
const lineReader = require('readline').createInterface({
      input: require('fs').createReadStream('resources/database.csv')
    });


let n = 0;
let InsertArray = []; // let, not const: the array is re-assigned after each batch

lineReader.on('line', function(line) {   
      var insert = new Insert();
      n ++;   
      var InsertValues=line.split(',');
      if (InsertValues[0]!=='"Firstname"'){ //skip first line
          let i = 0;
          for (let prop in insert){
              insert[prop] = (InsertValues[i]=='')?insert[prop]:InsertValues[i];
              i++;
          }
          InsertArray.push(insert);
          if (n == 10000){
              lineReader.pause();
              // convert insert array into one insert
              const values = new Inserts('${firstname}, ${lastname},${birthdate},${phone},${email},${city},${district},${location},${street}', InsertArray);
              db.none('INSERT INTO users (firstname, lastname,birthdate,phone,email,city,district,location,street) VALUES $1', values)
                .then(data => {
                    n = 0;
                    InsertArray=[];
                    lineReader.resume();
                })
                .catch(error => {
                    console.log(error);
                });
          }
      }
});


lineReader.on('close',function() {
     console.log('end '+n); 
     //last insert
     if (n > 0) {
         const values = new Inserts('${firstname}, ${lastname},${birthdate},${phone},${email},${city},${district},${location},${street}', InsertArray);
         db.none('INSERT INTO users (firstname, lastname,birthdate,phone,email,city,district,location,street) VALUES $1', values)
            .then(data => {
                console.log('Last');
            })
            .catch(error => {
                console.log(error);
            });
     }
});

Comments:

- Have you tried pausing after receiving a line and resuming after the callback for the query has been called? I think there are too many queued queries, and that may be exhausting the process memory.
- I tried lineReader.pause() before the query and lineReader.resume() after the query, but it doesn't seem to help. Same error.
- Consider doing batch inserts instead. Inserting row by row is too expensive.
- @m3n1at Do you mean you added a callback to the PgClient.query() call, in which you call lineReader.resume()?
- @mscdex Yes, I did that. Same problem: FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory.
- This is a bad solution; the correct one is trivial: read about 1,000-10,000 rows at a time from the file and insert each read as one batch insert. You also need to concatenate the inserts - see . Best example: . I have updated the code example to comply with the latest pg-promise v6.5.0.