JavaScript error: Cannot create a string longer than 0x3fffffe7 characters when using createReadStream
I am parsing very large CSV files (~37 GB) using fs.createReadStream and csv-parser. I collect the rows into batches of 5000 and then insert each batch into MongoDB. However, the error occurs even with the MongoDB part commented out. Here is the function that parses the file:
import * as fs from 'fs';
import csvParser from 'csv-parser'; // default import assumes esModuleInterop is enabled
import { Db } from 'mongodb';

// parseData is defined elsewhere (not shown).
function parseCsv(fileName: string, db: Db): Promise<void> {
  let parsedData: any[] = [];
  let counter = 0;
  return new Promise((resolve, reject) => {
    const stream = fs.createReadStream(fileName)
      .pipe(csvParser())
      .on('data', async (row) => {
        const data = parseData(row);
        parsedData.push(data);
        if (parsedData.length > 5000) {
          // Pause the parser while the current batch is handled
          stream.pause();
          // insert to mongo
          counter++;
          console.log('counter - ', counter, parsedData[0].personfirstname, parsedData[23].personfirstname);
          parsedData = [];
          // try {
          //   await db.collection('people').insertMany(parsedData, { ordered: false });
          //   parsedData = [];
          // }
          // catch (e) {
          //   console.log('error happened', e, parsedData.length);
          //   process.exit();
          // }
          stream.resume();
        }
      })
      .on('error', (error) => {
        console.error('There was an error reading the csv file', error);
        // Reject so the caller does not wait forever on a failed stream
        reject(error);
      })
      .on('end', () => {
        console.log('CSV file successfully processed');
        resolve();
      });
  });
}
It parses about 2 million rows and then hangs. After it had hung all night, I checked in the morning and found the following error:
buffer.js:580
if (encoding === 'utf-8') return buf.utf8Slice(start, end);
^
Error: Cannot create a string longer than 0x3fffffe7 characters
at stringSlice (buffer.js:580:44)
at Buffer.toString (buffer.js:643:10)
at CsvParser.parseValue (C:\js_scripts\csv-worker\node_modules\csv-parser\index.js:175:19)
at CsvParser.parseCell (C:\js_scripts\csv-worker\node_modules\csv-parser\index.js:86:17)
at CsvParser.parseLine (C:\js_scripts\csv-worker\node_modules\csv-parser\index.js:142:24)
at CsvParser._flush (C:\js_scripts\csv-worker\node_modules\csv-parser\index.js:196:10)
at CsvParser.prefinish (_stream_transform.js:140:10)
at CsvParser.emit (events.js:200:13)
at prefinish (_stream_writable.js:633:14)
at finishMaybe (_stream_writable.js:641:5) {
code: 'ERR_STRING_TOO_LONG'
}
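For reference, 0x3fffffe7 is 1,073,741,799 in decimal, so the parser was trying to turn roughly 1 GiB of buffered bytes into a single string. The trace passes through CsvParser._flush, which a Transform stream calls once its input has ended, so the oversized value appears to be whatever was still sitting in the parser's buffer at that point. One possible cause, not confirmed here, is an unmatched quote that makes the parser treat the rest of the file as one cell. The small sketch below only shows how to read the limit of the running Node.js build; the value differs between Node.js versions and platforms.

```typescript
import { constants } from 'buffer';

// Largest string length this Node.js build can create (varies by version and platform)
const maxStringLength: number = constants.MAX_STRING_LENGTH;

// The limit quoted in the error message
const limitFromError = 0x3fffffe7;

console.log('limit from error message:', limitFromError); // 1073741799
console.log('limit in this runtime:', maxStringLength);
```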
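Shouldn't createReadStream ensure that this does not happen? Each row has 415 columns. Is it possible that a single row is too large? It always stops at the same place, so that seems plausible. Because the files are so large, I have no way of opening them to look. If that is the case, how can I detect it and either skip that row or handle it differently?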
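One way to check the "one row is too large" theory without opening the file is to scan the raw bytes and never build a string from them. The sketch below is an illustration under assumptions, not a confirmed fix: findSuspectLines, SuspectLine, and the thresholds are hypothetical names and values, and it works on physical lines, so a legitimately quoted field that contains a newline will also be reported as having an odd quote count.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

interface SuspectLine {
  line: number;       // 1-based physical line number
  bytes: number;      // line length in bytes, excluding the newline byte
  oddQuotes: boolean; // true when the line contains an odd number of '"' bytes
}

const NEWLINE = 0x0a; // '\n'
const QUOTE = 0x22;   // '"'

// Hypothetical helper: reports lines longer than maxBytes or with an odd quote count.
function findSuspectLines(fileName: string, maxBytes: number): Promise<SuspectLine[]> {
  return new Promise((resolve, reject) => {
    const suspects: SuspectLine[] = [];
    let line = 1;
    let bytes = 0;
    let quotes = 0;

    // Record the current line if it looks suspicious, then reset the counters
    const finishLine = (): void => {
      const oddQuotes = quotes % 2 === 1;
      if (bytes > maxBytes || oddQuotes) {
        suspects.push({ line, bytes, oddQuotes });
      }
      line++;
      bytes = 0;
      quotes = 0;
    };

    fs.createReadStream(fileName)
      .on('data', (chunk: Buffer | string) => {
        // No encoding is set, so chunks arrive as Buffers; the check keeps the types honest
        const buf = typeof chunk === 'string' ? Buffer.from(chunk) : chunk;
        for (let i = 0; i < buf.length; i++) {
          const b = buf[i];
          if (b === NEWLINE) {
            finishLine();
          } else {
            bytes++;
            if (b === QUOTE) quotes++;
          }
        }
      })
      .on('error', reject)
      .on('end', () => {
        // Account for a final line that has no trailing newline
        if (bytes > 0) finishLine();
        resolve(suspects);
      });
  });
}

// Demo on a tiny sample file; for the real file, pass its path and a far larger threshold
const sample = path.join(os.tmpdir(), 'suspect-lines-sample.csv');
fs.writeFileSync(sample, 'id,name\n1,"unterminated\n2,this line is longer than the threshold\n');
findSuspectLines(sample, 20).then((found) => console.log(found));
```

On the sample file this flags line 2 for its unbalanced quote and line 3 for its length. Against the real 37 GB file, a threshold of a few megabytes would probably be enough to spot a runaway row, and the reported line numbers show where to look.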