What's the proper way to handle back-pressure in a Node.js Transform stream?
Intro
This is my first venture into writing server-side Node.js. It's been fun so far, but I'm having some difficulty understanding the proper way to implement something related to Node.js streams.
Problem
For testing and learning purposes, I'm working with large files whose content is zlib-compressed. The compressed content is binary data, and each data packet is 38 bytes long. I'm trying to create a resulting file that looks almost identical to the original, except that there is an uncompressed 31-byte header for every 1024 38-byte packets.
original file content (decompressed)
resulting file content
As you can see, it's somewhat of a translation problem. Meaning, I take some source stream as input and then slightly transform it into some output stream, so implementing a Transform stream seemed natural. The class therefore only attempts to take the stream as input, transform it, and pass each resulting chunk on through the pipeline via
this.push(chunk)
A use case would be something like:
var fs = require('fs');
var me = require('./me'); // Where my Transform stream code sits
var inp = fs.createReadStream('depth_1000000');
var out = fs.createWriteStream('depth_1000000.out');
inp.pipe(me.createMyTranslate()).pipe(out);
Question(s):
Assuming Transform is a good choice for this use case, I seem to be running into a possible back-pressure issue: my call to this.push(chunk) within _transform keeps returning false. Why would that be, and how should such things be handled?

I think Transform is suitable for this, but I would perform the inflate as a separate step in the pipeline. Here's a quick and untested example:
var zlib = require('zlib');
var stream = require('stream');

var transformer = new stream.Transform();

// Properties used to keep internal state of transformer.
transformer._buffers = [];
transformer._inputSize = 0;
transformer._targetSize = 1024 * 38;

// Dump one 'output packet'.
transformer._dump = function(done) {
  // Concatenate buffers and convert to a binary string.
  var buffer = Buffer.concat(this._buffers).toString('binary');

  // Take the first 1024 packets.
  var packetBuffer = buffer.substring(0, this._targetSize);

  // Keep the rest and reset the counter.
  this._buffers = [ Buffer.from(buffer.substring(this._targetSize), 'binary') ];
  this._inputSize = this._buffers[0].length;

  // Output header.
  this.push('HELLO WORLD');

  // Output compressed packet buffer.
  zlib.deflate(packetBuffer, function(err, compressed) {
    // TODO: handle `err`
    this.push(compressed);
    if (done) {
      done();
    }
  }.bind(this));
};

// Main transformer logic: buffer chunks and dump them once the
// target size has been met.
transformer._transform = function(chunk, encoding, done) {
  this._buffers.push(chunk);
  this._inputSize += chunk.length;

  if (this._inputSize >= this._targetSize) {
    this._dump(done);
  } else {
    done();
  }
};

// Flush any remaining buffers, passing the callback through so the
// stream does not finish until the final dump completes.
transformer._flush = function(done) {
  this._dump(done);
};

// Example:
var fs = require('fs');

fs.createReadStream('depth_1000000')
  .pipe(zlib.createInflate())
  .pipe(transformer)
  .pipe(fs.createWriteStream('depth_1000000.out'));
If the stream you're writing to (in this case, the file output stream) buffers too much data, push will return false. Since you're writing to disk, this makes sense: you're processing data faster than you can write it out. When out's buffer is full, the transform stream will fail to push and start buffering data itself. If that buffer fills, then inp's will start to fill up. This is how things should work: the piped streams only process data as fast as the slowest link in the chain can handle it (once your buffers are full).
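To make this throttling visible, here's a small illustrative sketch (not from the answer above; the sizes and delay are arbitrary) in which a deliberately slow Writable causes the transform's push() to start returning false:

var stream = require('stream');

// A pass-through transform that logs whenever push() signals back-pressure.
var meter = new stream.Transform({
  transform: function(chunk, encoding, done) {
    if (!this.push(chunk)) {
      console.log('push() returned false: the downstream buffer is full');
    }
    done();
  }
});

// A deliberately slow writable: one chunk per 100 ms, small buffer.
var slowWriter = new stream.Writable({
  highWaterMark: 1024,
  write: function(chunk, encoding, done) {
    setTimeout(done, 100);
  }
});

// A fast source: ~1 MB written up front. pipe() handles the pausing.
var source = new stream.PassThrough();
source.pipe(meter).pipe(slowWriter);
for (var i = 0; i < 64; i++) {
  source.write(Buffer.alloc(16 * 1024, 0x2e));
}
source.end();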

This question from 2013 is all I was able to find on how to deal with back-pressure when creating node Transform streams.

From the node 7.10.0 Transform stream and Readable stream documentation, what I gathered was that once push returns false, nothing else should be pushed until _read is called.

The Transform documentation doesn't mention _read, except to say that the base Transform class implements it (and _write). I found the information about push returning false and _read being called in the Readable stream documentation.

The only other authoritative comment I found on Transform back-pressure merely mentions it as an issue, in a comment at the top of the node file _stream_transform.js. Here's the section from that comment about back-pressure:
// This way, back-pressure is actually determined by the reading side,
// since _read has to be called to start processing a new chunk. However,
// a pathological inflate type of transform can cause excessive buffering
// here. For example, imagine a stream where every byte of input is
// interpreted as an integer from 0-255, and then results in that many
// bytes of output. Writing the 4 bytes {ff,ff,ff,ff} would result in
// 1kb of data being output. In this case, you could write a very small
// amount of input, and end up with a very large amount of output. In
// such a pathological inflating mechanism, there'd be no way to tell
// the system to stop doing the transform. A single 4MB write could
// cause the system to run out of memory.
//
// However, even in such a pathological case, only a single written chunk
// would be consumed, and then the rest would wait (un-transformed) until
// the results of the previous transformed chunk were consumed.
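To make the pathological case concrete, here is a short sketch of such an inflating transform (my illustration, not part of the quoted comment). Each input byte N fans out into N output bytes, and nothing in _transform slows it down:

var stream = require('stream');

// Pathological inflate: each input byte N produces N output bytes.
// A tiny write can balloon into a huge amount of buffered output,
// because this naive _transform ignores push()'s return value.
var inflater = new stream.Transform({
  transform: function(chunk, encoding, done) {
    for (var i = 0; i < chunk.length; i++) {
      if (chunk[i] > 0) {
        this.push(Buffer.alloc(chunk[i], 0x78)); // 'x' repeated N times
      }
    }
    done();
  }
});

// Writing the 4 bytes {ff,ff,ff,ff} asks inflater to emit just over 1 KB.
inflater.write(Buffer.from([0xff, 0xff, 0xff, 0xff]));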
Solution example
Here's the solution I pieced together to handle back-pressure in a Transform stream, which I'm pretty sure works. (I haven't written any real tests, which would require writing a Writable stream to control the back-pressure.)

This is a rough line transform: it needs work as an actual line transform, but it does demonstrate handling the back-pressure.
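A minimal sketch of such a line transform follows. The names LineTransform and _continueTransform follow the stack trace below, but the body is my illustration, assuming newline-delimited text input: stop pushing when push() returns false, park the remaining work, and resume it from _read.

'use strict';

const stream = require('stream');

class LineTransform extends stream.Transform {
  constructor(options) {
    super(options);
    this._lastLine = '';            // partial line carried between chunks
    this._continueTransform = null; // parked work, resumed from _read
  }

  _transform(chunk, encoding, callback) {
    const lines = (this._lastLine + chunk.toString()).split('\n');
    this._lastLine = lines.pop(); // keep any trailing partial line

    let index = 0;
    const continueTransform = () => {
      while (index < lines.length) {
        if (!this.push(lines[index++] + '\n')) {
          // Back-pressure: park the rest until _read is called.
          this._continueTransform = continueTransform;
          return;
        }
      }
      callback(); // chunk fully consumed; called exactly once
    };
    continueTransform();
  }

  _read(size) {
    const resume = this._continueTransform;
    if (resume) {
      this._continueTransform = null;
      resume();
    } else {
      super._read(size);
    }
  }

  _flush(callback) {
    if (this._lastLine.length > 0) {
      this.push(this._lastLine);
      this._lastLine = '';
    }
    callback();
  }
}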
Test code:

const fs = require('fs');
let inStrm = fs.createReadStream("testdata/largefile.txt", { encoding: "utf8" });
let lineStrm = new LineTransform({ encoding: "utf8", decodeStrings: false });
inStrm.pipe(lineStrm).pipe(process.stdout);
Helpful debugging hint
When I originally wrote this, I didn't realize that _read could be called before _transform returned, so I hadn't implemented that and was getting the following error:
Error: no writecb in Transform class
at afterTransform (_stream_transform.js:71:33)
at TransformState.afterTransform (_stream_transform.js:54:12)
at LineTransform._continueTransform (/userdata/mjl/Projects/personal/srt-shift/dist/textfilelines.js:44:13)
at LineTransform._transform (/userdata/mjl/Projects/personal/srt-shift/dist/textfilelines.js:46:21)
at LineTransform.Transform._read (_stream_transform.js:167:10)
at LineTransform._read (/userdata/mjl/Projects/personal/srt-shift/dist/textfilelines.js:56:15)
at LineTransform.Transform._write (_stream_transform.js:155:12)
at doWrite (_stream_writable.js:331:12)
at writeOrBuffer (_stream_writable.js:317:5)
at LineTransform.Writable.write (_stream_writable.js:243:11)
Looking at the node implementation, I realized that this error means the callback given to _transform was being called more than once. There wasn't much information to be found about this error either, so I thought I'd include what I figured out here.

I recently ran into a similar issue, needing to handle back-pressure in an inflating transform stream. The secret to handling push() returning false is to register and handle the 'drain' event on the stream:
_transform(data, enc, callback) {
  const continueTransforming = () => {
    // ... do some work / parse the data, keep state of where we're at, etc.
    // `event` and `allDone` below are placeholders for that parsing logic.
    if (!this.push(event)) {
      // Will get called again once the reader can consume more data.
      this._readableState.pipes.once('drain', continueTransforming);
      return;
    }
    if (allDone) {
      callback();
    }
  };
  continueTransforming();
}
Note that we're digging into the internals here: pipes can even be an array of Readables, but it works for ...pipe(transform).pipe(...). It would be great if someone from the Node community could suggest a "correct" way to handle .push() returning false.

I ended up following Ledion's example and created a utility Transform class which assists with handling back-pressure. The utility adds an async method named addData, which the implementing transform can await.
'use strict';

const { Transform } = require('stream');

/**
 * The BackPressureTransform class adds a utility method addData which
 * allows for pushing data to the Readable, while honoring back-pressure.
 */
class BackPressureTransform extends Transform {
  constructor(...args) {
    super(...args);
  }

  /**
   * Asynchronously add a chunk of data to the output, honoring back-pressure.
   *
   * @param {String} data
   * The chunk of data to add to the output.
   *
   * @returns {Promise<void>}
   * A Promise resolving after the data has been added.
   */
  async addData(data) {
    // if .push() returns false, it means the readable buffer is full;
    // when this occurs, we must wait for the piped destination to emit
    // the 'drain' event, signalling it is ready for more data
    if (!this.push(data)) {
      await new Promise((resolve, reject) => {
        const errorHandler = error => {
          this.emit('error', error);
          reject(error);
        };
        this._readableState.pipes.on('error', errorHandler);
        this._readableState.pipes.once('drain', () => {
          this._readableState.pipes.removeListener('error', errorHandler);
          resolve();
        });
      });
    }
  }
}

module.exports = {
  BackPressureTransform
};
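Here's an example transform which builds on BackPressureTransform, awaiting addData from within _transform and _flush: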
'use strict';

const { BackPressureTransform } = require('./back-pressure-transform');

/**
 * The Formatter class accepts the transformed row to be added to the output file.
 * The class provides generic support for formatting the result file.
 */
class Formatter extends BackPressureTransform {
  constructor() {
    super({
      encoding: 'utf8',
      readableObjectMode: false,
      writableObjectMode: true
    });
    this.anyObjectsWritten = false;
  }

  /**
   * Called when the data pipeline is complete.
   *
   * @param {Function} callback
   * The function which is called when final processing is complete.
   *
   * @returns {Promise<void>}
   * A Promise resolving after the flush completes.
   */
  async _flush(callback) {
    // if any object is added, close the surrounding array
    if (this.anyObjectsWritten) {
      await this.addData('\n]');
    }
    callback(null);
  }

  /**
   * Given the transformed row from the ETL, format it to the desired layout.
   *
   * @param {Object} sourceRow
   * The transformed row from the ETL.
   *
   * @param {String} encoding
   * Ignored in object mode.
   *
   * @param {Function} callback
   * The callback function which is called when the formatting is complete.
   *
   * @returns {Promise<void>}
   * A Promise resolving after the row is transformed.
   */
  async _transform(sourceRow, encoding, callback) {
    // before the first object is added, surround the data as an array;
    // between each object, add a comma separator
    await this.addData(this.anyObjectsWritten ? ',\n' : '[\n');

    // update state
    this.anyObjectsWritten = true;

    // add the object to the output
    const parsed = JSON.stringify(sourceRow, null, 2).split('\n');
    for (const [index, row] of parsed.entries()) {
      // prepend the row with 2 additional spaces since we're inside a larger array
      await this.addData(`  ${row}`);

      // add line breaks except for the last row
      if (index < parsed.length - 1) {
        await this.addData('\n');
      }
    }
    callback(null);
  }
}

module.exports = {
  Formatter
};
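A minimal usage sketch (mine, not the answer's; the module path, rows, and file name are illustrative) driving the object-mode Formatter:

'use strict';

const fs = require('fs');
const { Readable } = require('stream');
const { Formatter } = require('./formatter'); // hypothetical path to the class above

// Feed a few rows through the Formatter and write the JSON array to disk.
const rows = [{ id: 1, name: 'a' }, { id: 2, name: 'b' }];
Readable.from(rows)
  .pipe(new Formatter())
  .pipe(fs.createWriteStream('result.json'));

Another answer takes a lower-level approach: when push() returns false, temporarily override _read so that the transform resumes exactly when the readable side asks for more data: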
_transform(buf, enc, callback) {
  // Prepend any unused data from the prior chunk.
  if (this.prev) {
    buf = Buffer.concat([ this.prev, buf ]);
    this.prev = null;
  }

  // Keep transforming until buf runs low on data.
  if (buf.length < this.requiredData) {
    this.prev = buf;
    return callback();
  }

  var result = buf.slice(0, this.requiredData); // do something with the data...
  var nextbuf = buf.slice(this.requiredData);

  if (this.push(result)) {
    // Continue transforming this chunk.
    this._transform(nextbuf, enc, callback);
  }
  else {
    // Node is warning us to slow down (applying "back-pressure").
    // Temporarily override _read to continue the transform on demand.
    this._read = function() {
      delete this._read;
      this._transform(nextbuf, enc, callback);
    };
  }
}
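For context, here is the same pattern wrapped in a complete, self-contained class. The class name and the fixed 38-byte record size are my illustrative choices, echoing the question:

'use strict';

const { Transform } = require('stream');

// Illustrative host class for the pattern above: emits fixed-size
// records, parking unfinished work in a temporary _read override
// whenever push() returns false.
class RecordTransform extends Transform {
  constructor(options) {
    super(options);
    this.prev = null;       // leftover bytes carried between chunks
    this.requiredData = 38; // bytes per output record, as in the question
  }

  _transform(buf, enc, callback) {
    if (this.prev) {
      buf = Buffer.concat([ this.prev, buf ]);
      this.prev = null;
    }
    if (buf.length < this.requiredData) {
      this.prev = buf;
      return callback();
    }
    const result = buf.slice(0, this.requiredData); // placeholder transformation
    const nextbuf = buf.slice(this.requiredData);
    if (this.push(result)) {
      this._transform(nextbuf, enc, callback);
    } else {
      this._read = function() {
        delete this._read; // restore the inherited _read
        this._transform(nextbuf, enc, callback);
      };
    }
  }
}

module.exports = { RecordTransform };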