Node.js: Download a file into memory and upload it to Google Drive


node.js stream google-drive-api


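Goal

Download a file and upload it to Google Drive entirely in memory, using a Google Drive API resumable URL.

Challenge / Problem

I want to buffer the file as it is downloaded into memory (not to the file system) and then uploaded to Google Drive. The Google Drive API requires chunks to be at least 256 * 1024 (262144 bytes) long; more precisely, every chunk except the last must be a multiple of that size.

The process should pass one chunk from the buffer to the upload. If the chunk fails, that same buffered chunk is retried up to 3 times. If the chunk succeeds, it should be cleared from the buffer, and the process should continue until the file is complete.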
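As a small illustration of the size constraint above, here is a minimal sketch that rounds a requested chunk size down to a multiple of 256 KiB. The helper name `alignChunkSize` is hypothetical and not part of any library.

```javascript
const MIN_CHUNK = 256 * 1024; // 262144 bytes, the unit the Drive API expects

// Hypothetical helper: rounds a requested chunk size down to a multiple of
// 256 KiB and never returns less than one 256 KiB unit.
function alignChunkSize(requestedBytes) {
  const units = Math.floor(requestedBytes / MIN_CHUNK);
  return Math.max(1, units) * MIN_CHUNK;
}

console.log(alignChunkSize(3 * 1024 * 1024)); // 3145728 (already aligned)
console.log(alignChunkSize(300000));          // 262144
```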

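Background Efforts / Research (references below)

Most of the articles, examples and packages I have researched and tested give some insight into streaming, piping and chunking, but they use the file system as the starting point for the readable stream.

I have tried different approaches with streams, such as a passthrough with highWaterMark, and third-party libraries such as request, gaxios and got, which have built-in stream/piping support, but none of them worked on the upload end of the process.

In other words, I am not sure how to structure the piping or chunking mechanism, whether with a buffer or a pipeline, so that it flows properly into the upload process until completion and handles the progress and completion events efficiently.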
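Questions

1. With the code below, how do I appropriately buffer the file and PUT it to the Google-provided URL with the correct Content-Length and Content-Range headers, while keeping enough buffer space to handle 3 retries?

2. In terms of handling back-pressure or buffering, is leveraging .cork() and .uncork() an efficient way to manage the buffer flow?

3. Is there a way to use a Transform stream with highWaterMark and pipeline to manage the buffer efficiently?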

Below are a visual model and the code for what I am trying to accomplish:

Visual Example
[====================]
File Length (20 MB)

[==========          ]
Download (10 MB)
       
      [======      ]
      Buffer (e.g. 6 MB, size 12 MB)

      [===]
      Upload Chunk (3 MB) => Error? Retry from Buffer (max 3 times)
                          => Success? Empty Buffer => Continue =>
      [===]
      Upload next Chunk (3 MB)

Code

/*
   Assume resumable_drive_url was already obtained from Google API
   with the proper access token, which already contains the
   Content-Type and Content-Length in the session.
*/

const axios = require("axios")
const stream = require("stream")

function transfer(download_url, resumable_drive_url, file_type, file_length) {

    return new Promise((resolve, reject) => {

        let timeout = setTimeout(() => {
            reject(new Error("Transfer timed out."))
        }, 80000)


       // Question #1: Should the passthrough stream 
       // and .on events be declared here?

       const passthrough = new stream.PassThrough({
            highWaterMark: 256 * 1024
       })

       passthrough.on("error", (error) => {
            console.error(`Upload failed: ${error.message}`)
            reject(error.message)
       })

       passthrough.on("end", () => {
            clearTimeout(timeout)
            resolve(true)
       })

        
        // Download file
        axios({
            method: 'get',
            url: download_url,
            responseType: 'stream',
            maxRedirects: 1
        }).then(result => {
            
            // QUESTION #2: How do we buffer the file from here 
            // via axios.put to the resumable_url with the correct 
            // header information Content-Range and Content-Length?

            // CURIOSITY #1: Do we pipe from here 
            // to a passthrough stream that maintains a minimum buffer size?

            result.data.pipe(passthrough)
        }
        ).catch(error => {
            reject(error)
        })


    })
}


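References
- (good chunking mechanism, but bloated; there seems to be a more efficient approach with stream piping)
- (conceptually correct, but uses the file system)
- (conceptually correct, but uses the file system and a custom StreamFactory)
- (decent, but seems bloated and outdated)
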
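I believe your goal and current situation are as follows.

- You want to download data and upload the downloaded data to Google Drive using axios with Node.js.
- For uploading the data, you want to use a resumable upload with multiple chunks, retrieving the data from the stream.
- Your access token can be used for uploading data to Google Drive.
- You already know the size and mimeType of the data you want to upload.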

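Modification points:

In this case, to achieve the resumable upload with multiple chunks, I would like to propose the following flow.

1. Download the data from the URL.
2. Create the session for the resumable upload.
3. Retrieve the downloaded data from the stream and convert it to a buffer.
   - For this, I used stream.Transform.
   - In this case, I stop the stream and upload the data to Google Drive. I could not think of a way to achieve this without stopping the stream.
   - I think this part might be the answer to your questions 2 and 3.
4. When the buffer size reaches the declared chunk size, upload the buffer to Google Drive.
   - I think this part might be the answer to your question 3.
5. When an error occurs during the upload, the same buffer is uploaded again. In this sample script, 3 retries are run. After 3 retries, an error is thrown.
   - I think this part might be the answer to your question 1.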
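Steps 3 and 4 of the flow above can be sketched roughly as follows. This is a minimal illustration and not the answer's original script: the class `FixedSizeChunker` and the function `forEachChunk` are hypothetical names, and the chunk size of 4 bytes in the usage lines is only for demonstration. Awaiting the handler inside the loop is what holds back the stream while a chunk is being processed.

```javascript
const { Transform, Readable } = require("stream");

// Hypothetical helper: re-slices an incoming byte stream into buffers of
// exactly `chunkSize` bytes; only the final buffer may be shorter.
class FixedSizeChunker extends Transform {
  constructor(chunkSize) {
    // Object mode on the readable side keeps each pushed buffer intact;
    // a high-water mark of 1 keeps at most one chunk queued downstream.
    super({ readableObjectMode: true, readableHighWaterMark: 1 });
    this.chunkSize = chunkSize;
    this.pending = [];
    this.pendingBytes = 0;
  }

  _transform(data, _encoding, callback) {
    this.pending.push(data);
    this.pendingBytes += data.length;
    if (this.pendingBytes >= this.chunkSize) {
      const all = Buffer.concat(this.pending);
      let offset = 0;
      while (all.length - offset >= this.chunkSize) {
        this.push(all.subarray(offset, offset + this.chunkSize));
        offset += this.chunkSize;
      }
      const rest = all.subarray(offset);
      this.pending = rest.length ? [rest] : [];
      this.pendingBytes = rest.length;
    }
    callback();
  }

  _flush(callback) {
    // Emit whatever is left as the final, possibly shorter, chunk.
    if (this.pendingBytes > 0) this.push(Buffer.concat(this.pending));
    callback();
  }
}

// Calls `handler(chunk, offset)` for each chunk, one at a time, and
// resolves with the total number of bytes seen.
async function forEachChunk(source, chunkSize, handler) {
  let offset = 0;
  for await (const chunk of source.pipe(new FixedSizeChunker(chunkSize))) {
    await handler(chunk, offset); // e.g. an upload; the stream waits meanwhile
    offset += chunk.length;
  }
  return offset;
}

// Usage: 15 bytes arriving as 5 + 7 + 3 are re-sliced into 4-byte chunks.
const demo = Readable.from([Buffer.alloc(5), Buffer.alloc(7), Buffer.alloc(3)]);
forEachChunk(demo, 4, async (chunk, offset) => {
  console.log(`chunk at offset ${offset}: ${chunk.length} bytes`);
});
```

Note that `pipe()` does not forward errors from the source stream; in real code, `stream.pipeline` is probably the safer way to connect the two streams.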




When the above flow is reflected in the script, it becomes as follows.

Modified script:

Please set the variables in the function main().
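As a minimal sketch of the retry step in the flow above (not the answer's original script), the following shows one way to build the Content-Length and Content-Range headers for a chunk and to re-send the same buffer on failure. The names `chunkHeaders` and `uploadChunkWithRetry` are hypothetical, `put` is assumed to be any function with the same shape as `axios.put(url, data, config)`, and treating HTTP 308 as "chunk accepted, upload incomplete" is an assumption about the resumable session that should be checked against the Drive API documentation.

```javascript
// Builds the headers for one chunk that starts at byte `start`.
function chunkHeaders(start, chunkLength, totalLength) {
  const end = start + chunkLength - 1; // Content-Range uses an inclusive end
  return {
    "Content-Length": String(chunkLength),
    "Content-Range": `bytes ${start}-${end}/${totalLength}`,
  };
}

// `put` is injected so the sketch stays independent of the HTTP client
// (assumption: it behaves like axios.put and rejects on failure).
async function uploadChunkWithRetry(put, url, chunk, start, totalLength, maxRetries = 3) {
  const config = {
    headers: chunkHeaders(start, chunk.length, totalLength),
    // Assumption: 308 means an intermediate chunk was accepted.
    validateStatus: (status) => (status >= 200 && status < 300) || status === 308,
  };
  let lastError;
  // One initial attempt plus up to `maxRetries` retries of the same buffer.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await put(url, chunk, config);
    } catch (error) {
      lastError = error; // the chunk is still in memory, so just try again
    }
  }
  throw lastError;
}
```

With axios, the call might look like `uploadChunkWithRetry((u, d, c) => axios.put(u, d, c), resumable_drive_url, chunk, offset, file_length)`, where `chunk` and `offset` come from whatever mechanism slices the download stream.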
Result:

When the above script is run with a file size of 23558108 bytes (sample data), the following result is obtained in the console.

Progress: from 0 to 10485759 for 23558108
Progress: from 10485760 to 20971519 for 23558108
Progress(last): from 20971520 to 23558107 for 23558108
{
  kind: 'drive#file',
  id: '###',
  name: 'sample filename',
  mimeType: '###'
}
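The progress lines above are consistent with a chunk size of 10485760 bytes (10 MiB, which is 40 × 262144). As a quick check of the arithmetic, here is a minimal sketch that derives the same byte ranges; the function name `chunkRanges` is hypothetical.

```javascript
// Splits a total length into inclusive [start, end] byte ranges.
function chunkRanges(totalLength, chunkSize) {
  const ranges = [];
  for (let start = 0; start < totalLength; start += chunkSize) {
    const end = Math.min(start + chunkSize, totalLength) - 1;
    ranges.push({ start, end, last: end === totalLength - 1 });
  }
  return ranges;
}

for (const r of chunkRanges(23558108, 10 * 1024 * 1024)) {
  console.log(`Progress${r.last ? "(last)" : ""}: from ${r.start} to ${r.end} for 23558108`);
}
```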

Note:

If you want to perform the resumable upload with a single chunk, a separate sample script is available.




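First, I have to apologize for my poor English skill. Can I ask you about your question? 1. In your situation, can I assume that you already know the size of the file you want to download? 2. Do you already have the access token for uploading the data to Google Drive? 3. In your goal, does the downloaded data need to be uploaded in multiple chunks?

@Tanaike, yes, I know the file size. I give Google the Content-Length and Content-Type along with an access token, and Google returns a "resumable URL" to use as the session to upload to. And yes, I added a visual example to the post. The idea is to leave enough room for the chunk operation to handle errors and retry while the buffer is loaded; on success, clear the buffer and continue until complete.

Thank you for replying. From your reply and the updated question, I proposed an answer. Could you please confirm it? If that is not the direction you expected, I apologize.

@Tanaike I am honored to receive such detailed information and insight from the approach you have provided here. Thank you for your effort. I will implement this and report back to you.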