PowerShell: decompressing a stream

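Is there a built-in cmdlet, or a combination of cmdlets, that would let me start decompressing a file stream as each chunk is downloaded? I have a PowerShell script that needs to download a large (10 GB) file, and at the moment I have to wait until the download has finished before I can start expanding it: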

$wc = New-Object net.webclient
$wc.Downloadfile($appDataSnapshotUri, "%DataSnapshotFileName%.zip") # this can take some time

Expand-Archive -Path "%DataSnapshotFileName%.zip" -DestinationPath Run # so can this

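OK, it turns out that a zip file does not have to be fully downloaded before you decompress it; you can compress and decompress streams. .NET has some built-in stream compression functionality, but it cannot be used for zip archives. You can do it with the SharpZipLib library: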

Download the .nupkg and extract its files to any folder. You need ICSharpCode.SharpZipLib.dll from lib/net45.
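For contrast, the built-in stream compression mentioned above does work incrementally on gzip data. The following is only a minimal sketch, not part of the original answer: it builds a small gzip payload in memory (the sample text and variable names are made up for illustration) and decompresses it through a forward-only read, which is the same way a network stream would be consumed.

```powershell
# Illustrative payload; any bytes would do.
$text = "hello stream"
$raw  = [System.Text.Encoding]::UTF8.GetBytes($text)

# Compress into a MemoryStream. The third argument (leaveOpen = $true)
# keeps $compressed usable after the GZipStream is disposed.
$compressed = [System.IO.MemoryStream]::new()
$gzipOut = [System.IO.Compression.GZipStream]::new(
    $compressed, [System.IO.Compression.CompressionMode]::Compress, $true)
$gzipOut.Write($raw, 0, $raw.Length)
$gzipOut.Dispose()          # flushes the remaining gzip data
$compressed.Position = 0    # rewind so the data can be read back

# Decompress while reading. GZipStream only reads forward, so the
# source could equally be a stream returned by WebClient.OpenRead.
$gzipIn = [System.IO.Compression.GZipStream]::new(
    $compressed, [System.IO.Compression.CompressionMode]::Decompress)
$reader = [System.IO.StreamReader]::new($gzipIn, [System.Text.Encoding]::UTF8)
$result = $reader.ReadToEnd()
$reader.Dispose()

$result
```

GZipStream handles a single compressed data stream, not the multi-entry zip container, which is why a zip-aware reader such as SharpZipLib's ZipInputStream is used in this answer.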

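Here is my simplified translation of their example:

It only extracts the first entry; you can add a while loop to make the example handle the whole archive.

Here is a snippet with a while loop for extracting multiple files (in the example above, put it after
$zipEntry = $zipInputStream.GetNextEntry()
):

Edit

Here is what I got working: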

Add-Type -Path ".\ICSharpCode.SharpZipLib.dll"

$outFolder = "unzip"

# Copy buffer, allocated once and reused for every entry
$buffer = New-Object byte[] 4096

$wc = [System.Net.WebClient]::new()

# OpenRead returns a readable stream over the response, so entries are extracted while the download is still in progress
$zipStream = $wc.OpenRead("https://github.com/Esri/file-geodatabase-api/raw/master/FileGDB_API_1.5/FileGDB_API_1_5_VS2015.zip")

$zipInputStream = [ICSharpCode.SharpZipLib.Zip.ZipInputStream]::new($zipStream)

$zipEntry = $zipInputStream.GetNextEntry()

while ($zipEntry) {

    if (-not $zipEntry.IsDirectory) {
        $filePath = "$pwd\$outFolder\$($zipEntry.Name)"
        $parentPath = [System.IO.Path]::GetDirectoryName($filePath)
        Write-Host $parentPath

        # Create the target folder if it does not exist yet
        if (-not (Test-Path $parentPath)) {
            New-Item -ItemType Directory $parentPath | Out-Null
        }

        $sw = [System.IO.File]::Create($filePath)

        # Copy the current entry's decompressed bytes to the output file
        [ICSharpCode.SharpZipLib.Core.StreamUtils]::Copy($zipInputStream, $sw, $buffer)
        $sw.Close()
    }

    $zipEntry = $zipInputStream.GetNextEntry()
}

# Release the zip reader (and the underlying network stream) and the web client
$zipInputStream.Close()
$wc.Dispose()

To expand on Mike Twc's answer, here is a script that runs the extraction both with and without streaming and compares the time each takes:

$url = "yoururlhere"

function UnzipStream () {
    Write-Host "unzipping via stream"

    $stopwatch1 =  [system.diagnostics.stopwatch]::StartNew()


    Add-Type -Path ".\ICSharpCode.SharpZipLib.dll"

    $outFolder = "unzip-stream"

    $wc = [System.Net.WebClient]::new()

    $zipStream = $wc.OpenRead($url)

    $zipInputStream = [ICSharpCode.SharpZipLib.Zip.ZipInputStream]::New($zipStream)

    $zipEntry = $zipInputStream.GetNextEntry()

    while($zipEntry) {

        if (-Not($zipEntry.IsDirectory)) {
            $fileName = $zipEntry.Name

            $buffer = New-Object byte[] 4096

            $filePath = "$pwd\$outFolder\$fileName"
            $parentPath = "$filePath\.."
            Write-Host $parentPath

            if (-Not (Test-Path $parentPath)) {
                New-Item -ItemType Directory $parentPath
            }

            $sw = [System.IO.File]::Create("$pwd\$outFolder\$fileName")

            [ICSharpCode.SharpZipLib.Core.StreamUtils]::Copy($zipInputStream, $sw, $buffer)

            $sw.Close()
        }

        $zipEntry = $zipInputStream.GetNextEntry()

    }

    $stopwatch1.Stop()

    Write-Host "extraction took $($stopWatch1.ElapsedMilliseconds) millis with stream"
}

function UnzipWithoutStream() {

    Write-Host "Extracting without stream"

    $stopwatch2 = [system.diagnostics.stopwatch]::StartNew()
    $outFolder2 = "unzip-normal"

    $wc2 = New-Object System.Net.WebClient
    $wc2.DownloadFile($url, "$pwd\download.zip")

    $of2 = New-Item -ItemType Directory $outFolder2

    Expand-Archive -Path "download.zip" -DestinationPath $of2.FullName

    $stopwatch2.Stop()

    Write-Host "extraction took $($stopWatch2.ElapsedMilliseconds) millis without stream"
}

UnzipStream
UnzipWithoutStream

I don't think the .zip archive format works that way. (A partially downloaded .zip file is a corrupt .zip file.)

@Bill_Stewart is 100% correct. The index is at the end of the zip, in what is called the central directory. That is where the metadata lives and, most importantly, where the index of each file's starting offset lives.

@ArcSet, @Bill_Stewart yes, I have read that too, but I was able to do something similar in nodejs, even if it is a bit of a hack.

So maybe try a compression format that has better support for streaming decompression?

Is the file on a network drive, or on FTP or SharePoint?

Can you share that node js script?

@MikeTwc sure, I will write it up, but for now take a look: I used this library.

Awesome! I had to tweak it a little to get it working, but this is what I had in mind. I will share my tweaks in an edit.
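The comments and the answer can both be right because a zip archive carries two kinds of bookkeeping: a local file header in front of each entry's data, and the central directory at the end. A forward-only reader such as ZipInputStream relies on the local headers. The sketch below is only an illustration under stated assumptions: it builds a tiny archive in memory with .NET's ZipArchive (the entry name and contents are made up) and looks for both signatures in the raw bytes.

```powershell
Add-Type -AssemblyName System.IO.Compression

# Create a one-entry zip archive in memory (hypothetical entry name and content).
$ms  = [System.IO.MemoryStream]::new()
$zip = [System.IO.Compression.ZipArchive]::new(
    $ms, [System.IO.Compression.ZipArchiveMode]::Create, $true)
$entry  = $zip.CreateEntry("hello.txt")
$writer = [System.IO.StreamWriter]::new($entry.Open())
$writer.Write("hello")
$writer.Dispose()
$zip.Dispose()              # finishing the archive writes the central directory
$bytes = $ms.ToArray()

# The archive starts with a local file header, signature "PK" 0x03 0x04.
$startsWithLocalHeader = (
    $bytes[0] -eq 0x50 -and $bytes[1] -eq 0x4B -and
    $bytes[2] -eq 0x03 -and $bytes[3] -eq 0x04)

# Scan backwards for the end-of-central-directory record, signature "PK" 0x05 0x06.
$eocdOffset = -1
for ($i = $bytes.Length - 4; $i -ge 0; $i--) {
    if ($bytes[$i] -eq 0x50 -and $bytes[$i + 1] -eq 0x4B -and
        $bytes[$i + 2] -eq 0x05 -and $bytes[$i + 3] -eq 0x06) {
        $eocdOffset = $i
        break
    }
}

Write-Host "Starts with local file header: $startsWithLocalHeader"
Write-Host "End-of-central-directory record at offset $eocdOffset of $($bytes.Length) bytes"
```

A reader that needs random access to entries has to wait for the central directory, while a sequential reader can start at the first local header. That is the trade-off the streaming approach accepts.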