PowerShell: decompressing a stream
Is there a built-in cmdlet, or a combination of cmdlets, that would let me start decompressing a file stream as each chunk is downloaded? I have a PowerShell script that needs to download a large (10 GB) file, and at the moment I have to wait until the download has finished before I can start expanding it:
$wc = New-Object net.webclient
$wc.Downloadfile($appDataSnapshotUri, "%DataSnapshotFileName%.zip") # this can take some time
Expand-Archive -Path "%DataSnapshotFileName%.zip" -DestinationPath Run # so can this
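OK, it turns out that a zip file does not need to be fully downloaded before it is decompressed; you can compress and decompress streams. .NET has some built-in stream compression functionality, but it does not work for zip archives. You can use the SharpZipLib library for this: download the .nupkg and extract its files to any folder. You need ICSharpCode.SharpZipLib.dll from lib/net45.
My first attempt was a simplified translation of their example. It extracted only the first entry; to extract multiple files, a while loop has to follow the first call to
$zipEntry = $zipInputStream.GetNextEntry()
EDIT
Here is what I found to work: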
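# Load SharpZipLib (ICSharpCode.SharpZipLib.dll from lib/net45 of the package)
Add-Type -Path ".\ICSharpCode.SharpZipLib.dll"
$outFolder = "unzip"
$buffer = New-Object byte[] 4096   # copy buffer, reused for every entry
$wc = [System.Net.WebClient]::new()
# OpenRead returns the response stream, so extraction can start while the download is still in progress
$zipStream = $wc.OpenRead("https://github.com/Esri/file-geodatabase-api/raw/master/FileGDB_API_1.5/FileGDB_API_1_5_VS2015.zip")
$zipInputStream = [ICSharpCode.SharpZipLib.Zip.ZipInputStream]::new($zipStream)
try {
    $zipEntry = $zipInputStream.GetNextEntry()
    while ($zipEntry) {
        if (-not $zipEntry.IsDirectory) {
            $filePath = "$pwd\$outFolder\$($zipEntry.Name)"
            $parentPath = Split-Path -Parent $filePath
            Write-Host $parentPath
            # Create the entry's folder if it does not exist yet
            if (-not (Test-Path $parentPath)) {
                New-Item -ItemType Directory $parentPath | Out-Null
            }
            $sw = [System.IO.File]::Create($filePath)
            try {
                # Copies the current entry's decompressed bytes to the output file
                [ICSharpCode.SharpZipLib.Core.StreamUtils]::Copy($zipInputStream, $sw, $buffer)
            } finally {
                $sw.Close()
            }
        }
        $zipEntry = $zipInputStream.GetNextEntry()
    }
} finally {
    # Release the zip stream (and the network stream under it) even if extraction fails
    $zipInputStream.Close()
    $wc.Dispose()
}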
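For comparison, the built-in stream compression mentioned in this answer covers formats such as gzip. The sketch below is only an illustration of that idea, not part of the original answer: it compresses a short string into an in-memory gzip stream and then decompresses it chunk by chunk with System.IO.Compression.GZipStream. The sample text and all variable names are made up for the example, and the in-memory stream merely stands in for a download stream.

```powershell
# Build a small gzip payload in memory (a stand-in for data arriving from a download).
$payload = [System.Text.Encoding]::UTF8.GetBytes("hello from a gzip stream")
$compressed = [System.IO.MemoryStream]::new()
$gzipOut = [System.IO.Compression.GZipStream]::new(
    $compressed, [System.IO.Compression.CompressionMode]::Compress, $true)
$gzipOut.Write($payload, 0, $payload.Length)
$gzipOut.Dispose()            # finishes the gzip data; $compressed stays open (leaveOpen = $true)
$compressed.Position = 0      # rewind so the data can be read back

# Decompress while reading, one fixed-size chunk at a time.
$gzipIn = [System.IO.Compression.GZipStream]::new(
    $compressed, [System.IO.Compression.CompressionMode]::Decompress)
$result = [System.IO.MemoryStream]::new()
$buffer = [byte[]]::new(4096)
while (($read = $gzipIn.Read($buffer, 0, $buffer.Length)) -gt 0) {
    $result.Write($buffer, 0, $read)
}
$gzipIn.Dispose()

$text = [System.Text.Encoding]::UTF8.GetString($result.ToArray())
Write-Host $text
```

If the server offered a .gz file, the stream returned by WebClient.OpenRead could presumably be passed to GZipStream in place of $compressed. Note that gzip holds a single compressed stream rather than a set of named entries, so this does not replace SharpZipLib for zip archives.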
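To expand on Mike Twc's answer, here is a script that performs the extraction both with and without streaming and compares the time each takes: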
$url = "yoururlhere"

# Download and extract in one pass: entries are written to disk as the bytes arrive
function UnzipStream () {
    Write-Host "unzipping via stream"
    $stopwatch1 = [system.diagnostics.stopwatch]::StartNew()
    Add-Type -Path ".\ICSharpCode.SharpZipLib.dll"
    $outFolder = "unzip-stream"
    $wc = [System.Net.WebClient]::new()
    $zipStream = $wc.OpenRead($url)
    $zipInputStream = [ICSharpCode.SharpZipLib.Zip.ZipInputStream]::New($zipStream)
    $zipEntry = $zipInputStream.GetNextEntry()
    while ($zipEntry) {
        if (-Not($zipEntry.IsDirectory)) {
            $fileName = $zipEntry.Name
            $buffer = New-Object byte[] 4096
            $filePath = "$pwd\$outFolder\$fileName"
            $parentPath = "$filePath\.."
            Write-Host $parentPath
            if (-Not (Test-Path $parentPath)) {
                New-Item -ItemType Directory $parentPath
            }
            $sw = [System.IO.File]::Create("$pwd\$outFolder\$fileName")
            [ICSharpCode.SharpZipLib.Core.StreamUtils]::Copy($zipInputStream, $sw, $buffer)
            $sw.Close()
        }
        $zipEntry = $zipInputStream.GetNextEntry()
    }
    $stopwatch1.Stop()
    Write-Host "extraction took $($stopWatch1.ElapsedMilliseconds) millis with stream"
}

# Baseline: download the whole archive first, then expand it with Expand-Archive
function UnzipWithoutStream() {
    Write-Host "Extracting without stream"
    $stopwatch2 = [system.diagnostics.stopwatch]::StartNew()
    $outFolder2 = "unzip-normal"
    $wc2 = New-Object System.Net.WebClient
    $wc2.DownloadFile($url, "$pwd\download.zip")
    # -Force lets the script be rerun when the folder already exists
    $of2 = New-Item -ItemType Directory $outFolder2 -Force
    Expand-Archive -Path "download.zip" -DestinationPath $of2.FullName
    $stopwatch2.Stop()
    Write-Host "extraction took $($stopWatch2.ElapsedMilliseconds) millis without stream"
}

UnzipStream
UnzipWithoutStream
I don't think the .zip archive format works that way. (A partially downloaded .zip file is a corrupt .zip file.)
@Bill_Stewart is 100% correct. The index sits at the end of the zip, in what is called the central directory. That is where the metadata lives, and most importantly the index of where each file starts.
@ArcSet, @Bill_Stewart Yes, I have read about that too, but I was able to do something similar in nodejs, even if it is a bit hacky.
So maybe try a compression format that supports streaming decompression better?
Is the file on a network drive, or on FTP or SharePoint? Can you share that Node.js script?
@MikeTwc Sure, I will write it up, but for now take a look: I used this library.
This is great! I had to tweak it a little to get it working, but this is exactly what I had in mind. I will share my tweaks in an edit.
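The central directory point raised in these comments can be checked on a tiny archive. The sketch below is an illustration only, with made-up names and a throwaway in-memory zip: it builds a one-entry archive with the built-in System.IO.Compression.ZipArchive class, confirms that the file starts with a local file header (signature PK 03 04), and locates the end-of-central-directory record (signature PK 05 06) at the end of the file. Each entry also carries its own local header in front of its data, which is what a forward-only reader such as ZipInputStream relies on, while tools that depend on the central directory need the complete file.

```powershell
Add-Type -AssemblyName System.IO.Compression

# Build a one-entry zip archive in memory so the example needs no files or network.
$ms = [System.IO.MemoryStream]::new()
$zip = [System.IO.Compression.ZipArchive]::new(
    $ms, [System.IO.Compression.ZipArchiveMode]::Create, $true)
$writer = [System.IO.StreamWriter]::new($zip.CreateEntry("hello.txt").Open())
$writer.Write("hello")
$writer.Dispose()
$zip.Dispose()                # disposing the archive writes the central directory
$bytes = $ms.ToArray()

# The archive begins with the first entry's local file header: 50 4B 03 04.
$startsWithLocalHeader = ($bytes[0] -eq 0x50 -and $bytes[1] -eq 0x4B -and
                          $bytes[2] -eq 0x03 -and $bytes[3] -eq 0x04)

# Search backwards for the end-of-central-directory record: 50 4B 05 06.
# The record is at least 22 bytes long, so start 22 bytes before the end.
$eocd = -1
for ($i = $bytes.Length - 22; $i -ge 0; $i--) {
    if ($bytes[$i] -eq 0x50 -and $bytes[$i + 1] -eq 0x4B -and
        $bytes[$i + 2] -eq 0x05 -and $bytes[$i + 3] -eq 0x06) {
        $eocd = $i
        break
    }
}

# Bytes 16..19 of the record hold the offset where the central directory starts.
$centralDirOffset = [System.BitConverter]::ToUInt32($bytes, $eocd + 16)

Write-Host "archive size: $($bytes.Length) bytes"
Write-Host "starts with local header: $startsWithLocalHeader"
Write-Host "end-of-central-directory record at offset $eocd"
Write-Host "central directory starts at offset $centralDirOffset"
```

A download cut off before these final bytes has no central directory at all, which matches the remark that a partially downloaded .zip looks corrupt to tools that start from the index.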