Key takeaways:
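- When used with Export-Csv, output buffering (1 → 2 and 1 → 3) provides a massive performance improvement.
- When used with StreamWriters, output buffering (4a → 4b) doesn't help and actually incurs a small performance penalty.
- Eliminating ConvertTo-Csv (4a → 4c) cut execution time by nearly 30% (122.5 seconds).
- Method 4a is much slower than the buffered Export-Csv methods because it introduced the use of Get-Location and Join-Path. Either these cmdlets do far more processing behind the scenes than meets the eye, or invoking cmdlets is simply slow in general (when done a million times, of course).
- Moving Get-Location outside of the per-object processing (4c → 4d) cut execution time by a quarter (77.4 seconds).
- Using [System.IO.Path]::Combine() instead of Join-Path (4d → 4e) cut execution time by two-thirds (147.1 seconds).
- Building the path with string interpolation instead of [System.IO.Path]::Combine() (4e → 4f) made essentially no difference (78.0 vs. 77.7 seconds).
- Script optimization is fun and educational!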
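The savings quoted in the takeaways can be recomputed from the execution times in the results table. The snippet below is only a small arithmetic check; the `Get-Saving` helper and the `$seconds` hashtable are hypothetical names introduced here, not part of the original answer.

```powershell
# Execution times in seconds, copied from the results table.
$seconds = @{
    '1'  = 2178.5; '2'  = 222.9; '3'  = 154.2
    '4a' = 425.0;  '4b' = 456.1; '4c' = 302.5
    '4d' = 225.1;  '4e' = 78.0;  '4f' = 77.7
}

# Hypothetical helper: seconds and percentage saved when moving from one method to another.
# A negative value means the second method was slower.
function Get-Saving([string] $From, [string] $To)
{
    $saved = $seconds[$From] - $seconds[$To]
    New-Object -TypeName PSObject -Property @{
        Step    = "$From -> $To"
        Seconds = [Math]::Round($saved, 1)
        Percent = [Math]::Round(100 * $saved / $seconds[$From], 1)
    }
}

$savings = @(
    (Get-Saving '1' '2'), (Get-Saving '1' '3'), (Get-Saving '4a' '4b'),
    (Get-Saving '4a' '4c'), (Get-Saving '4c' '4d'), (Get-Saving '4d' '4e'),
    (Get-Saving '4e' '4f')
)
$savings | Select-Object -Property Step, Seconds, Percent | Format-Table -AutoSize
```

Keeping the raw times in one hashtable means a rerun of the benchmark only requires updating the numbers, not the arithmetic.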
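Group-Object requires reading the entire file into memory. For large files you should avoid it.
Thank you, sir. This works like a charm on my toy files. The actual files do take a while, though, so I'm leaving this open for now.
Yes, it is slow. I'm working on an improved version. Testing it now.
@Tyler, if performance is an issue, there is an alternative you might want to consider. Note, however, that with that approach you will have to handle a lot of things yourself that Import-Csv and Export-Csv would normally take care of for you.
@AnsgarWiechers Ha, that's the solution I'm working on right now. It's striking how much time is saved by opening each output file only once. I think the difference would be big even if you stuck with appending to the files.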
Ticker Price Date
AAPL 1 2017-08-14
AAPL 2 2017-08-14
AAPL 3 2017-08-14
AAPL 4 2017-08-14
MSFT 5 2017-08-14
MSFT 6 2017-08-14
MSFT 7 2017-08-14
GOOG 8 2017-08-14
GOOG 9 2017-08-14
...
Ticker Price Date
AAPL 1 2017-08-13
AAPL 2 2017-08-13
AAPL 3 2017-08-13
AAPL 4 2017-08-13
MSFT 5 2017-08-13
MSFT 6 2017-08-13
MSFT 7 2017-08-13
GOOG 8 2017-08-13
GOOG 9 2017-08-13
...
Ticker Price Date
AAPL 1 2017-08-14
AAPL 2 2017-08-14
AAPL 3 2017-08-14
AAPL 4 2017-08-14
Ticker Price Date
MSFT 5 2017-08-14
MSFT 6 2017-08-14
MSFT 7 2017-08-14
Ticker Price Date
GOOG 8 2017-08-14
GOOG 9 2017-08-14
Ticker Price Date
AAPL 1 2017-08-13
AAPL 2 2017-08-13
AAPL 3 2017-08-13
AAPL 4 2017-08-13
Ticker Price Date
MSFT 5 2017-08-13
MSFT 6 2017-08-13
MSFT 7 2017-08-13
Ticker Price Date
GOOG 8 2017-08-13
GOOG 9 2017-08-13
Import-Csv file-2017-08-14.csv | Group-Object -Property "Ticker" | Foreach-Object {
    $path = $_.Name + ".csv";
    $_.Group | Export-Csv -Path $path -NoTypeInformation
}
# Method 1: relative paths, Export-Csv -Append for every record, no output buffering
Get-ChildItem -Filter '*.csv' -File -Force `
    | Select-Object -ExpandProperty 'FullName' `
    | Import-Csv -Delimiter "`t" `
    | ForEach-Object -Process {
        $outputFilePath = "out\{0}-{1}.csv" -f $_.Ticker, $_.Date;
        $_ | Export-Csv -Path $outputFilePath -Append -NoTypeInformation;
    };
# Method 2: as method 1, but buffering up to 1,000 records per output file in an array
$pendingRecordsByFilePath = @{};
$maxPendingRecordsPerFilePath = 1000;
Get-ChildItem -Filter '*.csv' -File -Force `
    | Select-Object -ExpandProperty 'FullName' `
    | Import-Csv -Delimiter "`t" `
    | ForEach-Object -Process {
        $outputFilePath = "out\{0}-{1}.csv" -f $_.Ticker, $_.Date;
        $pendingRecords = $pendingRecordsByFilePath[$outputFilePath];
        # $null goes on the left so the comparison is a plain null test even when the right side is an array
        if ($null -eq $pendingRecords)
        {
            # This is the first time we're encountering this output file; create a new array
            $pendingRecords = @();
        }
        elseif ($pendingRecords.Length -ge $maxPendingRecordsPerFilePath)
        {
            # Flush all pending records for this output file
            $pendingRecords `
                | Export-Csv -Path $outputFilePath -Append -NoTypeInformation;
            $pendingRecords = @();
        }
        # += copies the whole array on every append
        $pendingRecords += $_;
        $pendingRecordsByFilePath[$outputFilePath] = $pendingRecords;
    };
# No more input records to be read; flush all pending records for each output file
foreach ($outputFilePath in $pendingRecordsByFilePath.Keys)
{
    $pendingRecordsByFilePath[$outputFilePath] `
        | Export-Csv -Path $outputFilePath -Append -NoTypeInformation;
}
# Method 3: as method 2, but buffering in a List[Object] that is cleared and reused after each flush
$pendingRecordsByFilePath = @{};
$maxPendingRecordsPerFilePath = 1000;
Get-ChildItem -Filter '*.csv' -File -Force `
    | Select-Object -ExpandProperty 'FullName' `
    | Import-Csv -Delimiter "`t" `
    | ForEach-Object -Process {
        $outputFilePath = "out\{0}-{1}.csv" -f $_.Ticker, $_.Date;
        $pendingRecords = $pendingRecordsByFilePath[$outputFilePath];
        if ($null -eq $pendingRecords)
        {
            # This is the first time we're encountering this output file; create a new list
            $pendingRecords = New-Object `
                -TypeName 'System.Collections.Generic.List[Object]' `
                -ArgumentList (,$maxPendingRecordsPerFilePath);
            $pendingRecordsByFilePath[$outputFilePath] = $pendingRecords;
        }
        elseif ($pendingRecords.Count -ge $maxPendingRecordsPerFilePath)
        {
            # Flush all pending records for this output file
            $pendingRecords `
                | Export-Csv -Path $outputFilePath -Append -NoTypeInformation;
            $pendingRecords.Clear();
        }
        $pendingRecords.Add($_);
    };
# No more input records to be read; flush all pending records for each output file
foreach ($outputFilePath in $pendingRecordsByFilePath.Keys)
{
    $pendingRecordsByFilePath[$outputFilePath] `
        | Export-Csv -Path $outputFilePath -Append -NoTypeInformation;
}
# Method 4a: Join-Path for the path, ConvertTo-Csv for the lines, one StreamWriter per output file
$truncateExistingOutputFiles = $true;
$outputFileWritersByPath = @{};
try
{
    Get-ChildItem -Filter '*.csv' -File -Force `
        | Select-Object -ExpandProperty 'FullName' `
        | Import-Csv -Delimiter "`t" `
        | ForEach-Object -Process {
            $outputFilePath = Join-Path -Path (Get-Location) -ChildPath ('out\{0}-{1}.csv' -f $_.Ticker, $_.Date);
            $outputFileWriter = $outputFileWritersByPath[$outputFilePath];
            # ConvertTo-Csv returns two lines for a single record: the header and the data
            $outputLines = $_ | ConvertTo-Csv -NoTypeInformation;
            if ($null -eq $outputFileWriter)
            {
                # This is the first time we're encountering this output file; create a new StreamWriter
                # Arguments are path, append flag, encoding; -not is parenthesized so it applies to the flag only
                $outputFileWriter = New-Object `
                    -TypeName 'System.IO.StreamWriter' `
                    -ArgumentList ($outputFilePath, (-not $truncateExistingOutputFiles), [System.Text.Encoding]::ASCII);
                $outputFileWritersByPath[$outputFilePath] = $outputFileWriter;
                # Write the header line
                $outputFileWriter.WriteLine($outputLines[0]);
            }
            # Write the data line
            $outputFileWriter.WriteLine($outputLines[1]);
        };
}
finally
{
    # Close every writer, even if the pipeline failed, so buffered output is flushed to disk
    foreach ($writer in $outputFileWritersByPath.Values)
    {
        $writer.Close();
    }
}
# Method 4e: output directory resolved once, [System.IO.Path]::Combine() for the path, string interpolation for the lines
$truncateExistingOutputFiles = $true;
$outputDirectoryPath = Join-Path -Path (Get-Location) -ChildPath 'out';
$outputFileWritersByPath = @{};
try
{
    Get-ChildItem -Filter '*.csv' -File -Force `
        | Select-Object -ExpandProperty 'FullName' `
        | Import-Csv -Delimiter "`t" `
        | ForEach-Object -Process {
            $outputFileName = '{0}-{1}.csv' -f $_.Ticker, $_.Date;
            $outputFilePath = [System.IO.Path]::Combine($outputDirectoryPath, $outputFileName);
            $outputFileWriter = $outputFileWritersByPath[$outputFilePath];
            if ($null -eq $outputFileWriter)
            {
                # This is the first time we're encountering this output file; create a new StreamWriter
                # Arguments are path, append flag, encoding; -not is parenthesized so it applies to the flag only
                $outputFileWriter = New-Object `
                    -TypeName 'System.IO.StreamWriter' `
                    -ArgumentList ($outputFilePath, (-not $truncateExistingOutputFiles), [System.Text.Encoding]::ASCII);
                $outputFileWritersByPath[$outputFilePath] = $outputFileWriter;
                # Write the header line
                $outputFileWriter.WriteLine('"Ticker","Price","Date"');
            }
            # Write the data line; values are quoted but embedded quotes are not escaped
            $outputFileWriter.WriteLine("""$($_.Ticker)"",""$($_.Price)"",""$($_.Date)""");
        };
}
finally
{
    # Close every writer, even if the pipeline failed, so buffered output is flushed to disk
    foreach ($writer in $outputFileWritersByPath.Values)
    {
        $writer.Close();
    }
}
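Method 4f in the results table builds the output path with string interpolation, but its code is not shown. The following is a rough sketch of what that path-building step might look like, not the original author's script. The in-memory `$records`, the temporary output directory, and the `$separator` variable are assumptions made so the snippet runs on its own. An absolute directory is used because .NET resolves relative paths against the process working directory, which can differ from the PowerShell location.

```powershell
# Sample records standing in for the objects Import-Csv would emit (assumption).
$records = @(
    (New-Object -TypeName PSObject -Property @{ Ticker = 'AAPL'; Price = 1; Date = '2017-08-14' }),
    (New-Object -TypeName PSObject -Property @{ Ticker = 'AAPL'; Price = 2; Date = '2017-08-14' }),
    (New-Object -TypeName PSObject -Property @{ Ticker = 'MSFT'; Price = 5; Date = '2017-08-14' })
)

# Resolve the absolute output directory once, outside the per-record loop.
# A temporary directory is used here only so the sketch can run anywhere.
$outputDirectoryPath = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'split-csv-sketch'
New-Item -ItemType Directory -Path $outputDirectoryPath -Force | Out-Null
$separator = [System.IO.Path]::DirectorySeparatorChar

$outputFileWritersByPath = @{}
try
{
    foreach ($record in $records)
    {
        # Build the path by string interpolation instead of Join-Path or Combine()
        $outputFilePath = "$outputDirectoryPath$separator$($record.Ticker)-$($record.Date).csv"
        $writer = $outputFileWritersByPath[$outputFilePath]
        if ($null -eq $writer)
        {
            # First record for this file: open it once (append = $false) and write the header
            $writer = New-Object -TypeName 'System.IO.StreamWriter' `
                -ArgumentList $outputFilePath, $false, ([System.Text.Encoding]::ASCII)
            $outputFileWritersByPath[$outputFilePath] = $writer
            $writer.WriteLine('"Ticker","Price","Date"')
        }
        $writer.WriteLine("""$($record.Ticker)"",""$($record.Price)"",""$($record.Date)""")
    }
}
finally
{
    # Close every writer so buffered lines reach the disk
    foreach ($writer in $outputFileWritersByPath.Values)
    {
        $writer.Close()
    }
}

Get-ChildItem -Path $outputDirectoryPath | Select-Object -ExpandProperty 'Name'
```

According to the table, this variant performed about the same as [System.IO.Path]::Combine(), so the choice between the two is mostly a matter of readability.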
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
| Method | Path handling | Line building | File writing | Output buffering | Execution time |
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
| 1 | Relative | Export-Csv | Export-Csv | No | 2,178.5 seconds |
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
| 2 | Relative | Export-Csv | Export-Csv | 1,000-element array | 222.9 seconds |
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
| 3 | Relative | Export-Csv | Export-Csv | 1,000-element List | 154.2 seconds |
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
| 4a | Join-Path | ConvertTo-Csv | StreamWriter | No | 425.0 seconds |
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
| 4b | Join-Path | ConvertTo-Csv | StreamWriter | 1,000-element List | 456.1 seconds |
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
| 4c | Join-Path | String interpolation | StreamWriter | No | 302.5 seconds |
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
| 4d | Join-Path | String interpolation | StreamWriter | No | 225.1 seconds |
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
| 4e | [IO.Path]::Combine() | String interpolation | StreamWriter | No | 78.0 seconds |
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
| 4f | String interpolation | String interpolation | StreamWriter | No | 77.7 seconds |
+--------+----------------------+----------------------+--------------+---------------------+-----------------+
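The thread does not say how the execution times were measured. One plausible way to get comparable numbers is Measure-Command, sketched below for the Join-Path versus [System.IO.Path]::Combine() comparison. The iteration count and file names are arbitrary assumptions, and the absolute figures will vary by machine and PowerShell version.

```powershell
# Directory to combine against; the current location is used purely as an example.
$directory = (Get-Location).Path
# Assumption: 10,000 iterations, enough to see a difference while finishing quickly.
$iterations = 10000

# Time the cmdlet-based path construction.
$joinPathTime = Measure-Command -Expression {
    foreach ($i in 1..$iterations)
    {
        $null = Join-Path -Path $directory -ChildPath "file-$i.csv"
    }
}

# Time the equivalent .NET method call.
$combineTime = Measure-Command -Expression {
    foreach ($i in 1..$iterations)
    {
        $null = [System.IO.Path]::Combine($directory, "file-$i.csv")
    }
}

'Join-Path:                   {0:N0} ms' -f $joinPathTime.TotalMilliseconds
'[System.IO.Path]::Combine(): {0:N0} ms' -f $combineTime.TotalMilliseconds
```

Measure-Command returns a TimeSpan, so the same pattern can wrap any of the full splitting scripts to produce a figure in seconds via `TotalSeconds`.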