C#: How can I read a CSV file into memory for fast processing?
I am trying to import data from a CSV file and read its contents with System.IO.StreamReader. However, this approach only returns the number of lines in the CSV file. Could you tell me whether I am using it correctly?
function Update-ArchiWithImportData{
# Import the contents of the Extract_AGRe_TWS_ALL_20200925-01.csv and Create ElementCsv object
$ElementCsv=Get-Content "$env:USERPROFILE\Desktop\Archi\Extract_AGRe_TWS_ALL_20200925-xxxx.csv"
$startTime = get-date
Write-Verbose "Reading $ElementCsv..."
$Reader= New-Object System.IO.StreamReader -Argument $ElementCsv
while ( ($read=$Reader.ReadLine()) -ne $null) {
#Loop through all the record in the CSV file
$NewModifiedElement= Measure-Command{ ForEach($Entry in $read){
if ($Entry."Script or expected file(s)" -ilike 'technical') {
$Entry.Jobstream=$Entry.Jobstream.trimStart('PAXCL')
}else {
# Get the name of jobSet without extension .ksh ou .bat
$Entry.Jobstream=$Entry."Script or expected file(s)"
# Write-Host $Entry.Jobstream
# Write-Host $Entry.Jobstream.length
$pos_last_point = $Entry.Jobstream.LastIndexOf(".")
#Write-Host $pos_last_point
$Entry.Jobstream = $Entry.Jobstream.Substring(0,$pos_last_point).trimStart('P')
}
$Entry
}
}
# Export Extract_AGRe_TWS_ALL_20200925-01.csv in new Desktop\Archi\Extract_AGRe_TWS_ALL_20200925-02.csv file
$NewModifiedElement | Export-Csv "$env:USERPROFILE\Desktop\Archi\Extract_AGRe_TWS_ALL_20200925-02.csv" -NoTypeInformation -Encoding UTF8
#End of the generation of the csv $env:USERPROFILE\Desktop\Archi\Extract_AGRe_TWS_ALL_20200925-02.csv file, start of the jobstream grep search in the $env:USERPROFILE\Desktop\Archi\elements.csv file and the csv X file.
Write-Host "End of the generation of the csv $env:USERPROFILE\Desktop\Archi\Extract_AGRe_TWS_ALL_20200925-02.csv file"
$GroupPath=$NewModifiedElement
$stream = New-Object IO.StreamWriter($GroupPath,$true)
$stream.WriteLine($read)
$stream.Close()
Write-Host -Object '✔' -ForegroundColor green
}
$Reader.ReadToEnd()
"Elapsed time: $((get-date)-$startTime)"
$Reader.Close()
$ReadPS.Dispose()
# Import the files
$ElementDifference= Import-csv $env:USERPROFILE\Desktop\Archi\Extract_AGRe_TWS_ALL_20200925-02.csv | Select-Object -skip 1
$ElementInLocalHostArchi=Import-csv $env:USERPROFILE\Desktop\Archi\elements.csv | Select-Object -skip 1
$Output=@()
Measure-Command{
foreach ($row in $ElementDifference) {
foreach ($Line in $ElementInLocalHostArchi) {
if ($Line."Name" -contains $row."Jobstream") {
$Output +=$Line
Write-Host " $Output is found"
Continue
}elseif ($Line."Name" -notcontains $row."Jobstream") {
$Line."Name"=$row."Jobstream"
$Line."Documentation"=$row."Jobstream Description"
#$Line."ID"
#$Line."Type"
#Jobstream;Jobstream Description;Op num;Job;Script or expected file(s);Server;user;location;Job Description;FIELD10
$Output +=$Line
Write-Host " $Output not found and was insert in new line"
continue
}
}
}
}
$Output | Export-Csv "$env:USERPROFILE\Desktop\Archi\ElementChange.csv" -NoTypeInformation -Encoding UTF8
$content1 = $NewModifiedElement
$content2 = $ElementInLocalHostArchi
$minCount=[Math]::Min($content1.Count,$content2.Count)
$comparedLines = Compare-Object $content1 $content2 -IncludeEqual:$IncludeEqual -ExcludeDifferent:$ExcludeDifferent -SyncWindow 1 |
Group-Object { $_.InputObject.ReadCount } | Sort-Object Name
$comparedLines | ForEach-Object {
$curr=$_
switch ($_.Group[0].SideIndicator){
"==" { $right=$left=$curr.Group[0].InputObject;break}
"=>" {
$right,$left = $curr.Group[0].InputObject,$curr.Group[1].InputObject
if ($curr.Count -eq 1 -and [int]$curr.Name -gt $minCount){
$left="N/A"
}
break
}
"<=" {
$right,$left = $curr.Group[1].InputObject,$curr.Group[0].InputObject
if ($curr.Count -eq 1 -and [int]$curr.Name -gt $minCount){
$right="N/A"
}
break
}
}
New-Object PSObject -Property @{
Line = $_.Name
($ReferenceObject | Split-Path -Leaf) = $left
($DifferenceObject | Split-Path -Leaf) = $right
}
} | Sort-Object {[int]$_.Line}
# Import the files
$Difference= Import-csv $env:USERPROFILE\Desktop\Archi\Extract_AGRe_TWS_ALL_20200925-02.csv -Header Jobstream | Select-Object "Jobstream"
$Reference=Import-csv $env:USERPROFILE\Desktop\Archi\elements.csv -Header Name | Select-Object "Name"
# Get the list of properties
$props1 = $Difference | Get-Member -MemberType NoteProperty | Select-Object -expand Name | Sort-Object | ForEach-Object {"$_"}
$props2 = $Reference | Get-Member -MemberType NoteProperty | Select-Object -expand Name | Sort-Object | ForEach-Object {"$_"}
if(Compare-Object $props1 $props2) {
# Check that properties match
throw "Properties are not the same! [$props1] [$props2]"
} else {
# Pass properties list to Compare-Object
"Checking $props1"
Compare-Object $Difference $Reference -Property $props1
}
}
Update-ArchiWithImportData # Call the function
Read-Host -Prompt "Press Enter to exit"
I'm not entirely sure what these lines are meant to do:
$ElementCsv = Get-Content "$env:USERPROFILE\Desktop\Archi\Extract_AGRe_TWS_ALL_20200925-xxxx.csv"
...
$Reader = New-Object System.IO.StreamReader -Argument $ElementCsv
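but they almost certainly aren't doing what you expect.
By default, Get-Content returns an array of strings, one string per line in the file, so you are passing an array of strings to the StreamReader constructor. PowerShell is "helpfully" trying to find a constructor overload that takes parameters matching the contents of the $ElementCsv variable.
Because $ElementCsv is an array, it looks for a constructor that takes as many parameters as there are items in the array (i.e. 309905) and, unsurprisingly, doesn't find one.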
This example might make it clearer:
# write a csv file
Set-Content -Path "c:\temp\myfile.csv" -Value @"
row1_firstname,row1_lastname,row1_address
row2_firstname,row2_lastname,row2_address
row3_firstname,row3_lastname,row3_address
row4_firstname,row4_lastname,row4_address
row5_firstname,row5_lastname,row5_address
row6_firstname,row6_lastname,row6_address
"@;
# get-content reads an array of strings - one entry per line in the file
$lines = Get-Content -Path "c:\temp\myfile.csv";
# it's definitely an array
write-host $lines.GetType().FullName
# System.Object[]
# and it's got 6 entries
write-host $lines.Length
# 6
# and these are the entries...
$lines | % { write-host ("[" + $_ + "]") };
# [row1_firstname,row1_lastname,row1_address]
# [row2_firstname,row2_lastname,row2_address]
# [row3_firstname,row3_lastname,row3_address]
# [row4_firstname,row4_lastname,row4_address]
# [row5_firstname,row5_lastname,row5_address]
# [row6_firstname,row6_lastname,row6_address]
# powershell will do some magic to try to find a constructor with 6 parameters,
# but there isn't one
$reader = new-object System.IO.StreamReader -Argument $lines;
# New-Object: Cannot find an overload for "StreamReader" and the argument count: "6".
What you probably want instead is Import-Csv, for example:
$entries = Import-Csv "$env:USERPROFILE\Desktop\Archi\Extract_AGRe_TWS_ALL_20200925-xxxx.csv"
# replace this:
# while ( ($read=$Reader.ReadLine()) -ne $null) {
# with this:
foreach( $entry in $entries )
{
# Loop through all the records in the CSV file
# ... do stuff with this $entry ...
$entry.Jobstream = $entry.Jobstream.trimStart('PAXCL')
# ... etc ...
}
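As a rough, self-contained sketch of the Import-Csv approach: the temporary file name and the sample rows below are made up for illustration, and only the two column names are taken from the question. If the real file is semicolon-delimited, Import-Csv would also need -Delimiter ';'.

```powershell
# Hypothetical sample data; only the column names come from the question's CSV.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'import-csv-sketch.csv'
Set-Content -Path $path -Value @(
    'Jobstream,Script or expected file(s)'
    'PAXCL_JOB1,technical'
    'OTHER,Pjob2.ksh'
)

# Import-Csv treats the first line as the header and returns one object per data row.
$entries = @(Import-Csv -Path $path)

foreach ($entry in $entries) {
    $script = $entry.'Script or expected file(s)'
    if ($script -ne 'technical') {
        # Drop the extension (.ksh, .bat, ...) to get the job name.
        $entry.Jobstream = [System.IO.Path]::GetFileNameWithoutExtension($script)
    }
}

$entries | Format-Table -AutoSize
```

Because Import-Csv already consumes the header line, piping its output through Select-Object -Skip 1 (as the question does further down) would discard the first data row rather than the header.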
Alternatively, you could remove the call to Get-Content:
$ElementCsv = "$env:USERPROFILE\Desktop\Archi\Extract_AGRe_TWS_ALL_20200925-xxxx.csv"
...
$Reader = New-Object System.IO.StreamReader -Argument $ElementCsv
Now PowerShell is able to find a constructor that takes a single string parameter, StreamReader(String).
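If you do want to stay with StreamReader, a line-by-line loop might look roughly like the sketch below. The temporary file and its contents are assumptions made so the snippet can run on its own, and the comma split is deliberately naive.

```powershell
# Hypothetical sample file so the sketch can run on its own.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'streamreader-sketch.csv'
Set-Content -Path $path -Value @('a,b,c', '1,2,3', '4,5,6')

# A single string argument selects the StreamReader(String) constructor.
$reader = New-Object System.IO.StreamReader -ArgumentList $path
$lineCount = 0
try {
    # ReadLine() returns $null once the end of the file is reached.
    while ($null -ne ($line = $reader.ReadLine())) {
        $fields = $line.Split(',')   # naive split: does not handle quoted fields
        $lineCount++
    }
}
finally {
    # Release the file handle even if the loop throws.
    $reader.Dispose()
}

"Read $lineCount lines"
```

Note that StreamReader only hands back raw text lines, so header handling and quoted fields are left to you, whereas Import-Csv does both. .NET also resolves relative paths against the process's working directory, not the current PowerShell location, so a full path is the safer choice.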
In any case, once you have that working, I think you'll find there are a lot of other problems in your script...
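One of those problems may be the call to trimStart('PAXCL'), if the intent was to remove the literal prefix PAXCL. The sketch below uses a made-up value to show the difference. TrimStart treats its argument as a set of characters, not as a prefix.

```powershell
# Hypothetical value chosen to make the difference visible.
$name = 'PAXCLAPPLE'

# TrimStart treats 'PAXCL' as the character set {P, A, X, C, L}, not as a prefix,
# so it keeps removing leading characters until it meets one outside the set.
$trimmed = $name.TrimStart('PAXCL')

# Removing only the literal prefix, and only if it is present.
$prefix = 'PAXCL'
if ($name.StartsWith($prefix)) {
    $withoutPrefix = $name.Substring($prefix.Length)
} else {
    $withoutPrefix = $name
}

"TrimStart:      $trimmed"
"Prefix removal: $withoutPrefix"
```

Here TrimStart leaves only the final E, while the prefix check leaves APPLE.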