PowerShell: how do I add a column from another CSV file to a CSV file by ID?

I have two CSV files. The first file can contain a varying number of rows. Each row has an ID, in this example place_id. I want to add one more column to this file:

"place_id";"osm_type";"osm_id";"place_rank";"boundingbox";"lat";"lon";"display_name";"class";"type";"importance";"icon";"postcode";"city";"town";"village";"hamlet";"allotments";"neighbourhood";"suburb";"city_district";"state_district";"building";"address100";"address26";"address27";"address29";"county";"state";"country";"country_code";"place";"population";"wikidata";"wikipedia";"name";"official_name"
"100073243";"way";"108738557";"19";"56.1330951,56.1377776,35.7857419,35.7966764";"56.1354281";"35.7903646";"Bolshoe Syrkovo, Volokolamskij gorodskoj okrug, Moskovskaya oblast, CFO, RF";"place";"hamlet";"0.45401456808503";"https://nominatim.openstreetmap.org/images/mapicons/poi_place_village.p.20.png";"";"";"";"";"Bolshoe Syrkovo";"";"";"";"";"";"";"";"";"";"";"Volokolamskij gorodskoj okrug";"Moskovskaya oblast";"RF";"ru";"hamlet";"19";"Q4092451";"ru:Bolshoe Syrkovo";"Bolshoe Syrkovo";""
"100073263";"way";"108729132";"19";"56.1542386,56.156816,36.3303962,36.3383278";"56.15552975";"36.3343542260811";"Kondratovo, Volokolamskij gorodskoj okrug, Moskovskaya oblast, CFO, RF";"place";"hamlet";"0.385";"https://nominatim.openstreetmap.org/images/mapicons/poi_place_village.p.20.png";"";"";"";"";"Kondratovo";"";"";"";"";"";"";"";"";"";"";"Volokolamskij gorodskoj okrug";"Moskovskaya oblast";"RF";"ru";"";"";"";"";"Kondratovo";""
"100073265";"way";"108738571";"19";"56.009293,56.0205996,36.2239313,36.2390323";"56.015194";"36.2290485";"Gryady, Volokolamskij gorodskoj okrug, Moskovskaya oblast, CFO, Rossiya";"place";"village";"0.36089190172262";"https://nominatim.openstreetmap.org/images/mapicons/poi_place_village.p.20.png";"";"";"";"Gryady";"";"";"";"";"";"";"";"";"";"";"";"Volokolamskij gorodskoj okrug";"Moskovskaya oblast";"Rossiya";"ru";"village";"841";"Q4151063";"ru:Gryady (Moskovskaya oblast)";"Gryady";""
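And there is the second file. It contains the full base of geographic coordinates. Each row has a place_id column whose value matches the place_id of a row in the first file. From the second file I want to copy the string in the geojson column and add it to the first file, matched by place_id. This file is larger than the first one (the first is about 5 MB, the second about 50 MB).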

I guess this is not hard for a knowledgeable programmer. I am not one :)

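I have tried a lot of code, but none of it worked for me. Below I list the code I tried. I am not good at programming, so maybe I do not understand what some of this code means.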

Please help me with my case.

###
Get-ChildItem -Filter .\comb\*.csv | Select-Object -ExpandProperty FullName | Import-Csv | Export-Csv .\combinedcsvs.csv -NoTypeInformation -Append
###

###
$DevData = (Import-Csv ".\pars_full_4_without_geo.csv" -Delimiter ";" -Encoding:UTF8)[1..10]
$ProdData = (Import-Csv ".\pars_full_4_only_geo.csv" -Delimiter ";" -Encoding:UTF8)[1..10]
# throw one set into a hashtable
# we can use this as a lookup table for the other set
$ProdTable = @{}
foreach($line in $ProdData){
    $ProdTable[$line.place_id] = $line.ID
}
# Output the DevData with the appropriate ProdData value
$DevData | Select-Object @{Label='DevID';Expression={$_.ID}},@{Label='ProdID';Expression={$ProdTable[$_.place_id]}},place_id | Export-Csv .\new2.csv -NoTypeInformation -Delimiter ";" -Encoding:UTF8
###

###
$f1=(Import-Csv ".\pars_full_4_without_geo.csv" -Delimiter ";" -Encoding:UTF8 -header "place_id","osm_type","osm_id","place_rank","boundingbox","lat","lon","display_name","class","type","importance","icon","postcode","city","town","village","hamlet","allotments","neighbourhood","suburb","city_district","state_district","building","address100","address26","address27","address29","county","state","country","country_code","place","population","wikidata","wikipedia","name","official_name")[1..1]
$f1
$f2=(Import-Csv ".\pars_full_4_only_geo.csv" -Delimiter ";" -Encoding:UTF8 -header samname,"place_id","osm_id","geojson")[1..1]
$f1|
   %{
      $geojson=$_.geojson
      $m=$f2|?{$_.geojson -eq $geojson}
      $_.place_id=$m.place_id
    }
$f1
###


###
#Make an empty hash table for the first file
$File1Values = @{}
#Import the first file and save the rows in the hash table indexed on "place_id"
Import-Csv ".\pars_full_4_only_geo.csv" -Delimiter ";" -Encoding:UTF8 | ForEach-Object {
  $File1Values.Add($_.place_id, $_)
}
#Import the second file and make a custom object with properties from both files
Import-Csv ".\pars_full_4_without_geo.csv" -Delimiter ";" -Encoding:UTF8 | ForEach-Object {
  [PsCustomObject]@{
    ABC = $File1Values[$_.KeyColumn].ABC;
    DEF = $File1Values[$_.KeyColumn].DEF;
    UVW = $_.UVW;
    XYZ = $_.XYZ;
  }
} | Export-Csv -Path c:\OutFile.csv
###

###
$Poproperties = @(
'worker_name',
'requester_name',
@{E={$Lookup_Hash.($_.field_834)};L='field_834'},
@{E={$Lookup_Hash.($_.field_835)};L='field_835'},
@{E={$Lookup_Hash.($_.field_836)};L='field_836'},
@{E={$Lookup_Hash.($_.field_837};L='field_837'},
@{E={$Lookup_Hash.($_.field_838)};L='field_838'}
)
Import-Csv -Path C:\S_FilePath | Select-Object -Property $Poproperties
###


###
$Lookup_Hash = Import-Csv ".\pars_full_4_only_geo.csv" -Delimiter ";" -Encoding:UTF8 | ForEach-Object -Process { $_.place_id = $_.name }
$S_File = Import-Csv ".\pars_full_4_without_geo.csv" -Delimiter ";" -Encoding:UTF8 | Select-Object -Property *,@{E={$Lookup_Hash.($_.place_id)};L='place_id'} | Export-Csv ".\pars_full_5_combine_geo.csv" -NoTypeInformation -Delimiter ";" -Encoding:UTF8
###

Here is a working example I put together that shows one way to do this.

I created two CSV files.

file1.csv

"id";"score"
"1";"90"
"3";"100"
file2.csv

"id";"firstname";"lastname"
"1";"steve";"jobs"
"2";"bill";"gates"
"3";"santa";"claus"
Then my PowerShell script, test.ps1

$csv1 = Import-Csv file1.csv -Delimiter ";"
$csv2 = Import-Csv file2.csv -Delimiter ";"
$csv1 |
    ForEach-Object {
        $row = $_
        # Scan csv2 for the row(s) with the same id as the current csv1 row
        if ($mtch = $csv2 | Where-Object { $_.id -eq $row.id }) {
            [pscustomobject]@{ id = $row.id; firstname = $mtch.firstname; lastname = $mtch.lastname; score = $row.score }
        }
    } | Export-Csv csv3.csv -NoTypeInformation
This is how I run the script (from the same directory as the CSV files)

powershell -ExecutionPolicy RemoteSigned .\test.ps1
This is the result, csv3.csv

"id","firstname","lastname","score"
"1","steve","jobs","90"
"3","santa","claus","100"
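
The script above scans all of csv2 once for every row of csv1, which gets slow as the files grow. One possible variation, shown here only as a sketch, is to index file2.csv by id in a hashtable first and then look each id up directly. The temporary folder name and the variable names `$byId` and `$hit` are assumptions made for this illustration; the sample data is the same as above.

```powershell
# Write the sample data from the answer to a temporary folder (hypothetical location).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'csv-join-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
Set-Content -Path (Join-Path $dir 'file1.csv') -Value '"id";"score"', '"1";"90"', '"3";"100"'
Set-Content -Path (Join-Path $dir 'file2.csv') -Value '"id";"firstname";"lastname"', '"1";"steve";"jobs"', '"2";"bill";"gates"', '"3";"santa";"claus"'

# Index file2.csv by id once, so each lookup is a hashtable access instead of a scan.
$byId = @{}
foreach ($row in Import-Csv (Join-Path $dir 'file2.csv') -Delimiter ';') {
    $byId[$row.id] = $row
}

# Walk file1.csv and emit a combined row for every id that has a match in file2.csv.
$joined = foreach ($row in Import-Csv (Join-Path $dir 'file1.csv') -Delimiter ';') {
    $hit = $byId[$row.id]
    if ($hit) {
        [pscustomobject]@{
            id        = $row.id
            firstname = $hit.firstname
            lastname  = $hit.lastname
            score     = $row.score
        }
    }
}

$joined | Export-Csv (Join-Path $dir 'csv3.csv') -NoTypeInformation
```

If an id occurs more than once in file2.csv, this sketch keeps only the last row for it, which differs from the Where-Object version.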

Adding the code that fit my task. I split the solution into 3 steps.

$FileWithOutGeom = Import-Csv ".\FileWithOutGeom.csv" -Delimiter ';' -Encoding UTF8

# Step 1. Get all IDs from the file without coordinates: sort by ID and select the place_id column values. Join them with the delimiter '|' to bring the data into a format suitable for the next step (for Where-Object -Match).
$ID = [string]::Join("|",( $FileWithOutGeom | sort place_id | Select-Object -ExpandProperty 'place_id'))

# Step 2. Take the second file with all coordinates, select only the rows whose ID is present in the first file, and sort them by ID too.
$FileWithAllGeom = Import-Csv ".\FileWithAllGeom.csv" -Delimiter ';' -Encoding UTF8 | Where-Object -property place_id -Match $ID | sort place_id

# Step 3. Take the first file without geometry and use Add-Member to add a new column (geojson), filling it with the values from step 2 and incrementing the index for each object.
$FileWithOutGeom | ForEach-Object -Begin {$i = 0} {$_ | Add-Member -MemberType NoteProperty -Name 'geojson' -Value ($FileWithAllGeom)[$i++].geojson -PassThru
} | Export-Csv ".\CombinedFile.csv" -NoTypeInformation -Delimiter ";" -Encoding:UTF8
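As output I get a file in which the "geojson" column follows the columns of the first file. Sorry for the possibly bad code; I assembled it from snippets found on the web.
This solution worked very well for my task. A file of about 50 MB and another of about 20 MB are processed in less than 10 seconds.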
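
Note that step 3 pairs rows by position, so it relies on the first file already being in the same place_id order as the sorted rows from step 2, and on every ID being present exactly once in the second file. A hashtable lookup keyed on place_id is one way to drop that assumption. The following is only a sketch: the file names, the tiny data set, and the `$geoById` variable are made up for illustration and are not the real files.

```powershell
# Tiny stand-in files (hypothetical names and values) written to a temporary folder.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'geo-join-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$noGeo  = Join-Path $dir 'without_geo.csv'
$allGeo = Join-Path $dir 'all_geo.csv'
Set-Content -Path $noGeo  -Encoding UTF8 -Value '"place_id";"name"', '"30";"C"', '"10";"A"', '"99";"X"'
Set-Content -Path $allGeo -Encoding UTF8 -Value '"place_id";"geojson"', '"10";"geo-A"', '"20";"geo-B"', '"30";"geo-C"'

# Map each place_id in the file with coordinates to its geojson string.
$geoById = @{}
Import-Csv $allGeo -Delimiter ';' -Encoding UTF8 | ForEach-Object {
    $geoById[$_.place_id] = $_.geojson
}

# Keep every column of the first file and append geojson, looked up by place_id.
# Row order does not matter; rows without a match get an empty geojson value.
$combined = Import-Csv $noGeo -Delimiter ';' -Encoding UTF8 |
    Select-Object -Property *, @{ Name = 'geojson'; Expression = { $geoById[$_.place_id] } }

$combined | Export-Csv (Join-Path $dir 'combined.csv') -NoTypeInformation -Delimiter ';' -Encoding UTF8
```

In this sketch the first file is deliberately unsorted and contains an ID (99) that is missing from the second file, to show that neither case shifts values onto the wrong row.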

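Thx! This works great for small files! My files are about 50 MB. It has been 40 minutes now and the process has not finished... I need to look for a workaround. Thx for responding to my question!

Why not import into a database? Also, csv1 can be set to the larger file, so that you loop over the larger file and check against the smaller one.

About importing into a database: good question :) If I cannot find an acceptable option, I think that will be my decision. I know how to prepare the CSV files I need for my task in a Windows environment. I am not on good terms with databases. I will see... How do I make csv1 the larger file?

I added my own solution, which works very well for my task :) All good :) Thx!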
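
The suggestion to loop over the larger file and check against the smaller one could look roughly like the sketch below. It collects the IDs of the smaller file in a `HashSet` and streams the larger file through the pipeline, keeping only rows whose place_id is in the set. File names, data, and the `$wanted` and `$kept` variables are hypothetical. Unlike a joined pattern used with `-match`, which also accepts IDs that merely contain one of the listed IDs, the set test is an exact comparison.

```powershell
# Tiny stand-in files (hypothetical names and values) written to a temporary folder.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'geo-filter-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$small = Join-Path $dir 'small.csv'
$large = Join-Path $dir 'large.csv'
Set-Content -Path $small -Value '"place_id";"name"', '"10";"A"', '"30";"C"'
Set-Content -Path $large -Value '"place_id";"geojson"', '"10";"geo-A"', '"100";"geo-long"', '"30";"geo-C"'

# Collect the IDs of the smaller file in a set for exact-match membership tests.
$wanted = New-Object 'System.Collections.Generic.HashSet[string]'
Import-Csv $small -Delimiter ';' | ForEach-Object { [void]$wanted.Add($_.place_id) }

# Stream the larger file and keep only the rows whose ID is in the set.
# The row with place_id 100 is dropped even though it contains "10".
$kept = @(Import-Csv $large -Delimiter ';' | Where-Object { $wanted.Contains($_.place_id) })

$kept | Export-Csv (Join-Path $dir 'filtered.csv') -NoTypeInformation -Delimiter ';'
```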