XML parent/child attributes and elements in PowerShell
I have some XML data with many attributes and multiple elements of the same name, and I want to flatten it into a CSV file. The XML looks like this:
<?xml version="1.0" encoding="utf-8"?>
<SEGMENTS>
<SEGMENT NAME="webcluster">
<RESULTPAGE>
<RESULTSET FIRSTHIT="1" LASTHIT="100" HITS="100" TOTALHITS="100">
<HIT NO="1" RANK="19000" SITEID="0" MOREHITS="100">
<FIELD NAME="rank">19000</FIELD>
<FIELD NAME="id">1</FIELD>
<FIELD NAME="url">C:\website.com\folder1\file1.txt</FIELD>
<FIELD NAME="filename">file1.txt</FIELD>
<FIELD NAME="path">https://website.com/folder1/</FIELD>
</HIT>
<HIT NO="2" RANK="19000" SITEID="0" MOREHITS="100">
<FIELD NAME="rank">19000</FIELD>
<FIELD NAME="id">2</FIELD>
<FIELD NAME="url">C:\website.com\folder1\file2.txt</FIELD>
<FIELD NAME="filename">file2.txt</FIELD>
<FIELD NAME="path">https://website.com/folder1/</FIELD>
</HIT>
<HIT NO="3" RANK="18999" SITEID="0" MOREHITS="100">
<FIELD NAME="rank">18999</FIELD>
<FIELD NAME="id">3</FIELD>
<FIELD NAME="url">C:\website.com\folder5\file3.txt</FIELD>
<FIELD NAME="filename">file3.txt</FIELD>
<FIELD NAME="path">C:\website.com\folder\</FIELD>
</HIT>
</RESULTSET>
</RESULTPAGE>
</SEGMENT>
</SEGMENTS>
My code is:
[xml]$xml=Get-Content .\xmlfile.xml
$hits = $xml.segments.segment.resultpage.resultset.hit
foreach($hit in $hits)
{
foreach($field in $hit.field)
{
if (field."NAME" -eq 'url')
{
write-output $hit.no $field."#VALUE"
}
}
}
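I keep getting errors. I can reach the individual elements and attributes by ordinal position ($hits[0].field[4]), but I want the output to stay correct if the FIELD elements ever appear in a different order.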
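Can anyone suggest how to do this? I tried Select-Xml and found it more cumbersome, but perhaps it is the more elegant approach.

Something like this seems to do the trick, although I don't love it: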
[xml]$xml=Get-Content .\xmlfile.xml
$hits = $xml.segments.segment.resultpage.resultset.hit
foreach($hit in $hits)
{
$result = new-object PSObject -Property @{ hit = $hit.no; filename = ""; path = ""}
foreach($field in $hit.field)
{
if ($field."NAME" -eq 'url')
{
$result.path = $field."#text"
}
if ($field."NAME" -eq 'filename')
{
$result.filename = $field."#text"
}
}
write-output $result
}
Or just grab all the fields and then select the relevant ones:
[xml]$xml=Get-Content .\xmlfile.xml
$hits = $xml.segments.segment.resultpage.resultset.hit
foreach($hit in $hits)
{
$result = new-object PSObject -Property @{ hit = $hit.no }
$hit.field | % { Add-Member -InputObject $result -MemberType NoteProperty -Name $_."NAME" -Value $_."#text"}
$result | select hit,url,filename | write-output
}
Try this:
Select-Xml -Xml $xml -XPath '//HIT' | Foreach {
$num=$_.Node.NO
$filenameAttr = $_.Node.Field | where {$_.Name -eq 'filename'}
$pathAttr = $_.Node.Field | where {$_.Name -eq 'path'}
new-object psobject -Property ([ordered]@{HIT=$num; filename = $filenameAttr.InnerText; path = $pathAttr.InnerText})
}
Combining the approaches. Select lets you get the fields in a specific order:
[xml]$xml=Get-Content .\xmlfile.xml
$hits = $xml.segments.segment.resultpage.resultset.hit
foreach($hit in $hits)
{
$r = @{hit = $hit.no; url = "N/A";filename="N/A"}
$hit.field | % { $r[$_."NAME"] = $_."#text" }
New-Object PSObject -Property $r | Select hit,url,filename
}
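To get from these objects to the CSV file the question asks for, the loop output can be collected and piped to ConvertTo-Csv or Export-Csv. The following is only a sketch: the inline XML is a shortened copy of the sample data, and the variable names $rows and $csv and the output path .\hits.csv are assumptions for illustration.

```powershell
# Shortened copy of the sample XML so the sketch is self-contained.
[xml]$xml = @'
<SEGMENTS><SEGMENT NAME="webcluster"><RESULTPAGE><RESULTSET>
<HIT NO="1"><FIELD NAME="url">C:\website.com\folder1\file1.txt</FIELD><FIELD NAME="filename">file1.txt</FIELD></HIT>
<HIT NO="2"><FIELD NAME="url">C:\website.com\folder1\file2.txt</FIELD><FIELD NAME="filename">file2.txt</FIELD></HIT>
</RESULTSET></RESULTPAGE></SEGMENT></SEGMENTS>
'@

# Build one flat object per HIT, keyed by the NAME attribute of each FIELD.
$rows = foreach ($hit in $xml.SEGMENTS.SEGMENT.RESULTPAGE.RESULTSET.HIT) {
    $r = @{ hit = $hit.NO; url = 'N/A'; filename = 'N/A' }
    $hit.FIELD | ForEach-Object { $r[$_.GetAttribute('NAME')] = $_.InnerText }
    # Select-Object fixes the column order, so [ordered] is not needed.
    New-Object PSObject -Property $r | Select-Object hit, url, filename
}

# Convert to CSV text; the first line is the header row.
$csv = $rows | ConvertTo-Csv -NoTypeInformation
$csv

# To write a file instead (hypothetical path):
# $rows | Export-Csv -Path .\hits.csv -NoTypeInformation
```

Because the columns come from Select-Object rather than from the position of each FIELD, the CSV layout should stay the same even if the fields arrive in a different order.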
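Keith, I tried the code above and got: Unable to find type [ordered]: make sure that the assembly containing this type is loaded. At line:6 char:45 + new-object psobject -Property ([ordered]

Ah, that is a new feature in PowerShell V3. You can remove it, but then the order of the properties in the created object will be arbitrary.

That works! Thanks!

Someone suggested this offline: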
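[xml]$hitfile = Get-Content fastxml.xml
foreach ($hit in $hitfile.segments.segment.resultpage.resultset.hit) {
    # $mjaTable is assumed to be a System.Data.DataTable created earlier
    $row = $mjaTable.NewRow()
    $row.hit = $hit.No
    $row.InternalID = $hit.field | ? { $_.name -eq 'InternalID' } | foreach { $_.'#text' }
    $row.URL = $hit.field | ? { $_.name -eq 'URL' } | foreach { $_.'#text' }
    $mjaTable.Rows.Add($row)   # assumption: the row is added to the table here
}
$mjaTable | Format-Table -AutoSize

I'm still curious whether it is possible to reference a FIELD element by its NAME attribute ("url" or "Internalid") and retrieve the "#text" value without a Where-Object filter, but if not, these two solutions get the job done. Thanks again!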
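On the follow-up about reading a FIELD by its NAME attribute without Where-Object: one option is an XPath predicate evaluated relative to each HIT node, using the SelectSingleNode method of the underlying .NET XmlNode. A minimal sketch, assuming a shortened copy of the sample XML and the illustrative variable name $rows. XPath is case-sensitive, so FIELD and NAME must match the casing in the document.

```powershell
# Shortened copy of the sample XML so the sketch is self-contained.
[xml]$xml = @'
<SEGMENTS><SEGMENT NAME="webcluster"><RESULTPAGE><RESULTSET>
<HIT NO="1"><FIELD NAME="url">C:\website.com\folder1\file1.txt</FIELD><FIELD NAME="filename">file1.txt</FIELD></HIT>
<HIT NO="2"><FIELD NAME="url">C:\website.com\folder1\file2.txt</FIELD><FIELD NAME="filename">file2.txt</FIELD></HIT>
</RESULTSET></RESULTPAGE></SEGMENT></SEGMENTS>
'@

$rows = foreach ($hit in $xml.SelectNodes('//HIT')) {
    New-Object PSObject -Property @{
        hit      = $hit.GetAttribute('NO')
        # The predicate [@NAME='...'] picks the FIELD by attribute value,
        # independent of its position among its siblings.
        url      = $hit.SelectSingleNode("FIELD[@NAME='url']").InnerText
        filename = $hit.SelectSingleNode("FIELD[@NAME='filename']").InnerText
    } | Select-Object hit, url, filename
}
$rows
```

If a HIT has no matching FIELD, SelectSingleNode returns $null and the property ends up empty (outside strict mode), so a default such as "N/A" would need to be added separately.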