Parsing XML using the command line
How can I parse XML that contains the following?
<?xml version="1.0"?>
<saw:ibot xmlns:saw="com.siebel.analytics.web/report/v1" version="1" priority="normal" jobID="36 ">
<saw:schedule timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)" disabled="false">
<saw:start repeatMinuteInterval="60" endTime="23:59:00" startImmediately="true"/>
<saw:recurrence runOnce="false">
<saw:weekly weekInterval="1" mon="true" tue="true" wed="true" thu="true" fri="true"/>
</saw:recurrence>
</saw:schedule>
<saw:dataVisibility type="recipient" runAs="cgm"/>
<saw:choose>
<saw:when condition="true">
<saw:deliveryContent>
<saw:headline>
<saw:caption>
<saw:text>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text>
</saw:caption>
</saw:headline>
<saw:conditionalReport/>
</saw:deliveryContent>
<saw:postActions/>
</saw:when>
<saw:otherwise/>
</saw:choose>
<saw:deliveryDestinations>
<saw:destination category="dashboard"/>
<saw:destination category="activeDeliveryProfile"/>
</saw:deliveryDestinations>
<saw:recipients subscribers="true" customize="false" specificRecipients="false">
<saw:subscribers>
<saw:user name="mbussey@xyz.com"/>
<saw:user name="kimmy.chan@pqr.com"/>
<saw:user name="chudgins@gmail.com"/>
</saw:subscribers>
</saw:recipients>
<saw:conditionQuery>
<saw:reportRefNode path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content"/>
</saw:conditionQuery>
</saw:ibot>
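I also have another file, 5.xml, which has a different set of name values to parse. Is there any way to parse and merge them on the command line and write the output to a single file?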
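I tried sed and awk, but they were not much help in getting the desired output.

This command parses the XML document and uses XPath to extract the value of the name attribute of the elements at /saw:ibot/saw:recipients/saw:subscribers/saw:user: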
xmlstarlet sel -t -v '/saw:ibot/saw:recipients/saw:subscribers/saw:user/@name' </tmp/xml
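The question also asks about merging the values from a second file into one output. As a rough sketch of the same XPath idea using only Python's standard library (not one of the tools used in this thread), something like the following could work. The function names and the output format (one name per line) are assumptions made for illustration, and the inputs must be well-formed XML.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns:saw declaration in the sample document.
NS = {"saw": "com.siebel.analytics.web/report/v1"}

# Path relative to the root element (saw:ibot), mirroring the XPath above.
USER_PATH = "saw:recipients/saw:subscribers/saw:user"


def user_names(path):
    """Return the name attribute of every saw:user element in one file."""
    root = ET.parse(path).getroot()  # raises ParseError if not well-formed
    return [user.get("name") for user in root.findall(USER_PATH, NS)]


def merge_user_names(paths, out_path):
    """Collect the names from all input files and write them to out_path."""
    names = []
    for path in paths:
        names.extend(user_names(path))
    with open(out_path, "w", encoding="utf-8") as out:
        out.write("".join(name + "\n" for name in names))
    return names
```

A call such as merge_user_names(["36.xml", "5.xml"], "merged.txt") would then write all subscriber names to a single file. Only 5.xml is named in the question; the other two file names are hypothetical.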
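Use an XML parser. Personally, I like XML::Twig and perl:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

# Parse the whole document into an in-memory twig.
my $twig = XML::Twig->new();
$twig->parsefile('your_file.xml');

# Print the name attribute of every saw:user element.
foreach my $saw_user ( $twig->get_xpath('//saw:user') ) {
    print $saw_user->att('name'), "\n";
}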
This prints:
mbussey@xyz.com
kimmy.chan@pqr.com
chudgins@gmail.com
If you want a one-liner instead:
perl -MXML::Twig -0777 -e 'print map { $_->att("name") . "\n" } ( XML::Twig->parse( <> )->get_xpath("//saw:user") )' your_xml_file
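For the sake of future maintenance programmers and sysadmins, please don't use regular expressions to parse XML. Why, you might ask? Because, taking your XML as an example, it can look like any of these and still be semantically the same:
(your example, reformatted)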
<?xml version="1.0" encoding="utf-8"?>
<saw:ibot
jobID="36"
priority="normal"
version="1"
xmlns:saw="com.siebel.analytics.web/report/v1">
<saw:schedule
disabled="false"
timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)">
<saw:start
endTime="23:59:00"
repeatMinuteInterval="60"
startImmediately="true"
/>
<saw:recurrence runOnce="false">
<saw:weekly
fri="true"
mon="true"
thu="true"
tue="true"
wed="true"
weekInterval="1"
/>
</saw:recurrence>
</saw:schedule>
<saw:dataVisibility
runAs="cgm"
type="recipient"
/>
<saw:choose>
<saw:when condition="true">
<saw:deliveryContent>
<saw:headline>
<saw:caption>
<saw:text>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text>
</saw:caption>
</saw:headline>
<saw:conditionalReport/>
</saw:deliveryContent>
<saw:postActions/>
</saw:when>
<saw:otherwise/>
</saw:choose>
<saw:deliveryDestinations>
<saw:destination category="dashboard" />
<saw:destination category="activeDeliveryProfile" />
</saw:deliveryDestinations>
<saw:recipients
customize="false"
specificRecipients="false"
subscribers="true">
<saw:subscribers>
<saw:user name="mbussey@xyz.com" />
<saw:user name="kimmy.chan@pqr.com" />
<saw:user name="chudgins@gmail.com" />
</saw:subscribers>
</saw:recipients>
<saw:conditionQuery>
<saw:reportRefNode path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content" />
</saw:conditionQuery>
</saw:ibot>
Or like this (note how the element tags wrap across lines):
<?xml version="1.0" encoding="utf-8"?>
<saw:ibot
jobID="36"
priority="normal"
version="1"
xmlns:saw="com.siebel.analytics.web/report/v1"
><saw:schedule
disabled="false"
timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)"
><saw:start
endTime="23:59:00"
repeatMinuteInterval="60"
startImmediately="true"
/><saw:recurrence
runOnce="false"
><saw:weekly
fri="true"
mon="true"
thu="true"
tue="true"
wed="true"
weekInterval="1"
/></saw:recurrence></saw:schedule><saw:dataVisibility
runAs="cgm"
type="recipient"
/><saw:choose
><saw:when
condition="true"
><saw:deliveryContent
><saw:headline
><saw:caption
><saw:text
>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text></saw:caption></saw:headline><saw:conditionalReport
/></saw:deliveryContent><saw:postActions
/></saw:when><saw:otherwise
/></saw:choose><saw:deliveryDestinations
><saw:destination
category="dashboard"
/><saw:destination
category="activeDeliveryProfile"
/></saw:deliveryDestinations><saw:recipients
customize="false"
specificRecipients="false"
subscribers="true"
><saw:subscribers
><saw:user
name="mbussey@xyz.com"
/><saw:user
name="kimmy.chan@pqr.com"
/><saw:user
name="chudgins@gmail.com"
/></saw:subscribers></saw:recipients><saw:conditionQuery
><saw:reportRefNode
path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content"
/></saw:conditionQuery></saw:ibot>
Hopefully, looking at these examples, you can see that your regular expression may one day mysteriously break simply because the XML was reformatted in a perfectly valid way.
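To make the reformatting point concrete, here is a small sketch in Python (standard library only, not a tool used in this thread). It runs a deliberately naive regular expression and a real parser over two layouts of the same subscriber list. The regex, the trimmed-down documents, and the function names are all illustrative assumptions.

```python
import re
import xml.etree.ElementTree as ET

NS = {"saw": "com.siebel.analytics.web/report/v1"}

# Compact layout: each saw:user tag sits on one line.
COMPACT = (
    '<saw:ibot xmlns:saw="com.siebel.analytics.web/report/v1">'
    '<saw:recipients><saw:subscribers>'
    '<saw:user name="mbussey@xyz.com"/>'
    '<saw:user name="kimmy.chan@pqr.com"/>'
    '</saw:subscribers></saw:recipients></saw:ibot>'
)

# Same content, but with the tags wrapped across lines.
WRAPPED = """<saw:ibot
  xmlns:saw="com.siebel.analytics.web/report/v1"
  ><saw:recipients
  ><saw:subscribers
  ><saw:user
    name="mbussey@xyz.com"
  /><saw:user
    name="kimmy.chan@pqr.com"
  /></saw:subscribers></saw:recipients></saw:ibot>"""

# A naive pattern that assumes exactly one space before the attribute.
NAIVE = re.compile(r'<saw:user name="([^"]*)"')


def names_by_regex(text):
    """Extract names with the naive regex; depends on the exact layout."""
    return NAIVE.findall(text)


def names_by_parser(text):
    """Extract names with a namespace-aware parser; layout does not matter."""
    root = ET.fromstring(text)
    return [user.get("name") for user in root.iterfind(".//saw:user", NS)]
```

With these inputs, names_by_regex finds both addresses in COMPACT but returns an empty list for WRAPPED, while names_by_parser returns the same two names for both layouts.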
1. Don't use sed or awk to parse XML. 2. We can't provide an example of running code without seeing the XML that contains the data you want to retrieve. 3. Don't use sed or awk to parse XML. 4. Please provide a minimal example XML file. 5. Don't use sed or awk to parse XML.
I've formatted your question, so the XML is visible now. Unfortunately, your example is not a valid XML document.
You need to format the content. In this case, that means indenting it by four spaces using the {} button. I'll do it for you again... This is still not a valid XML document: /tmp/xml:33.18: Opening and ending tag mismatch: subscribers line 29 and recipients, and other errors.
@G-Man, I don't think this is a duplicate: this question is about parsing a well-formed XML document, whereas your proposed duplicate needs a different solution because HTML may not be well-formed. I don't think it's off-topic either, fwiw.
On another note: people also seem to like (site temporarily down, along with the rest of SourceForge).
Well, if you say so. Not for me; I find it hard to understand how it works.