Bash: converting XML to a pipe-delimited output file with awk or sed


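How can I use awk or sed to convert the following XML tags into a pipe-delimited text file? I tried the awk command quoted in the comments, but it does not return the full text of the content tag. Any help would be appreciated.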

Input_file.dat

        <entry>
            <updated>2014-05-17T16:34:00-07:00</updated>
                <id>994568497</id>
                <title>No longer usable</title>
                <content type="text">I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.</content>
                <im:contentType term="Application" label="Application"/>
                <im:voteSum>0</im:voteSum>
                <im:voteCount>0</im:voteCount>
                <im:rating>1</im:rating>
                <im:version>4.2.0.165</im:version>
                <author><name>Arcdouble</name><uri>https://test.com/us/reviews/id199894255</uri></author>
        </entry>

The following code should work for you:

perl -ne '/<\/entry>/ && print "\n"; />(.*?)</ && !/<name>/  && print $1."|"; /<name>/ && /name>?(.*?)<\/.*?(uri>?)(.*)?<\/uri/ && print $1."|".$3'

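Comments:

You would have better luck with something like XSLT, or at least an XML parser such as the ElementTree module that ships with Python, than with awk or sed. Those tools are designed to process records (organized fields of information) and lines, respectively, not the hierarchical structure of XML.

Yes, that's true, but I am trying to do this in a bash script. I tried the following command to return the values, but it sometimes truncates the text message: awk -F'[]' '{ORS="|"}; ... > "output_file.csv"}' Input_file.dat

Please use a proper XML parser. Pick any language you like; there are many good parsers, and one of them will be able to transform this. I would post an answer, but you have not shown the XML namespaces.

Please don't parse XML with regexps.

Sometimes we just need a one-liner to get the job done, but thanks for the advice :)

No, don't parse XML with regular expressions. Please don't. Don't even argue that you need to get the job done, because the job is broken from the start. Just don't parse XML with regular expressions. Trust me. Since you are using Perl, use a proper parser such as LibXML.

My answer was actually written in a humorous spirit (:)). Make sure you follow both links (the second one in particular is very good).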
Example run of the one-liner on a file containing two entries:
tiago@dell:~$ cat file
        <entry>
            <updated>2014-05-17T16:34:00-07:00</updated>
                <id>994568497</id>
                <title>No longer usable</title>
                <content type="text">I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.</content>
                <im:contentType term="Application" label="Application"/>
                <im:voteSum>0</im:voteSum>
                <im:voteCount>0</im:voteCount>
                <im:rating>1</im:rating>
                <im:version>4.2.0.165</im:version>
                <author><name>Arcdouble</name><uri>https://test.com/us/reviews/id199894255</uri></author>
        </entry>
        <entry>
            <updated>2014-05-17T16:34:00-07:00</updated>
                <id>994568497</id>
                <title>No longer usable</title>
                <content type="text">I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.</content>
                <im:contentType term="Application" label="Application"/>
                <im:voteSum>0</im:voteSum>
                <im:voteCount>0</im:voteCount>
                <im:rating>1</im:rating>
                <im:version>4.2.0.165</im:version>
                <author><name>Arcdouble</name><uri>https://test.com/us/reviews/id199894255</uri></author>
        </entry>
tiago@dell:~$ cat file | perl -ne '/<\/entry>/ && print "\n"; />(.*?)</ && !/<name>/  && print $1."|"; /<name>/ && /name>?(.*?)<\/.*?(uri>?)(.*)?<\/uri/ && print $1."|".$3' 
2014-05-17T16:34:00-07:00|994568497|No longer usable|I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.|0|0|1|4.2.0.165|Arcdouble|https://test.com/us/reviews/id199894255
2014-05-17T16:34:00-07:00|994568497|No longer usable|I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.|0|0|1|4.2.0.165|Arcdouble|https://test.com/us/reviews/id199894255
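
Several commenters recommend a real XML parser, and one names Python's ElementTree. The sketch below shows roughly what that could look like; it is not part of the original answer. It assumes the input is a bare fragment of `<entry>` elements like the sample, with no XML declaration and no declaration of the `im:` prefix, so it wraps the fragment in a synthetic root element. The namespace URI `urn:example:im`, the function name `entries_to_rows`, and the choice to replace literal pipes with `/` are all placeholders made up for illustration.

```python
import sys
import xml.etree.ElementTree as ET

# Hypothetical placeholder: the question never shows how the im: prefix is declared.
IM_NS = "urn:example:im"


def entries_to_rows(fragment):
    """Return one pipe-delimited string per <entry> in an XML fragment."""
    # Wrap the fragment in a synthetic root so that several top-level
    # <entry> elements and the undeclared im: prefix both parse cleanly.
    root = ET.fromstring('<feed xmlns:im="%s">%s</feed>' % (IM_NS, fragment))
    rows = []
    for entry in root.iter("entry"):
        fields = []
        for elem in entry.iter():
            # Keep only leaf elements that actually carry text; this skips
            # <entry>, <author> and the empty <im:contentType/> element.
            if len(elem) == 0 and elem.text and elem.text.strip():
                # Collapse newlines and runs of spaces so multi-line content
                # stays on one output line, and keep the delimiter unambiguous.
                text = " ".join(elem.text.split()).replace("|", "/")
                fields.append(text)
        rows.append("|".join(fields))
    return rows


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py Input_file.dat > output_file.csv
    with open(sys.argv[1], encoding="utf-8") as handle:
        for row in entries_to_rows(handle.read()):
            print(row)
```

Because the parser works on elements rather than lines, a `<content>` value that spans several lines or contains `<` and `>` as escaped entities is returned whole. If the real feed declares its namespaces on an enclosing element, parse that document directly instead of wrapping it.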