Error when trying to split an XML file using the XML::LibXML module

xml perl xpath

I have been trying to split XML data using this module, but it throws the following error:

Can't call method "findnodes" without a package or object reference

My input:

<xml>
  <bhap id="1">
    <label>cylind - I</label>
    <title>premier</title>
    <rect id="S1">
      <title>Short</title>
      <label>1.</label>
      <p><text>welcome</text></p>
    </rect>
    <rect id="S2">
      <title>Definite</title>
      <label>2.</label>
      <p><text>welcome1</text></p>
    </rect>
  </bhap>
  <bhap id="2">
    <label>cylind – II</label>
    <title>AUTHORITIES AND ITS EMPLOYEES</title>
    <rect id="S3">
      <title>nauty.&#x2014;</title>
      <label>3.</label>
      <p><text>welcome3</text></p>
    </rect>
    <rect id=S4">
      <title>Term</title>
      <label>4.</label>
      <p><text>welcome4</text></p>
    </rect>
  </bhap>
</xml>

Desired output, one document per rect (the S3 case is shown):
<xml>
  <bhap id="2">
    <label>cylind – II</label>
    <title>AUTHORITIES AND ITS EMPLOYEES</title>
    <rect id="S3">
      <title>nauty.&#x2014;</title>
      <label>3.</label>
      <p><text>welcome3</text></p>
    </rect>
  </bhap>
</xml>


My code:

use XML::LibXML;

my $file   = shift || die "usage $0 <xmlfile>";
my $parser = XML::LibXML->new();
my $doc    = $parser->parse_file($file);

my @nodes = $doc->findnodes('//bhap');
foreach my $node1 (@nodes) {

    my $bhap = $node1->toString(), "\n";

    if ( $bhap =~ m/(<bhap.+?>.+?<\/title>)(.+?)(<\/bhap>)/is ) {

        my $bhap1 = $1;
        my $bhap2 = $2;
        my $bhap3 = $3;

        my $nodes1 = $bhap->findnodes('//rect');
        foreach my $node (@$nodes1) {

            my $rect = $node->toString();

            if ( $rect =~ m/(<rect\s*id="(.+?)">.+?<\/rect>)/is ) {

                my $var1 = $1;
                my $var2 = $2;

                print "file" $var2;
                print "<xml>" print $bhap1;
                print $var1;
                print $bhap3;
                print "</xml>";
            }
        }
    }
}

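OK, so you made a good start, but then fell into the "regular expression" trap. Parsing XML with regular expressions is a bad idea because doing it properly is too complicated: you need to handle and validate tag nesting, line breaks and all sorts of other things, which makes regex-based code brittle. So please don't.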
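
As a rough sketch of what working through the DOM instead of regexes could look like with XML::LibXML, the helper below walks each bhap, calls findnodes on the node itself (not on its serialized text), and builds one small document per rect. The name split_by_rect and the choice to return a hash keyed by rect id are assumptions made for illustration, not something from the question.

```perl
use strict;
use warnings;
use XML::LibXML;

# split_by_rect is a hypothetical helper name, not part of XML::LibXML.
# It returns a hash ref mapping each rect id to a serialized document that
# holds that rect together with a copy of its parent bhap.
sub split_by_rect {
    my ($xml_string) = @_;
    my $doc       = XML::LibXML->load_xml( string => $xml_string );
    my $root_name = $doc->documentElement->nodeName;
    my %split;

    for my $bhap ( $doc->findnodes('//bhap') ) {
        # Call findnodes on the node, never on its toString() output.
        for my $rect ( $bhap->findnodes('./rect') ) {
            my $new_doc  = XML::LibXML::Document->new( '1.0', 'UTF-8' );
            my $new_root = $new_doc->createElement($root_name);
            $new_doc->setDocumentElement($new_root);

            # importNode makes a deep copy owned by the new document.
            my $bhap_copy = $new_doc->importNode($bhap);
            $new_root->appendChild($bhap_copy);

            # Drop every rect from the copy, then add back the one we want.
            $_->unbindNode for $bhap_copy->findnodes('./rect');
            $bhap_copy->appendChild( $new_doc->importNode($rect) );

            $split{ $rect->getAttribute('id') } = $new_doc->toString;
        }
    }
    return \%split;
}

# Small made-up input, shaped like the XML in the question.
my $sample = <<'XML';
<xml>
  <bhap id="1">
    <title>premier</title>
    <rect id="S1"><label>1.</label></rect>
    <rect id="S2"><label>2.</label></rect>
  </bhap>
</xml>
XML

my $split = split_by_rect($sample);
print "file $_\n$split->{$_}\n" for sort keys %$split;
```

The attribute value comes from getAttribute and the nesting is handled by the parser, so no pattern has to guess where a tag ends.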

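But most importantly, always turn on use strict and use warnings before posting a question. They are your first port of call for troubleshooting.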
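
To illustrate the kind of hint warnings can give, here is a minimal sketch that compiles a line shaped like my $bhap = $node1->toString(), "\n"; from the question and collects whatever perl warns about. The string eval and the @caught array exist only to capture the message inside one script; normally you would simply see the warning on STDERR.

```perl
use strict;
use warnings;

my @caught;
{
    # Collect warnings instead of letting them go to STDERR.
    local $SIG{__WARN__} = sub { push @caught, @_ };

    # Same shape as the question's line: the trailing "\n" after the
    # comma is a separate expression that is evaluated and thrown away.
    eval q{
        my $bhap = "some text", "\n";
        1;
    };
}

print "warning: $_" for @caught;
```

The stray "\n" in that line does nothing, which is the sort of slip that is easy to miss by eye.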

If you had, you would have been pointed at lines such as this one:

print "file" $var2;

That simply does not work. Plenty of other things in your code will not work properly either, so this really is the place to start.

Also, your XML is not valid: I think your "S4" is missing a quote.
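
One way to catch this kind of problem before doing anything else is to let the parser check well-formedness. The sketch below assumes XML::LibXML is installed; xml_error is a hypothetical helper name, and the two sample strings are cut-down versions of the rect from the question.

```perl
use strict;
use warnings;
use XML::LibXML;

# xml_error is a hypothetical helper: it returns undef when the text
# parses, or the parser's error message when it does not.
sub xml_error {
    my ($xml_string) = @_;
    my $ok = eval { XML::LibXML->load_xml( string => $xml_string ); 1 };
    return $ok ? undef : "$@";
}

my $bad  = '<xml><rect id=S4"><title>Term</title></rect></xml>';
my $good = '<xml><rect id="S4"><title>Term</title></rect></xml>';

for my $case ( [ bad => $bad ], [ good => $good ] ) {
    my ( $name, $xml ) = @$case;
    my $error = xml_error($xml);
    print "$name: ", ( defined $error ? "rejected" : "accepted" ), "\n";
}
```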

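Anyway, assuming that is just a typo, I would start with XML::Twig (because I know it better than LibXML, not for any particular reason) and do something like this: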

#!/usr/bin/perl

use strict;
use warnings;
use XML::Twig;

my %children_of;

#as we process, extract all the 'rect' elements - along with a reference to their context.
sub process_rect {
    my ( $twig, $rect ) = @_;
    push( @{ $children_of{ $rect->parent } }, $rect->cut );
}


my $twig = XML::Twig->new(
    'pretty_print'  => 'indented',
    'twig_handlers' => { 'rect' => \&process_rect },

);

$twig->parse( \*DATA );

#run through all the 'bhap' elements. 
foreach my $bhap ( $twig->root->children('bhap') ) {
    #find the rect elements under this bhap. 
    foreach my $rect ( @{ $children_of{$bhap} } ) {
        #create a new XML document - copy the 'root' name from your original document. 
        my $xml    = XML::Twig::Elt->new( $twig -> root -> name );
        #duplicate this 'bhap' element by copying it, rather than cutting it,
        #so we can paste it more than once (e.g. per 'rect')
        my $subset = $bhap->copy;
        #insert the 'bhap' into our new xml. 
        $subset->paste( last_child => $xml );
        #insert our cut rect beneath this bhap. 
        $rect->paste( last_child => $subset );

        #print the resulting XML. 
        print "--\n";
        $xml->print;
    }
}

__DATA__
<xml>

<bhap id="1">
                <label>cylind - I</label>
                <title>premier</title>
                <rect id="S1">
                    <title>Short</title>
                    <label>1.</label>
                    <p><text>welcome</text></p>
                </rect>
                <rect id="S2">
                    <title>Definite</title>
                    <label>2.</label>
                    <p><text>welcome1</text></p>
                </rect>
        </bhap>
            <bhap id="2">
                <label>cylind - II</label>
                <title>AUTHORITIES AND ITS EMPLOYEES</title>

                <rect id="S3">
                    <title>nauty.&#x2014;</title>
                    <label>3.</label>
                    <p><text>welcome3</text></p>
                </rect>

                <rect id="S4">
                    <title>Term</title>
                    <label>4.</label>
                    <p><text>welcome4</text></p>
                </rect></bhap>

</xml>

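This looks at least fairly close to what you are trying to produce. I have skipped reading in the file and printing the output, because restructuring the XML is the hard part.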
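
For the skipped output step, a minimal sketch using only core modules might look like this. It assumes each split document should go into a file named after the rect id (S1.xml, S2.xml and so on), which is a guess based on the print "file" $var2 line in the question; write_split_file is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Spec;

# write_split_file is a hypothetical helper. It writes one already
# serialized document to <dir>/<rect id>.xml and returns the path.
sub write_split_file {
    my ( $dir, $rect_id, $xml_text ) = @_;

    # Refuse ids that could point outside the output directory.
    die "unsafe id '$rect_id'\n" unless $rect_id =~ /\A[\w.-]+\z/;

    my $path = File::Spec->catfile( $dir, "$rect_id.xml" );
    open my $fh, '>:encoding(UTF-8)', $path
        or die "Cannot write $path: $!\n";
    print {$fh} $xml_text;
    close $fh or die "Cannot close $path: $!\n";
    return $path;
}
```

With the XML::Twig code, the text could come from $xml->sprint and the id from $rect->att('id'), in place of the $xml->print call.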

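I would also suggest looking at what is available in XML::Twig, because it may be exactly what you want.

Is xml_split an option?

You assign to $bhap, etc., and then read from $bhap. Using use warnings; use strict; catches this kind of thing.

my $nodes1 = $bhap->findnodes('//rect'): you are calling findnodes on a string here.

I have a script that has been running for more than a year and has only just started throwing this. What changed in the machine's perl or package installation? Tiresome.

I am sure this is all good advice, but the question is about the error 'Can't call method "findnodes" without a package or object reference', and you say nothing about it.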
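
The point about calling findnodes on a string can be reproduced without XML::LibXML at all. In this minimal sketch the literal stands in for what toString() returns in the question's code: plain text, not a node object. The exact wording of the error may differ between perl versions, so the script prints it without assuming a particular message.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Stand-in for the result of $node1->toString(): just text.
my $bhap = '<bhap id="1"><rect id="S1"/></bhap>';

print "is \$bhap an object? ", ( blessed($bhap) ? "yes" : "no" ), "\n";

# A method call needs an object or a class name on the left of ->,
# so this dies at run time; eval lets us look at the message.
my $ok    = eval { $bhap->findnodes('//rect'); 1 };
my $error = $ok ? '' : $@;

print "method call failed with: $error";
```

Keeping the node in one variable and its serialized text in another, and calling findnodes only on the node, avoids the error.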