PHP: How to conditionally select the text() immediately following an element in XPath?

I have the following structure, where the child nodes are arranged in random order:

<span id="outer">
     <div style="color:blue">51</div>
     <span class="main">Gill</span>$500
     <span style="color:red">11</span>
     <span></span>James
     <div style="color:red">158</div>
     <div class="sub">Mary</div>
</span>
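I wrote the following PHP code to traverse the elements. If this part is too long, you can skip it: the main focus is $expression, which selects the text() node value if it occurs immediately after an element.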

$nodes = $xpath->query("//span[@id='outer']/*");
$str_out = "";
foreach($nodes as $node)
{
    if($node->hasAttribute('class'))
    {
        if($node->getAttribute('class')=="main")
            $str_out .= $node->nodeValue . " ";
    }

    else if($node->hasAttribute('style'))
    {
        $node_style = $node->getAttribute('style');
        preg_match('~color:(.*)~', $node_style, $temp);
        if( $temp[1] == "red" )
            $str_out .= $node->nodeValue . " ";
    }

    // Now evaluate if the IMMEDIATELY next sibling is text()

    $next_node = $xpath->query('.//following-sibling::*[1]', $node);        
    if($next_node->length)
    {
        $next_node = $next_node->item(0);
        $next_node_name = $next_node->nodeName;         
        $next_node_value =  $next_node->nodeValue;
        $current_node_name = $node->nodeName;

        $expression = ".//following-sibling::text()[1][preceding-sibling::".$current_node_name." and following-sibling::".$next_node_name."[contains(text(),'".$next_node_value."')]]";

        $text_node = $xpath->query($expression, $node);
        if($text_node->length)              
        {           
            $str_out .= $text_node->item(0)->nodeValue . " ";               
        }
    }
}
echo $str_out;
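As stated, the main concern is capturing the text() node value when it occurs immediately after an element. I want to write an XPath expression that does the following:
1. Selects the first text() node after the element
2. Checks that this text() node lies between the self node (the current node) and the node immediately following it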

For example, in this block:

<span></span>James
<div style="color:red">158</div>
<span style="color:red">11</span>
<span></span>James
<div style="color:red">158</div>

You can achieve this as follows:

<?php
$xmldoc = new DOMDocument();
$xmldoc->loadXML(<<<XML
<span id="outer">
     <div style="color:blue">51</div>
     <span class="main">Gill</span>$500
     <span style="color:red">11</span>
     <span></span>James
     <div style="color:red">158</div>
     <div class="sub">Mary</div>
</span>
XML
);
$xpath = new DOMXPath($xmldoc);

$nodes = $xpath->query("//span[@id='outer']/*");
$str_out = "";
foreach ($nodes as $node)
{
    if ($node->hasAttribute('class'))
    {
        if ($node->getAttribute('class') == "main")
            $str_out .= $node->nodeValue . " ";
    }

    else if ($node->hasAttribute('style'))
    {
        $node_style = $node->getAttribute('style');
        preg_match('~color:(.*)~', $node_style, $temp);
        if ($temp[1] == "blue")
            $str_out .= $node->nodeValue . " ";
    }

    // Now evaluate if the IMMEDIATELY next sibling is text()
    $next_node = $xpath->query('./following-sibling::node()[1]/self::text()[normalize-space()]', $node);
    if ($next_node->length)
    {
        $str_out .= trim($next_node->item(0)->nodeValue) . " ";
    }
}
echo $str_out;
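Explanation:

  • ./
    starts from the context node
  • following-sibling::node()[1]
    gets the first following sibling node (whether it is a text node, an element, or even a comment)
  • self::text()[normalize-space()]
    keeps that node only if it is a text node and does not consist solely of whitespace

Output: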

51 Gill $500 James


This also handles the case where a text node follows the last child element of the parent.
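The difference between following-sibling::text()[1] and following-sibling::node()[1]/self::text() is easy to miss, so here is a minimal sketch. The document and its element names (root, a, b, c) are hypothetical and are not part of the question's markup. It compares the two expressions and also checks the trailing-text case.

```php
<?php
// Hypothetical sample: an element <b> sits between <a/> and the text "late",
// and the text "tail" follows the last child element <c>.
$doc = new DOMDocument();
$doc->loadXML('<root><a/><b>x</b>late<c>y</c>tail</root>');
$xpath = new DOMXPath($doc);

$a = $xpath->query('/root/a')->item(0);
$c = $xpath->query('/root/c')->item(0);

// text()[1] skips over <b> and still returns the first text sibling.
$loose = $xpath->evaluate('string(following-sibling::text()[1])', $a);

// node()[1]/self::text() matches only if the very next sibling is a text node.
$strict = $xpath->evaluate('string(following-sibling::node()[1]/self::text())', $a);

// The last child element has no following element, but its trailing text is found.
$tail = $xpath->evaluate('string(following-sibling::node()[1]/self::text())', $c);

var_dump($loose, $strict, $tail);
```

Here $loose is "late" while $strict is an empty string, because the node directly after <a/> is an element. $tail is "tail".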

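XPath supports axes. Using them, you can specify the nodes that are matched initially. The default axis is child, and @ is short for attribute. In this case the axes you need are following-sibling and self.

If you are using span[@class="main"] to specify the marker node, you can extend it to span[@class="main"]/following-sibling::node()[1] to fetch the node that follows it. To make sure that node is a text node, use span[@class="main"]/following-sibling::node()[1]/self::text().

At the moment you are iterating over all nodes, but with the exception of the style attribute you can match the values directly in XPath. For the style condition you can use a PHP callback: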

$xml = <<<'XML'
<span id="outer">
     <div style="color:blue">51</div>
     <span class="main">Gill</span>$500
     <span style="color:red">11</span>
     <span></span>James
     <div style="color:red">158</div>
     <div class="sub">Mary</div>
</span>
XML;

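// Returns the value of one CSS property from the node's style attribute,
// or an empty string if the property is not set.
function getStyleProperty($node, $name) {
  // XPath passes a node set to PHP callbacks as an array of nodes.
  if (is_array($node)) {
    $node = $node[0];
  }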
  if ($node instanceof DOMElement) {
    $pattern = sprintf(
    '(\b%s:\s*([^;]*)\s*(;|$))', preg_quote($name)
    );
    if (preg_match($pattern, $node->getAttribute('style'), $matches)) {
      return $matches[1];
    }
  }
  return '';
}

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPHPFunctions(['getStyleProperty']);

foreach ($xpath->evaluate('//span[@id="outer"]') as $outer) {
  var_dump(
    $xpath->evaluate('string(div[php:function("getStyleProperty", ., "color") = "blue"])', $outer),
    $xpath->evaluate('string(span[@class = "main"])', $outer),
    $xpath->evaluate('string(span[@class = "main"]/following-sibling::text()[1])', $outer),
    $xpath->evaluate('string(span[not(@class or @style)]/following-sibling::node()[1]/self::text())', $outer)
  );
}

Output:

string(2) "51"
string(4) "Gill"
string(10) "$500
     "
string(11) "James
     "
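The last two values keep the newline and indentation that follow the text in the source document. A minimal sketch, assuming trimmed values are wanted, wraps the same location paths in normalize-space() instead of string(). The shortened sample document and the variable names $amount and $name are illustrative only.

```php
<?php
// Shortened version of the markup from the question (illustrative only).
$xml = <<<'XML'
<span id="outer">
     <span class="main">Gill</span>$500
     <span></span>James
     <div class="sub">Mary</div>
</span>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);
$outer = $xpath->query('//span[@id="outer"]')->item(0);

// normalize-space() converts the first matched node to a string,
// strips leading and trailing whitespace, and collapses inner runs.
$amount = $xpath->evaluate(
    'normalize-space(span[@class = "main"]/following-sibling::node()[1]/self::text())',
    $outer
);
$name = $xpath->evaluate(
    'normalize-space(span[not(@class or @style)]/following-sibling::node()[1]/self::text())',
    $outer
);

var_dump($amount, $name);
```

Note that normalize-space() also collapses whitespace inside the text, so it is not a drop-in replacement when inner spacing matters. In that case, calling trim() on the string() result in PHP is the safer choice.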