PHP: parsing DOM elements in order


I need to parse the following code:

<ul class="zg_hrsr">
<li class="zg_hrsr_item">
<span class="zg_hrsr_rank">#15</span>
<span class="zg_hrsr_ladder">
in 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/ref=pd_zg_hrsr_kstore_1_1">Kindle Store</a>
 > 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/154606011/ref=pd_zg_hrsr_kstore_1_2">Kindle eBooks</a>
 > 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/157325011/ref=pd_zg_hrsr_kstore_1_3">Nonfiction</a>
 > 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/292975011/ref=pd_zg_hrsr_kstore_1_4">Lifestyle & Home</a>
 > 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156699011/ref=pd_zg_hrsr_kstore_1_5">Home & Garden</a>
 > 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156828011/ref=pd_zg_hrsr_kstore_1_6">Gardening & Horticulture</a>
 > 
<b>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156847011/ref=pd_zg_hrsr_kstore_1_7_last">Greenhouses</a>
</b>
</span>
</li>
<li class="zg_hrsr_item">
<span class="zg_hrsr_rank">#26</span>
<span class="zg_hrsr_ladder">
in 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/ref=pd_zg_hrsr_kstore_2_1">Kindle Store</a>
 > 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/154606011/ref=pd_zg_hrsr_kstore_2_2">Kindle eBooks</a>
 > 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/157325011/ref=pd_zg_hrsr_kstore_2_3">Nonfiction</a>
 > 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/292975011/ref=pd_zg_hrsr_kstore_2_4">Lifestyle & Home</a>
 > 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156699011/ref=pd_zg_hrsr_kstore_2_5">Home & Garden</a>
 > 
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156828011/ref=pd_zg_hrsr_kstore_2_6">Gardening & Horticulture</a>
 > 
<b>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156849011/ref=pd_zg_hrsr_kstore_2_7_last">House Plants</a>
</b>
</span>
</li>
</ul>
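The result I want is:

  • Sellers Rank: #266715 Paid in Kindle Store (See Top 100 Paid in Kindle Store)
  • #15 in Kindle Store > Kindle eBooks > Nonfiction > Lifestyle & Home > Home & Garden > Gardening & Horticulture > Greenhouses
  • #26 in Kindle Store > Kindle eBooks > Nonfiction > Lifestyle & Home > Home & Garden > Gardening & Horticulture > House Plants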

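How can I do this? All I know is that I should read the `nodeValue` of each `a` tag, but I can't work out how to get the values into the format I need. I think I should use an array, but I don't have enough experience to implement it.

Please give me some guidance. I only need the XPath expression and the structure of the array (if this can be done with arrays), or an alternative to arrays.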

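    I recommend using DOM because it is easier to use. I won't write the code for you, but if you look at the SimpleXML documentation you will see examples of how to use it. If you can't get it working from the examples, you may need to hire a programmer :)

    Alas, once again: it works here, and this time I even tested it, so I don't know how you are running it ;) Are you sure you have the exact HTML?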
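    // Create an XPath object from your DOMDocument ($dom must already hold the loaded HTML):
    $xpath = new DOMXPath($dom);
    // Walk each list item so that a rank and its category trail stay paired.
    foreach ($xpath->query("//li[@class='zg_hrsr_item']") as $item) {
        // string() makes evaluate() return the rank text, e.g. "#15".
        $rank = trim($xpath->evaluate("string(span[@class='zg_hrsr_rank'])", $item));
        $trail = array();
        // The path is relative to $item; an absolute '//a' would match every link in the document.
        foreach ($xpath->query("span[@class='zg_hrsr_ladder']//a", $item) as $step) {
            $trail[] = trim($step->textContent);
        }
        echo $rank . ' in ' . implode(' > ', $trail) . "\n";
    }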
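
Since the question also asks what the array could look like, here is one possible end-to-end sketch. It assumes the page keeps the `zg_hrsr_item`, `zg_hrsr_rank` and `zg_hrsr_ladder` class names shown in the question. `parseRankings` is a hypothetical helper name, and the sample markup is shortened with placeholder `href` values. Each list item becomes one entry with a `rank` string and a `trail` array, so the formatting step stays separate from the parsing step.

```php
<?php
// Shortened sample markup; the real page is assumed to share this structure.
$html = <<<'HTML'
<ul class="zg_hrsr">
<li class="zg_hrsr_item">
<span class="zg_hrsr_rank">#15</span>
<span class="zg_hrsr_ladder">in
<a href="#">Kindle Store</a> &gt;
<a href="#">Kindle eBooks</a> &gt;
<b><a href="#">Greenhouses</a></b>
</span>
</li>
<li class="zg_hrsr_item">
<span class="zg_hrsr_rank">#26</span>
<span class="zg_hrsr_ladder">in
<a href="#">Kindle Store</a> &gt;
<a href="#">Kindle eBooks</a> &gt;
<b><a href="#">House Plants</a></b>
</span>
</li>
</ul>
HTML;

// Hypothetical helper: returns one ['rank' => ..., 'trail' => [...]] entry per list item.
function parseRankings(string $html): array
{
    // Scraped HTML is rarely valid, so collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $rankings = [];
    foreach ($xpath->query("//li[@class='zg_hrsr_item']") as $item) {
        $trail = [];
        // Relative path: only the links inside this item's ladder span, in document order.
        foreach ($xpath->query("span[@class='zg_hrsr_ladder']//a", $item) as $link) {
            $trail[] = trim($link->textContent);
        }
        $rankings[] = [
            'rank'  => trim($xpath->evaluate("string(span[@class='zg_hrsr_rank'])", $item)),
            'trail' => $trail,
        ];
    }
    return $rankings;
}

// Format each entry as "<rank> in <step> > <step> > ...".
$lines = [];
foreach (parseRankings($html) as $entry) {
    $lines[] = $entry['rank'] . ' in ' . implode(' > ', $entry['trail']);
}
echo implode("\n", $lines), "\n";
```

With the shortened sample, the first formatted line is `#15 in Kindle Store > Kindle eBooks > Greenhouses`. Keeping the data in an array rather than echoing it inside the loop makes it easy to reuse the ranks elsewhere, for example to store them or to sort by rank.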