
DOM + XPath: extract nodes from HTML into an array (including the parent node), name + link (URL)


I need to extract each subcategory's URL and name from the data below, together with its parent category's name and URL:

<ul class="root">
    <li class="category_container">
        <h1 class="category"><a href="/url/to/category1">Category 1</a></h1>
        <ul class="sub_category">
            <li><a class="sub_cat_link" href="/this/url/i/need/1a">Sub cat A</a></li>
            <li><a class="sub_cat_link" href="/this/url/i/need/1b">Sub cat B</a></li>
        </ul>
    </li>

    <li class="category_container">
        <h1 class="category"><a href="/url/to/category2">Category 2</a></h1>
        <ul class="sub_category">
            <li><a class="sub_cat_link" href="/this/url/i/need/2c">Sub cat C</a></li>
            <li><a class="sub_cat_link" href="/this/url/i/need/2d">Sub cat D</a></li>
        </ul>
    </li>
</ul>

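Using DOMDocument and XPath I can easily find the "Sub cat X" nodes, but I don't know how to get the parent's name ("Category 1") and link for a "Sub cat X" node once I have found it. Maybe I should take a different approach: first find each "Category X", then dig deeper to get all of its "Sub cat X" nodes? Please help with example commands I should use.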

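One solution is to pass a context node as the second argument to DOMXPath::query. First find every

<li class="category_container">

element, then loop over the returned nodes with foreach and, for each one, run a second query whose XPath expression describes the inner nodes, passing that li class="category_container" node as the context node (the second argument). Here are some examples: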

<li class="category_container">
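A minimal sketch of the context-node approach described above, using the markup from the question. The function name `extractCategories` and the array keys (`parent_name`, `parent_url`, `name`, `url`) are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Sample markup copied from the question.
$html = <<<'HTML'
<ul class="root">
    <li class="category_container">
        <h1 class="category"><a href="/url/to/category1">Category 1</a></h1>
        <ul class="sub_category">
            <li><a class="sub_cat_link" href="/this/url/i/need/1a">Sub cat A</a></li>
            <li><a class="sub_cat_link" href="/this/url/i/need/1b">Sub cat B</a></li>
        </ul>
    </li>
    <li class="category_container">
        <h1 class="category"><a href="/url/to/category2">Category 2</a></h1>
        <ul class="sub_category">
            <li><a class="sub_cat_link" href="/this/url/i/need/2c">Sub cat C</a></li>
            <li><a class="sub_cat_link" href="/this/url/i/need/2d">Sub cat D</a></li>
        </ul>
    </li>
</ul>
HTML;

// Hypothetical helper: returns one row per subcategory link,
// each row carrying the name and URL of its parent category.
function extractCategories(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // keep parser warnings out of the output
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows  = [];

    // Step 1: find every category container.
    foreach ($xpath->query('//li[@class="category_container"]') as $container) {
        // Step 2: query relative to the container (second argument = context node).
        $parent = $xpath->query('.//h1[@class="category"]/a', $container)->item(0);
        if ($parent === null) {
            continue; // container without a category link
        }

        // Step 3: collect the subcategory links inside the same container.
        foreach ($xpath->query('.//a[@class="sub_cat_link"]', $container) as $sub) {
            $rows[] = [
                'parent_name' => trim($parent->textContent),
                'parent_url'  => $parent->getAttribute('href'),
                'name'        => trim($sub->textContent),
                'url'         => $sub->getAttribute('href'),
            ];
        }
    }

    return $rows;
}

// Usage: print one line per subcategory.
foreach (extractCategories($html) as $row) {
    echo implode(' | ', $row), PHP_EOL;
}
```

The relative expressions start with `.` so that they are evaluated from the context node; an expression starting with `//` would search the whole document again even when a context node is supplied.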