Parsing HTML to find certain elements in PHP

Tags: php, parsing, xpath, web-scraping, domdocument

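I use cURL to retrieve a page and store its HTML. That part works, and I end up with a variable containing HTML similar to this (the content inside each td differs and changes all the time):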


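To get the structure of your output array (minus the text keys such as "first_content"), add a new dimension to the array for each row and fill that dimension. I think this is what you were trying to achieve.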

$dom = new DOMDocument;
/* @ silences the warnings libxml raises on real-world, non-well-formed HTML */
@$dom->loadHTML( $retrievedHtml );

$xPath = new DOMXPath( $dom );

$xPathQuery = "//tr[@class='myclass']";
$elements = $xPath->query( $xPathQuery );

/* DOMXPath::query() returns false (not null) when the expression is malformed */
if( $elements !== false ){

    $results = array();

    foreach( $elements as $index => $element ){

        $nodes = $element->childNodes;

        foreach( $nodes as $node ){
            /* Each table row gets its own level in the array, keyed by $index.
               Only element nodes (the <td> cells) are kept; other node types are skipped. */
            if( $node->nodeType == XML_ELEMENT_NODE ) $results[ $index ][] = $node->nodeValue;
        }
    }

    echo '<pre>', print_r( $results, true ), '</pre>';
}
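
The answer leaves out the text keys. If they are wanted, one possible approach is to map each numeric row onto a list of key names with array_combine(). This is only a sketch: the key names are taken from the question's desired output, and the $results array here is illustrative stand-in data, not output from the real page.

```php
<?php
// Key names assumed from the question's desired output.
$keys = array('first_content', 'second_content', 'third_content');

// Illustrative stand-in for the $results array built by the loop above.
$results = array(
    array('Dynamic Content One', 'Dynamic Content Two', 'Dynamic Content Three'),
    array('Dynamic Content One', 'Dynamic Content Two', 'Dynamic Content Three'),
);

$named = array();
foreach ($results as $row) {
    // array_combine() needs the same number of keys and values,
    // so rows with a different cell count are skipped here.
    if (count($row) === count($keys)) {
        $named[] = array_combine($keys, $row);
    }
}

print_r($named);
```

Skipping rows whose cell count does not match is a design choice for this sketch; padding or raising an error may fit better, depending on how regular the table is.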

This works well. Is there anything I should watch out for? For example, what if nodeType is not XML_ELEMENT_NODE? I don't know what that means.
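
On the XML_ELEMENT_NODE question: childNodes returns every child of the row, not only the td elements. Comments, and depending on the parser whitespace between tags, show up as nodes of other types, and the nodeType check is what keeps them out of the result. A small sketch, using made-up sample markup rather than the page from the question:

```php
<?php
// Hypothetical sample markup with a comment between the two cells.
$html = '<table><tr class="myclass"><td>One</td><!-- a comment --><td>Two</td></tr></table>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xPath = new DOMXPath($dom);
$row = $xPath->query("//tr[@class='myclass']")->item(0);

$all = array();    // names of every child node of the row
$cells = array();  // values of element children only
foreach ($row->childNodes as $node) {
    $all[] = $node->nodeName;
    if ($node->nodeType === XML_ELEMENT_NODE) {
        $cells[] = $node->nodeValue;
    }
}

print_r($all);
print_r($cells);
```

Without the check, the comment's text would end up in the row alongside the cell values.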
The result I am after looks like this:

$result[0]["first_content"] = "Dynamic Content One";
$result[0]["second_content"] = "Dynamic Content Two";
$result[0]["third_content"] = "Dynamic Content Three";

$result[1]["first_content"] = "Dynamic Content One";
$result[1]["second_content"] = "Dynamic Content Two";
$result[1]["third_content"] = "Dynamic Content Three";

// ... more elements in the array, depending on how many <tr> there were
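
This is what I tried:

$dom = new DOMDocument;
@$dom->loadHTML($retrievedHtml);

$xPath = new DOMXPath($dom);

$xPathQuery = "//tr[@class='myclass']";
$elements = $xPath->query($xPathQuery);

if (!is_null($elements)) {

    $results = array();

    foreach ($elements as $element) {

        $nodes = $element->childNodes;

        // $nodes is a DOMNodeList, which has no nodeValue property,
        // so this line prints nothing useful
        print $nodes->nodeValue;

        foreach ($nodes as $node) {
            // Every value lands in one flat array, so the rows are not kept apart
            $results[] = $node->nodeValue;
        }

    }
}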