PHP: Scrape a table with XPath

php xpath

I have a table in an HTML page that looks like this: (pastebin url)

My current code for getting the contents of the table is:

$html = htmlspecialchars("https://localhost/table.php");

$doc = new \DOMDocument();

if($doc->loadHTML($html))
{
    $result = new \DOMDocument();
    $result->formatOutput = true;
    $table = $result->appendChild($result->createElement("table"));
    $thead = $table->appendChild($result->createElement("thead"));
    $tbody = $table->appendChild($result->createElement("tbody"));

    $xpath = new \DOMXPath($doc);
    
    $newRow = $thead->appendChild($result->createElement("tr"));
    
    foreach($xpath->query("//table[@id='kurstabell']/thead/tr/th[position()>1]") as $header)
    {
        $newRow->appendChild($result->createElement("th", trim($header->nodeValue)));
    }
    
    foreach($xpath->query("//table[@id='kurstabell']/tbody/tr") as $row)
    {
        $newRow = $tbody->appendChild($result->createElement("tr"));
        
        foreach($xpath->query("./td[position()>1]", $row) as $cell)
        {
            $newRow->appendChild($result->createElement("td", trim($cell->nodeValue)));
        }
    }
    
    echo $result->saveXML($result->documentElement);
}

print_r($result);

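(I am using htmlspecialchars because

libxml_use_internal_errors(true);

produces the error

Europe/Berlin] PHP Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity, line:

and I read somewhere that htmlspecialchars could be used instead.)

The current result of this snippet looks like this: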

DOMDocument Object ( [doctype] => [implementation] => (object value omitted) [documentElement] => (object value omitted) [actualEncoding] => [encoding] => [xmlEncoding] => [standalone] => 1 [xmlStandalone] => 1 [version] => 1.0 [xmlVersion] => 1.0 [strictErrorChecking] => 1 [documentURI] => [config] => [formatOutput] => 1 [validateOnParse] => [resolveExternals] => [preserveWhiteSpace] => 1 [recover] => [substituteEntities] => [nodeName] => #document [nodeValue] => [nodeType] => 9 [parentNode] => [childNodes] => (object value omitted) [firstChild] => (object value omitted) [lastChild] => (object value omitted) [previousSibling] => [attributes] => [ownerDocument] => [namespaceURI] => [prefix] => [localName] => [baseURI] => [textContent] => )

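php_error.log does not show any errors.

The expected result is the same table, output as HTML, but with all the "unnecessary" code removed.

My question:

What is wrong with the current code?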

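The problem is in the first line:

$html = htmlspecialchars("https://localhost/table.php");

It should be:

$html = file_get_contents("https://localhost/table.php");

htmlspecialchars() does not fetch the page; it only escapes the HTML special characters in the string it is given, which here is the URL itself. Even applied to real markup, it would escape every tag, so loadHTML() would parse the result as a single text node rather than the expected DOM.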
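Putting the fix together, the following is a minimal sketch rather than a definitive implementation. The helper name extractTable() and the inline sample markup are assumptions made for illustration; the XPath expressions and the table id come from the question. It assumes the table id contains no single quote, and it uses libxml_use_internal_errors(true) before loadHTML() so that parser complaints such as the htmlParseEntityRef warning are collected internally instead of being emitted.

```php
<?php
// Hypothetical helper (not from the original post): rebuilds a stripped-down
// copy of the table with the given id, skipping the first column just as the
// XPath expressions in the question do.
function extractTable(string $html, string $tableId): string
{
    // Collect libxml parse errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);

    // Discard the collected errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return '';
    }

    $result = new DOMDocument();
    $result->formatOutput = true;
    $table = $result->appendChild($result->createElement('table'));
    $thead = $table->appendChild($result->createElement('thead'));
    $tbody = $table->appendChild($result->createElement('tbody'));

    $xpath = new DOMXPath($doc);
    // Assumption: $tableId contains no single quote.
    $base = "//table[@id='" . $tableId . "']";

    $headRow = $thead->appendChild($result->createElement('tr'));
    foreach ($xpath->query($base . '/thead/tr/th[position()>1]') as $header) {
        $th = $headRow->appendChild($result->createElement('th'));
        // createTextNode() escapes "&" and "<", unlike createElement()'s value argument.
        $th->appendChild($result->createTextNode(trim($header->nodeValue)));
    }

    foreach ($xpath->query($base . '/tbody/tr') as $row) {
        $newRow = $tbody->appendChild($result->createElement('tr'));
        foreach ($xpath->query('./td[position()>1]', $row) as $cell) {
            $td = $newRow->appendChild($result->createElement('td'));
            $td->appendChild($result->createTextNode(trim($cell->nodeValue)));
        }
    }

    return $result->saveHTML($result->documentElement);
}

// Sample markup (an assumption) with an unescaped "&", the kind of input
// that triggers the htmlParseEntityRef warning. In the real script the
// markup would come from file_get_contents("https://localhost/table.php").
$sample = '<table id="kurstabell"><thead><tr><th>#</th><th>Name</th><th>Price</th></tr></thead>'
        . '<tbody><tr><td>1</td><td>A&B</td><td>10</td></tr></tbody></table>';

echo extractTable($sample, 'kurstabell');
```

The sketch returns the markup with saveHTML() because the question asks for HTML output; the original echo of saveXML() would also work if XML-style serialization is acceptable.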
