PHP: recursively traverse the DOM tree and remove unwanted tags?

Tags: php, html, dom

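I want to loop through the body of a web page and remove every unwanted tag listed in the $tags array, but I cannot find a way to do it. How can I do this?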

$tags = array(
    "applet" => 1,  
    "script" => 1
);

$html = file_get_contents("test.html");
$dom = new DOMdocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$body = $xpath->query("//body")->item(0);
Something like the code at the end of this page.

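Have you considered using an existing library for this? Writing your own HTML sanitizer from scratch just reinvents the wheel, and it is not easy to get right.

Besides, a blacklist approach is not a good idea either.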
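To make the blacklist versus whitelist distinction concrete, here is a minimal sketch of the whitelist variant, not a complete sanitizer. The function name stripDisallowedTags and the contents of $allowed are hypothetical and chosen only for illustration. The sketch removes each disallowed element together with its subtree and does not look at attributes such as onclick, which a real sanitizer would also have to handle.

```php
<?php
// Hypothetical helper: remove every element inside <body> whose tag name
// is NOT a key of $allowed. Attributes are left untouched in this sketch.
function stripDisallowedTags(string $html, array $allowed): string
{
    $dom = new DOMDocument();
    // "@" hides the warnings libxml emits for malformed markup.
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    // "//body//*" selects every element below <body>.
    foreach ($xpath->query("//body//*") as $node) {
        if (!isset($allowed[$node->nodeName])) {
            // Removes the element and everything nested inside it.
            $node->parentNode->removeChild($node);
        }
    }

    return $dom->saveHTML();
}

// Hypothetical whitelist, keyed by tag name like $tags in the question.
$allowed = array("p" => 1, "b" => 1, "a" => 1);
$clean = stripDisallowedTags(
    '<p>Hello <b>world</b></p><script>alert(1)</script><applet></applet>',
    $allowed
);
```

With a whitelist, a tag nobody thought of is dropped by default, whereas a blacklist lets it through.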

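Thank you very much for the hints. I will use a whitelist. Oh, and Happy New Year, everyone :)

You should only increment $j if no removal was performed. Otherwise elements will be skipped.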
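// Blacklist of tag names to strip; the names are the array keys.
$tags = array(
    "applet" => 1,
    "script" => 1
);

$html = file_get_contents("test.html");
$dom = new DOMDocument();
// "@" hides the warnings libxml emits for malformed real-world HTML.
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// $tags is keyed by tag name, so iterate over the keys; $tags[$i] with a
// numeric index would be undefined.
foreach (array_keys($tags) as $tag) {
    // The XPath result list is not live, so removing nodes here does not
    // disturb the iteration.
    foreach ($xpath->query("//" . $tag) as $node) {
        $node->parentNode->removeChild($node);
    }
}

$string = $dom->saveXML();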
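The skipping described in the comment above is what happens with a live node list. As far as I know, getElementsByTagName() returns such a live list, while the result of DOMXPath::query() is not live. The following is a small sketch of one way to remove nodes from a live list without skipping any, by walking it from the end. The inline HTML string is made up for the example.

```php
<?php
$dom = new DOMDocument();
// Made-up sample markup with two <script> elements around a paragraph.
@$dom->loadHTML('<html><body><script>a()</script><p>keep</p><script>b()</script></body></html>');

// The list returned here shrinks as soon as a node is removed, so iterate
// backwards: the indexes still to be visited are not affected by a removal.
$scripts = $dom->getElementsByTagName('script');
for ($j = $scripts->length - 1; $j >= 0; --$j) {
    $node = $scripts->item($j);
    $node->parentNode->removeChild($node);
}

// Count what is left after the cleanup.
$remainingScripts = $dom->getElementsByTagName('script')->length;
$remainingParagraphs = $dom->getElementsByTagName('p')->length;
```

An alternative with the same effect is to keep removing item(0) while the list's length is greater than zero.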