PHP: problem using multi-curl with simplehtmldom, only scraping the title?


I am using multi-curl together with simplehtmldom.

I read the simplehtmldom manual. Its example uses curl to scrape a single website; I am trying to scrape several, so I am using multi-curl.

However, when I try to use multi-curl together with simplehtmldom, I get an error at the top of the page. It reports an error on line 39 of simple_html_dom.php:

    $dom->load(call_user_func_array('file_get_contents', $args), true);

which comes from this function:

// get html dom from file
function file_get_html() {
    $dom = new simple_html_dom;
    $args = func_get_args();
    $dom->load(call_user_func_array('file_get_contents', $args), true);
    return $dom;
}

Here is my multi-curl script:

$urls = array(
   "http://www.html2.com", //$res[0]
   "http://www.html1.com" //$res[1]
   );

$mh = curl_multi_init();

foreach ($urls as $i => $url) {
       $conn[$i]=curl_init($url);
       curl_setopt($conn[$i],CURLOPT_RETURNTRANSFER,1);//return data as string 
       curl_setopt($conn[$i],CURLOPT_FOLLOWLOCATION,1);//follow redirects
       curl_setopt($conn[$i],CURLOPT_MAXREDIRS,2);//maximum redirects
       curl_setopt($conn[$i],CURLOPT_CONNECTTIMEOUT,10);//timeout
       curl_multi_add_handle ($mh,$conn[$i]);
}

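// run all transfers; curl_multi_select() waits for activity instead of busy-looping
do {
       $status=curl_multi_exec($mh,$active);
       if ($active) { curl_multi_select($mh); }
} while ($active && $status==CURLM_OK);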

foreach ($urls as $i => $url) {
       $res[$i]=curl_multi_getcontent($conn[$i]);
       curl_multi_remove_handle($mh,$conn[$i]);
       curl_close($conn[$i]);

}
curl_multi_close($mh);

Then I used this:

$html = file_get_html($res[0]);

Please help me.

Thanks.

You are probably getting this error:

Warning: file_get_contents(): Filename cannot be empty in /tmp/simple_html_dom.php on line 39

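This indicates that, for some reason, what you are passing to file_get_html() (that is, $res[0]) is empty, most likely because some additional or different CURL options are needed. If you echo $res[$i] inside the loop, you will see this for yourself.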
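One way to find out why a response is empty is to ask cURL for the result of each transfer. The following is a minimal sketch, not the asker's script: fetch_all() and the keys of the array it returns are hypothetical names chosen for illustration, and the options simply mirror those in the question. It uses curl_multi_info_read() to collect the per-handle result code and curl_strerror() to turn it into a message.

```php
<?php
// Hypothetical helper: fetch several URLs in parallel and return, per URL,
// the body together with the cURL result code, its message and the HTTP status.
function fetch_all(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($urls as $i => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return data as a string
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
        curl_setopt($ch, CURLOPT_MAXREDIRS, 2);         // maximum redirects
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);   // connect timeout in seconds
        curl_multi_add_handle($mh, $ch);
        $handles[$i] = $ch;
    }

    // Drive all transfers; curl_multi_select() waits for socket activity.
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            curl_multi_select($mh, 1.0);
        }
    } while ($active && $status === CURLM_OK);

    // Collect the result code of every finished transfer (CURLE_OK means success).
    $codes = [];
    while ($info = curl_multi_info_read($mh)) {
        $i = array_search($info['handle'], $handles, true);
        $codes[$i] = $info['result'];
    }

    $results = [];
    foreach ($handles as $i => $ch) {
        $errno = $codes[$i] ?? CURLE_OK;
        $results[$i] = [
            'url'   => $urls[$i],
            'body'  => (string) curl_multi_getcontent($ch),
            'errno' => $errno,
            'error' => $errno === CURLE_OK ? '' : (string) curl_strerror($errno),
            'http'  => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        ];
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);

    return $results;
}

// Usage: report every URL that failed or came back empty.
// foreach (fetch_all($urls) as $r) {
//     if ($r['errno'] !== CURLE_OK || $r['body'] === '') {
//         echo $r['url'], ': ', $r['error'], ' (HTTP ', $r['http'], ")\n";
//     }
// }
```

A non-zero errno or an unexpected HTTP status usually points to the option that is missing, for example a timeout or a redirect limit that is too strict.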

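Once you have solved that, you will run into another problem: you are passing the HTML content you just scraped to file_get_html(), which expects some kind of file path, not content. In fact, file_get_contents() can fetch from a standard URL, so if file_get_contents() is able to retrieve the data correctly, you could skip all the curl code entirely.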

If you want to keep the curl calls, you should pass $res[0] to str_get_html() instead of file_get_html().
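As a rough sketch of that approach, the helper below parses an HTML string with str_get_html() and returns the text of its title element. The name page_title() is hypothetical, and the code assumes simple_html_dom.php sits in the same directory as the script. It also returns early on an empty string, which is the case the warning above points to.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory; adjust the path as needed.
$lib = __DIR__ . '/simple_html_dom.php';
if (is_file($lib)) {
    require_once $lib;
}

// Hypothetical helper: return the <title> text of an HTML string,
// or null when the string is empty or contains no <title>.
function page_title(string $html): ?string
{
    if (trim($html) === '') {
        return null; // nothing was fetched, so there is nothing to parse
    }
    if (!function_exists('str_get_html')) {
        throw new RuntimeException('simple_html_dom.php is not loaded');
    }

    $dom = str_get_html($html); // parses a string, unlike file_get_html()
    if (!$dom) {
        return null; // the library could not parse the input
    }

    $node  = $dom->find('title', 0); // first <title> element, or null
    $title = $node ? trim($node->plaintext) : null;
    $dom->clear(); // release the parser's memory

    return $title;
}

// Usage with the $res array built by the multi-curl loop:
// foreach ($res as $i => $html) {
//     var_dump(page_title($html));
// }
```

Because the pages are already in memory after curl_multi_getcontent(), no second download takes place; simplehtmldom only parses the strings.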

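Perfect! Yes, I want to keep the CURL options, because I don't want simplehtmldom to fetch the files one at a time. As far as I understand, CURL multi (which I have already set up) fetches all the pages at once and then outputs their content.

Andrew, would you mind sharing your working script that uses str_get_html()?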