PHP cURL returns nothing for some sites


php curl


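I am making a cURL request. It works in most cases, but for some sites it returns nothing at all, and no cURL error either. Can anyone help?

Here is my small app:

Go there and enter this site:

As you can see, many sites work, but the chosen one does not... Any ideas?

Here is my code: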

<?php
//Bot Curl Request  


$handle = curl_init();

curl_setopt_array($handle,array(
         CURLOPT_URL => $_GET['site'],
         CURLOPT_USERAGENT => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
         CURLOPT_RETURNTRANSFER => true,
         CURLOPT_FOLLOWLOCATION => true
      ));

    $output = curl_exec($handle);

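    // Transfer statistics. Note: despite its name, $httpcode holds the total
    // transfer time in seconds (CURLINFO_HTTP_CODE would give the status code),
    // and $downloadtime holds the average download speed in bytes per second.
    $httpcode = curl_getinfo($handle, CURLINFO_TOTAL_TIME);
    $connecttime = curl_getinfo($handle, CURLINFO_CONNECT_TIME);
    $downloadtime = curl_getinfo($handle, CURLINFO_SPEED_DOWNLOAD);
    $downloadsize = curl_getinfo($handle, CURLINFO_SIZE_DOWNLOAD);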

    if(curl_errno($handle)){
        echo '<img class="errorlogo" src="http://www.convurgency.com/images/logo103.png" />';
        echo '<p style="text-align:center;">There was an error finding your site, are you sure it exists?</p>';
        echo '<p style="text-align:center;"><a href="http://www.convurgency.com/tools/googlebot.php">Back to GoogleBot View</a></p>';
        echo 'Curl error: ' . curl_error($handle);

    } else {

        echo 'No Errors';

    }


     curl_close($handle);


     $output2 = preg_replace(
        array(
         // Remove invisible content
        '@<head[^>]*?>.*?</head>@siu',
        '@<style[^>]*?>.*?</style>@siu',
        '@<script[^>]*?.*?</script>@siu',
        '@<object[^>]*?.*?</object>@siu',
        '@<embed[^>]*?.*?</embed>@siu',
        '@<applet[^>]*?.*?</applet>@siu',
        '@<noframes[^>]*?.*?</noframes>@siu',
        '@<noscript[^>]*?.*?</noscript>@siu',
        '@<noembed[^>]*?.*?</noembed>@siu',
        // Add line breaks before and after blocks
        '@</?((address)|(blockquote)|(center)|(del))@iu',
        '@</?((div)|(h[1-9])|(ins)|(isindex)|(p)|(pre))@iu',
        '@</?((dir)|(dl)|(dt)|(dd)|(li)|(menu)|(ol)|(ul))@iu',
        '@</?((table)|(th)|(td)|(caption))@iu',
        '@</?((form)|(button)|(fieldset)|(legend)|(input))@iu',
        '@</?((label)|(select)|(optgroup)|(option)|(textarea))@iu',
        '@</?((frameset)|(frame)|(iframe))@iu',
        ),
        array(
            // One replacement per pattern: 9 removals, then 7 line breaks
            ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ',
            "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0",
        ),
        $output
     );

 echo preg_replace('/<(\w+) [^>]+>/', '<$1>', $output2);

 ?>

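Look at $connecttime and $downloadtime to check whether the request timed out. Use command-line curl or wget to check whether the site is reachable from the server running the script.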
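
As a rough illustration of that check, the sketch below sets explicit timeouts and turns the values reported by curl_getinfo() into a short diagnosis. The helper names diagnose_transfer and fetch_with_timeouts, and the 5 and 15 second limits, are assumptions made for this example and are not part of the original script.

```php
<?php
// Hypothetical helper: interprets a curl_getinfo() array plus curl_errno().
function diagnose_transfer(array $info, int $errno): string
{
    // 28 is libcurl's error code for "operation timed out".
    if ($errno === 28) {
        return 'timed out';
    }
    if ($errno !== 0) {
        return 'curl error ' . $errno;
    }
    // No cURL error, but nothing was downloaded: report the HTTP status.
    if (($info['size_download'] ?? 0) == 0) {
        return 'empty body (HTTP ' . ($info['http_code'] ?? 0) . ')';
    }
    return sprintf(
        'ok: %d bytes in %.2fs (connect %.2fs)',
        $info['size_download'],
        $info['total_time'] ?? 0,
        $info['connect_time'] ?? 0
    );
}

// Hypothetical wrapper with assumed limits: 5s to connect, 15s overall.
function fetch_with_timeouts(string $url): array
{
    $handle = curl_init($url);
    curl_setopt_array($handle, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_TIMEOUT        => 15,
    ));
    $body = curl_exec($handle);
    $diagnosis = diagnose_transfer(curl_getinfo($handle), curl_errno($handle));
    return array($body, $diagnosis);
}
```

Calling fetch_with_timeouts() with the failing site and printing the second element of the result should show whether the request timed out, failed with another cURL error, or completed with an empty body.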

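The problem may be that the target site's URL performs a redirect.

By default, when curl receives a redirect in response to a request, it does not go on to request the new URL. Take the URL in the commands below as an example: a request to it is answered with an HTTP 3XX redirect. If you try the following, you get empty output: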

$ curl http://www.facebook.com
$

If you want cURL to follow these redirects, use the -L option:

$ curl -L http://www.facebook.com/
<full content of raw html page here>
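
In PHP, the counterpart of -L is the CURLOPT_FOLLOWLOCATION option. The sketch below is a minimal illustration that also reports how many redirects were followed and where the request ended up. The helper name fetch_following_redirects, the redirect cap of 10, and the 10 second connect timeout are assumptions for this example.

```php
<?php
// Hypothetical helper: the PHP counterpart of `curl -L`, with a redirect cap.
function fetch_following_redirects(string $url, int $maxRedirects = 10): array
{
    $handle = curl_init($url);
    curl_setopt_array($handle, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,          // follow Location: headers
        CURLOPT_MAXREDIRS      => $maxRedirects, // stop on redirect loops
        CURLOPT_CONNECTTIMEOUT => 10,
    ));
    $body = curl_exec($handle);

    return array(
        'body'      => $body, // false when the transfer failed
        'http_code' => curl_getinfo($handle, CURLINFO_HTTP_CODE),
        'redirects' => curl_getinfo($handle, CURLINFO_REDIRECT_COUNT),
        'final_url' => curl_getinfo($handle, CURLINFO_EFFECTIVE_URL),
        'error'     => curl_error($handle),
    );
}
```

If 'redirects' is 0 and 'http_code' is in the 3XX range, the redirect was not followed. If 'final_url' differs from the requested URL, the body comes from the redirect target.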


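Thanks... I fixed the code... Do you know why some sites return nothing while others work?

Have you tried using a normal desktop user agent? If that works, then they are filtering your request.

I just tried it with a normal user agent string. It still doesn't work, and there are no cURL errors.

I think @Maks3w wants you to try it in a browser to make sure the site actually works.

Yes, I tried it in a browser and it works fine. In my small app I even have another curl request that fetches the headers of the requested page, and that works fine...

No, the request is not timing out. It returns the full download size and download time.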
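
The last comment says cURL reports a full download, yet nothing is displayed. One possibility consistent with that, though not confirmed anywhere in this thread, is that the preg_replace() step drops the content rather than cURL. With the u modifier, preg_replace() returns null when the input is not valid UTF-8, and echoing null prints nothing. The sketch below is a hedged way to check for this. The helper name strip_head is hypothetical and it applies only the first pattern from the question.

```php
<?php
// Hypothetical helper: runs one of the question's patterns and reports
// why preg_replace() failed, if it did.
function strip_head(string $html): array
{
    $result = preg_replace('@<head[^>]*?>.*?</head>@siu', ' ', $html);

    if ($result === null) {
        // preg_replace() returns null on failure; preg_last_error() says why.
        $reason = preg_last_error() === PREG_BAD_UTF8_ERROR
            ? 'input is not valid UTF-8'
            : 'PCRE error code ' . preg_last_error();
        return array(null, $reason);
    }

    return array($result, 'ok');
}
```

Passing $output from the question through strip_head() and printing the second element would show whether the regex step is the one losing the page. If it reports invalid UTF-8, the page is probably served in another encoding.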