PHP: Validating a URL with Curl


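Here is my problem. When I use the code below and set $fileSource = "http://google.com"; it does not work, whereas if I set it to $fileSource = "http://www.google.com/"; it works.

    $fileSource = "http://google.com";
    $ch = curl_init($fileSource);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_exec($ch);
    $retcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($retcode != 200) {
        $error .= "The source specified is not a valid URL.";
    }
    curl_close($ch);

What is the problem?

Try explicitly telling curl to follow redirects: set CURLOPT_FOLLOWLOCATION to true on the handle before calling curl_exec().

If that doesn't work, you may need to spoof a user agent for some sites.

Also, if they are using JS redirects, you are out of luck.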
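
The suggestion above could be sketched as follows, assuming PHP 7 or later with the curl extension. The function names finalStatusCode and isSuccessStatus, the user-agent string, the redirect limit, and the timeout are illustrative assumptions and not part of the original post. Treating any 2xx code as valid is likewise an assumption.

```php
<?php
// Hypothetical helper: treats any 2xx final status as "valid" (an assumption).
function isSuccessStatus(int $code): bool
{
    return $code >= 200 && $code < 300;
}

// Hypothetical helper: sends a HEAD request, follows redirects, and returns
// the status code of the last response (0 if no response was received).
function finalStatusCode(string $url, string $userAgent = 'Mozilla/5.0 (compatible; UrlCheck/1.0)'): int
{
    $ch = curl_init($url);
    if ($ch === false) {
        return 0;
    }
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request, as in the question
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow 301/302 redirects
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);          // illustrative limit against redirect loops
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent); // illustrative user-agent string
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);           // illustrative timeout in seconds
    curl_exec($ch);
    // The handle is released when $ch goes out of scope.
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Usage (performs a network request):
// if (!isSuccessStatus(finalStatusCode("http://google.com"))) {
//     echo "The source specified is not a valid URL.";
// }
```

With redirects followed, CURLINFO_HTTP_CODE reports the status of the final response rather than the initial 301.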


What you're seeing is actually the result of a 301 redirect. Here is what I got back using verbose curl from the command line:

    curl -vvvvvv http://google.com
    * About to connect() to google.com port 80 (#0)
    *   Trying 173.194.43.34...
    * connected
    * Connected to google.com (173.194.43.34) port 80 (#0)
    > GET / HTTP/1.1
    > User-Agent: curl/7.25.0 (x86_64-apple-darwin11.3.0) libcurl/7.25.0 OpenSSL/1.0.1b zlib/1.2.6 libidn/1.22
    > Host: google.com
    > Accept: */*
    > 
    < HTTP/1.1 301 Moved Permanently
    < Location: http://www.google.com/
    < Content-Type: text/html; charset=UTF-8
    < Date: Fri, 04 May 2012 04:03:59 GMT
    < Expires: Sun, 03 Jun 2012 04:03:59 GMT
    < Cache-Control: public, max-age=2592000
    < Server: gws
    < Content-Length: 219
    < X-XSS-Protection: 1; mode=block
    < X-Frame-Options: SAMEORIGIN
    < 
    <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
    <TITLE>301 Moved</TITLE></HEAD><BODY>
    <H1>301 Moved</H1>
    The document has moved
    <A HREF="http://www.google.com/">here</A>.
    </BODY></HTML>
    * Connection #0 to host google.com left intact
    * Closing connection #0

However, if you curl the actual www.google.com, as suggested by the 301 redirect, you get the following:

    curl -vvvvvv http://www.google.com
    * About to connect() to www.google.com port 80 (#0)
    *   Trying 74.125.228.19...
    * connected
    * Connected to www.google.com (74.125.228.19) port 80 (#0)
    > GET / HTTP/1.1
    > User-Agent: curl/7.25.0 (x86_64-apple-darwin11.3.0) libcurl/7.25.0 OpenSSL/1.0.1b zlib/1.2.6 libidn/1.22
    > Host: www.google.com
    > Accept: */*
    > 
    < HTTP/1.1 200 OK
    < Date: Fri, 04 May 2012 04:05:25 GMT
    < Expires: -1
    < Cache-Control: private, max-age=0
    < Content-Type: text/html; charset=ISO-8859-1

I truncated the remainder of Google's response, just to show the primary difference between the 200 OK and the 301 redirect.

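
The same difference can be read programmatically from the response headers. The sketch below is a hypothetical helper (the name parseStatusAndLocation is an assumption, not from the original post). It takes a raw header block, such as the one PHP's curl returns when CURLOPT_HEADER and CURLOPT_RETURNTRANSFER are enabled, and extracts the status code and the Location target.

```php
<?php
// Hypothetical helper: extracts the first status code and the Location header
// from a raw HTTP response header block.
function parseStatusAndLocation(string $rawHeaders): array
{
    $status = null;
    $location = null;
    foreach (preg_split('/\r\n|\n|\r/', trim($rawHeaders)) as $line) {
        if ($status === null && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];            // e.g. 301 or 200
        } elseif (preg_match('/^Location:\s*(.+)$/i', $line, $m)) {
            $location = trim($m[1]);          // redirect target, if any
        }
    }
    return ['status' => $status, 'location' => $location];
}

// Usage with the headers from the first transcript:
// parseStatusAndLocation("HTTP/1.1 301 Moved Permanently\r\nLocation: http://www.google.com/\r\n");
```

A 301 result with a non-null location is exactly the case in which the original check against 200 reports a working URL as invalid.
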
One does a permanent redirect (301) to the www. domain, whereas the other one just replies with OK (200).

Why do you consider only the 200 status code valid? Let Curl handle that for you by setting the CURLOPT_FAILONERROR option to TRUE. From the manual:

    TRUE to fail silently if the HTTP code returned is greater than or equal to 400. The default behavior is to return the page normally, ignoring the code.
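
A minimal sketch of that approach, assuming the option being quoted is CURLOPT_FAILONERROR and that PHP 7 or later with the curl extension is available. The function name sourceIsReachable, the redirect limit, and the timeouts are illustrative assumptions. Redirects are followed here as well, so a 301 to the www. domain does not count as a failure.

```php
<?php
// Hypothetical helper: returns true when the URL answers with a status
// below 400 after following redirects, false otherwise.
function sourceIsReachable(string $url): bool
{
    $ch = curl_init($url);
    if ($ch === false) {
        return false;
    }
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,  // HEAD request, as in the question
        CURLOPT_FOLLOWLOCATION => true,  // follow 301/302 redirects
        CURLOPT_MAXREDIRS      => 5,     // illustrative limit
        CURLOPT_FAILONERROR    => true,  // make curl_exec() fail on HTTP codes >= 400
        CURLOPT_CONNECTTIMEOUT => 5,     // illustrative timeouts in seconds
        CURLOPT_TIMEOUT        => 10,
    ]);
    // Without CURLOPT_RETURNTRANSFER, curl_exec() returns true on success
    // and false on failure (connection errors or HTTP codes >= 400).
    $ok = curl_exec($ch);
    return $ok !== false;
}

// Usage (performs a network request):
// if (!sourceIsReachable("http://google.com")) {
//     echo "The source specified is not a valid URL.";
// }
```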
