PHP: How to get the destination URL using cURL?


How do I get the destination URL using cURL when the HTTP status code is 302?

<?php
$url = "http://www.ecs.soton.ac.uk/news/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$html = curl_exec($ch);
$status_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);

if ($status_code == 302 || $status_code == 301) {
  $url = "";
  // I want to get the destination url
}
curl_close($ch);
?>

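You have to grab the Location header for the redirected URL.
The new destination of a 302 redirect is located in the HTTP header field "Location".
Just grep it with a regex.
To include all HTTP header information in the result, use the cURL option CURLOPT_HEADER. Set it with: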

curl_setopt($c, CURLOPT_HEADER, true);
If you simply want cURL to follow the redirect, use CURLOPT_FOLLOWLOCATION:

curl_setopt($c, CURLOPT_FOLLOWLOCATION, true);

Anyway, you should not use the new URI as a permanent replacement, because HTTP status code 302 is only a temporary redirect.
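As a minimal sketch of the "grep it with a regex" suggestion, assuming the raw headers are already available as a string (for instance from curl_exec with CURLOPT_HEADER enabled). The helper name extract_location and the sample response are made up for illustration; they are not part of cURL.

```php
<?php
/**
 * Return the value of the last Location header in a raw header string,
 * or null when there is none. Header names are matched case-insensitively.
 * (extract_location is a hypothetical helper name.)
 */
function extract_location(string $rawHeaders): ?string
{
    // "m" lets ^ and $ match at line boundaries; "i" ignores case.
    $pattern = '/^Location:[ \t]*(.*?)[ \t]*\r?$/mi';
    if (!preg_match_all($pattern, $rawHeaders, $matches)) {
        return null;
    }
    // If the string holds several redirect hops, the last one wins.
    return end($matches[1]);
}

// Made-up sample response, standing in for real curl_exec output.
$sample = "HTTP/1.1 302 Found\r\n"
        . "Content-Type: text/html\r\n"
        . "Location: http://www.example.com/\r\n\r\n";
$location = extract_location($sample);
```

Taking the last match is a design choice: with CURLOPT_FOLLOWLOCATION enabled the output can contain one header block per hop, and the last Location is the one closest to the destination.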
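Here is a way to get all the headers returned by a cURL HTTP request, together with the status code and an array of header lines for each header block.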

$url = 'http://google.com';
$opts = array(
    CURLOPT_URL => $url,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HEADER => true,
    CURLOPT_FOLLOWLOCATION => true
);
$ch = curl_init();
curl_setopt_array($ch, $opts);
$return = curl_exec($ch);
curl_close($ch);

$headers = http_response_headers($return);
foreach ($headers as $header) {
    $str = parse_status_code($header);
    $hdr_arr = http_response_header_lines($header);
    if (isset($hdr_arr['Location'])) {
        $str .= ' - Location: ' . $hdr_arr['Location'];
    }
    echo $str . '<br />';
}

// Split the raw response into header blocks, one per hop in the redirect chain.
function http_response_headers($ret_str)
{
    $hdrs = array();
    $arr = explode("\r\n\r\n", $ret_str);
    foreach ($arr as $each) {
        if (substr($each, 0, 4) == 'HTTP') {
            $hdrs[] = $each;
        }
    }
    return $hdrs;
}

// Turn one header block into an array keyed by header name.
function http_response_header_lines($hdr_str)
{
    $lines = explode("\n", $hdr_str);
    $hdr_arr = array('status_line' => trim(array_shift($lines)));
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip lines that are not "Name: value" pairs
        }
        list($key, $val) = explode(':', $line, 2);
        $hdr_arr[trim($key)] = trim($val);
    }
    return $hdr_arr;
}

// Extract the three-digit status code from a header block.
// Not named http_response_code(), which is a PHP built-in since 5.4.
function parse_status_code($str)
{
    return substr(trim(strstr($str, ' ')), 0, 3);
}

Use curl_getinfo($ch); the first element (url) of the returned array indicates the effective URL.
You can use:

echo curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);

A somewhat late reply, but I wanted to show a complete working example that combines some of the solutions here:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url); // set url
curl_setopt($ch, CURLOPT_HEADER, true); // get header
curl_setopt($ch, CURLOPT_NOBODY, true); // do not include response body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not show the response in the browser
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow any redirects
curl_exec($ch);
$new_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL); // last URL cURL actually requested
curl_close($ch);

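This works with any redirect, such as 301 or 302; on a 404, however, it simply returns the originally requested URL (because it was not found). This can be used to update or remove links on your site. That was my need, anyway.
In response to user437797's comment on Tamik Soziev's answer (unfortunately I do not have the reputation to comment there directly):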

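CURLINFO_EFFECTIVE_URL works fine, but for it to do what the OP wants you of course also have to set CURLOPT_FOLLOWLOCATION to TRUE. That is because CURLINFO_EFFECTIVE_URL returns exactly what it says: the effective URL that was finally loaded. If you do not follow redirects, that is the URL you requested; if you do follow them, it is the final URL you were redirected to.
The nice thing about this approach is that it also handles multiple redirects, whereas when you retrieve and parse the HTTP headers yourself, you may have to do so several times before the final destination URL is exposed.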

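Also note that the maximum number of redirects cURL follows can be controlled with CURLOPT_MAXREDIRS. libcurl itself was unlimited (-1) by default for a long time, which can get you into trouble if someone (perhaps deliberately) configures an endless redirect loop for some URL. The PHP manual documents a default of 20 for the extension, but it is worth checking what applies to your version.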
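To illustrate why a redirect limit matters when you follow Location headers yourself, here is a rough sketch of a loop with a hop limit. The function resolve_final_url and its $nextLocation callback are hypothetical: in real use the callback would wrap a cURL request that returns the Location value for a URL (or null when there is none); here it is a plain array lookup so the logic can be seen without network access.

```php
<?php
/**
 * Follow a redirect chain by hand, giving up after $maxHops redirects.
 *
 * $nextLocation is a hypothetical callback: it receives a URL and returns
 * the redirect target for it, or null when the URL does not redirect.
 */
function resolve_final_url(string $url, callable $nextLocation, int $maxHops = 10): string
{
    for ($hops = 0; $hops <= $maxHops; $hops++) {
        $target = $nextLocation($url);
        if ($target === null) {
            return $url; // no further redirect: this is the destination
        }
        $url = $target;
    }
    throw new RuntimeException("Gave up after more than $maxHops redirects");
}

// Made-up redirect table standing in for real HTTP requests.
$chain = array(
    'http://a.example/' => 'http://b.example/',
    'http://b.example/' => 'http://c.example/',
);
$lookup = function (string $u) use ($chain): ?string {
    return $chain[$u] ?? null;
};
$final = resolve_final_url('http://a.example/', $lookup);
```

When cURL follows the redirects for you, the equivalent safeguard is an explicit limit such as curl_setopt($ch, CURLOPT_MAXREDIRS, 5);.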
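Any luck with your other unresolved questions? You should accept the correct answers.
(-1) Parsing the Location header takes extra hassle, such as checking whether the value is relative and resolving it (possibly against the previous base URL if there are several intermediate redirects, and so on); CURLINFO_EFFECTIVE_URL is easier to use.
This method is cleaner and generally better than parsing the URL out of the Location header.
CURLINFO_EFFECTIVE_URL returns the current (requested) page for me. There is no redirect (Location:) URL in the curl_getinfo result. It seems that parsing the headers is the best practice...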
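One of the comments above points out that a Location value may be relative and then has to be resolved against the URL that produced it. The following is a simplified sketch of that step, not a full RFC 3986 resolver: resolve_location is a hypothetical helper, it does not normalise "./" or "../" segments, and it does not handle query-only or fragment-only references.

```php
<?php
/**
 * Resolve a Location header value against the URL that returned it.
 * Handles absolute URLs, scheme-relative ("//host/x"), root-relative ("/x")
 * and plain relative ("x") references only. Userinfo in the base is ignored.
 */
function resolve_location(string $baseUrl, string $location): string
{
    // Already absolute: starts with a scheme followed by ":".
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $location)) {
        return $location;
    }
    $base = parse_url($baseUrl);
    if ($base === false || !isset($base['scheme'], $base['host'])) {
        throw new InvalidArgumentException("Base URL must be absolute: $baseUrl");
    }
    // Scheme-relative: reuse only the scheme of the base URL.
    if (strncmp($location, '//', 2) === 0) {
        return $base['scheme'] . ':' . $location;
    }
    $origin = $base['scheme'] . '://' . $base['host']
        . (isset($base['port']) ? ':' . $base['port'] : '');
    // Root-relative: keep scheme, host and port; replace the whole path.
    if (strncmp($location, '/', 1) === 0) {
        return $origin . $location;
    }
    // Plain relative: replace everything after the last "/" of the base path.
    $path = $base['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $location;
}

$resolved = resolve_location('http://example.com/news/old', '/archive/');
```

With several hops, each Location has to be resolved against the URL of the hop that sent it, which is part of why the commenters find CURLINFO_EFFECTIVE_URL less work.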
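CURLINFO_EFFECTIVE_URL does not always work, especially when the site does not redirect via a header.
For those who get the current (requested) page back: use this code after calling curl_exec($ch).
Nice, much cleaner than parsing it myself. Thanks for sharing!
Perfect! Thanks for sharing.
What if there is no Location header? Sometimes a website redirects the page with a meta redirect or window.location.replace. In that case, adapt the regular expression in the following code to capture the target: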
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, TRUE); // We'll parse redirect url from header.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE); // We want to just get redirect url but not to follow it.
$response = curl_exec($ch);
preg_match_all('/^Location:(.*)$/mi', $response, $matches);
curl_close($ch);
echo !empty($matches[1]) ? trim($matches[1][0]) : 'No redirect found';
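For the case mentioned above where the page redirects without a Location header, here is a rough sketch of what the adapted regular expressions might look like. Both helper names are hypothetical, the sample HTML is made up, and regex matching on HTML is only a heuristic: the meta pattern assumes http-equiv comes before content inside the tag, and the script pattern only catches a literal string argument.

```php
<?php
/**
 * Find <meta http-equiv="refresh" content="N; url=..."> in an HTML string
 * and return the target URL, or null if no such tag is found.
 */
function extract_meta_refresh(string $html): ?string
{
    $pattern = '/<meta[^>]+http-equiv\s*=\s*["\']?refresh["\']?[^>]*'
             . 'content\s*=\s*["\']?\s*\d+\s*[;,]\s*url\s*=\s*["\']?([^"\'>\s]+)/i';
    // Attribute values may contain entities such as &amp;, so decode them.
    return preg_match($pattern, $html, $m) ? html_entity_decode($m[1]) : null;
}

/**
 * Find window.location.replace("...") with a quoted literal argument
 * and return that argument, or null if there is none.
 */
function extract_js_redirect(string $html): ?string
{
    $pattern = '/window\.location\.replace\(\s*["\']([^"\']+)["\']\s*\)/';
    return preg_match($pattern, $html, $m) ? $m[1] : null;
}

// Made-up page body, standing in for the $response returned by curl_exec.
$body = '<html><head><meta http-equiv="refresh" '
      . 'content="0; url=http://www.example.com/new"></head></html>';
$target = extract_meta_refresh($body);
```

These would be applied to the response body rather than the headers, so CURLOPT_NOBODY must not be set for that request.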
