PHP array_search returns false when the value is in the array


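I'm writing a URL scraper (just the name and description) and am trying to handle 301 redirects.

Right now I check the headers, and if the status is not 200 I try to find the location to redirect to in the headers. My problem is that array_search does not return the key that holds the Location value, even though I can see it in the array.

Here is the code snippet: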

if(strpos($url_headers[0], "200") !== false){
        echo "in here";
        return $url;
    }else{
        print_r($url_headers);
        //look for location
        $location_key = array_search("Location: ", $url_headers);
        echo "Location Key: " . $location_key;
        $redirect_string = $url_headers[$location_key];
        $clean_url = str_replace("Location: ", "", $redirect_string);
        return $clean_url;
    }
Its output is:

Array ( [0] => HTTP/1.0 301 Moved Permanently [1] => Location: http://www.google.com/ [2] => Content-Type: text/html; charset=UTF-8 [3] => Date: Wed, 13 Feb 2013 03:30:00 GMT [4] => Expires: Fri, 15 Mar 2013 03:30:00 GMT [5] => Cache-Control: public, max-age=2592000 [6] => Server: gws [7] => Content-Length: 219 [8] => X-XSS-Protection: 1; mode=block [9] => X-Frame-Options: SAMEORIGIN [10] => HTTP/1.0 200 OK [11] => Date: Wed, 13 Feb 2013 03:30:00 GMT [12] => Expires: -1 [13] => Cache-Control: private, max-age=0 [14] => Content-Type: text/html; charset=ISO-8859-1 [15] => Set-Cookie: PREF=ID=fe86e29432d4e240:FF=0:TM=1360726200:LM=1360726200:S=Wg8VEU7kc7UtcKc-; expires=Fri, 13-Feb-2015 03:30:00 GMT; path=/; domain=.google.com [16] => Set-Cookie: NID=67=KH8Zu8EpKjrhje8nD0lk_868mqvQr9pGwsAsaUuPDD_PRUgohJHoOkdlyYEHWmohUtndyENDJ0oZq8pC1aqOg20anXpUn5btQX5GYM6kYlgMhYxIPajtGp9KymmMDO1Y; expires=Thu, 15-Aug-2013 03:30:00 GMT; path=/; domain=.google.com; HttpOnly [17] => P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info." [18] => Server: gws [19] => X-XSS-Protection: 1; mode=block [20] => X-Frame-Options: SAMEORIGIN ) Location Key: {"error":"invalid_url","error_code":null}
What am I doing wrong? Is there a more elegant way to handle redirects when scraping user-supplied links?

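array_search() returns false when no match is found, and it only matches an element that equals the needle as a whole, so a bare "Location: " prefix never matches a full header line. You need to do this instead: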

$url_headers[0] = 'HTTP/1.0 200';
if(strpos($url_headers[0], "200") > 0){
  echo "here";
} else {
  //look for location
   $location_key = getLocation($url_headers);
   echo "Location Key: " . $location_key;
}

function getLocation($data) {
    $url = false;
    foreach($data as $key => $value) {
        if (preg_match("/Location:/", $value)) {
             echo "A match was found.";
            //$url = $matches[1];
            $url = $data[$key];
            break;
        }
    }
    return $url;
}
if( ! strpos($url_headers[0], "200"))
This mostly works, but redirects to https are still troublesome (for some reason they need a double redirect?).
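The header dump in the question contains two response blocks (a 301 followed by a 200), so a chain of redirects can carry more than one Location header. Below is a minimal sketch, assuming a header array shaped like the one printed above; `findLastLocation` is a hypothetical helper name, not part of PHP. It matches on the header prefix rather than the whole element and keeps the last match, which is the target of the final hop.

```php
<?php
/**
 * Return the target of the last "Location:" header in a header array,
 * or null when no such header exists. (Hypothetical helper for illustration.)
 */
function findLastLocation(array $headers): ?string
{
    $location = null;
    foreach ($headers as $header) {
        // Header names are case-insensitive, so match the prefix with stripos().
        if (is_string($header) && stripos($header, 'Location:') === 0) {
            // Keep overwriting so the last redirect in the chain wins.
            $location = trim(substr($header, strlen('Location:')));
        }
    }
    return $location;
}

// Shortened version of the header array shown in the question.
$headers = [
    'HTTP/1.0 301 Moved Permanently',
    'Location: http://www.google.com/',
    'Content-Type: text/html; charset=UTF-8',
    'HTTP/1.0 200 OK',
];

// array_search() compares whole elements, so the bare prefix is not found.
var_dump(array_search('Location: ', $headers));

// Prefix matching finds the header regardless of its position in the array.
echo findLastLocation($headers), PHP_EOL;
```

Returning null instead of false when nothing matches avoids the trap in the original snippet, where a false key used as an array index silently reads element 0.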


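Comments:

Doesn't curl have a mechanism for following redirect headers automatically?

I tried implementing curl; if there is an automatic redirect mechanism, it does not seem to take effect.

The problem is that array_search does not find the key for "Location: ", even though a few lines earlier the array is shown to contain that value.

The array key that holds "Location:" varies between websites. Indexing it directly was my first attempt; then I moved on to array_search to capture the right value wherever it happens to be.

Please try the update now; it should solve your http and https problems:
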
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
        $a = curl_exec($ch);
        if(preg_match('#Location: (.*)#', $a, $r)){
         $l = trim($r[1]);
         return $l;
        }else{
            return $url;
        }
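
On the question of whether curl can follow redirects by itself: it can, when CURLOPT_FOLLOWLOCATION is enabled. The following is a minimal sketch rather than a definitive implementation; `resolveFinalUrl` is a hypothetical helper name, and the redirect limit and timeout are arbitrary illustrative values. It lets curl walk the whole chain (including an http to https hop) and then reads the final address from CURLINFO_EFFECTIVE_URL.

```php
<?php
/**
 * Follow redirects with cURL and return the final URL, or null on failure.
 * (Hypothetical helper; the limits below are assumed values, not requirements.)
 */
function resolveFinalUrl(string $url, int $maxRedirects = 5): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // Ask for headers only; some servers handle HEAD poorly, so drop this if needed.
    curl_setopt($ch, CURLOPT_NOBODY, true);
    // Let cURL follow Location headers itself, up to a fixed number of hops.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, $maxRedirects);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    if (curl_exec($ch) === false) {
        return null; // network error, bad URL, or too many redirects
    }

    // The URL of the last request in the chain.
    $final = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    return (is_string($final) && $final !== '') ? $final : null;
}
```

A call such as `resolveFinalUrl($url)` would replace the manual header parsing; the caller only has to handle the null case for links that cannot be resolved.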