PHP cURL: get the contents of a URL containing "ü" (U+00FC, %C3%BC)


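I am trying to fetch information about food products: name, picture, price and so on.

All other URLs work fine, and the cURL response is exactly as expected.

The problem I have is with URLs that contain accented Latin / non-standard URL / non-English characters such as ü or è.

I have tried everything I can think of, but there may be a simple solution I have missed: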

stringtest.php?url=http://www.sainsburys.co.uk/shop/gb/groceries/desserts/g%C3%BC-lemon-pots-3x45g
stringtest.php?url=http://www.sainsburys.co.uk/shop/gb/groceries/desserts/gü-lemon-pots-3x45g
stringtest.php?url=http%3A%2F%2Fwww.sainsburys.co.uk%2Fshop%2Fgb%2Fgroceries%2Fdesserts%2Fg%C3%BC-lemon-pots-3x45g
This is the code I use to test cURL:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
  </head>
  <body>
<?php
  $url = $_GET['url'];

  echo curlUrl($url);

  function curlUrl($url){
    $ch = curl_init();
    $timeout = 5;
    $cookie_file = "/tmp/cookie/cookie1.txt";
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $html = curl_exec($ch);
    curl_close($ch);

    return $html;
  }

?>
  <form action="stringtest.php" method="get" id="process">
    <input type="text" name="url" placeholder="Url" autofocus>
    <input type="submit">
  </form>
  </body>
</html>

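The result I get from cURL is Sainsbury's 404 page, saying the page cannot be found. Copying /gü-lemon-pots-3x45g from the URL bar copies the URL-encoded version of ü (%C3%BC), as expected. When the URL is entered in a browser, both ü and %C3%BC reach the actual product page, so why does Sainsbury's return a 404 when the request comes from cURL?

I have tried various things, such as sending the exact headers the browser sends and calling urldecode(), but nothing has worked.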
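To rule out encoding as a variable, the URL can be normalized so that cURL always receives the percent-encoded form, whichever variant was typed into the form. The following is a minimal sketch; `encodeNonAscii` is a hypothetical helper that is not part of the original code, and it assumes the input is UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): percent-encode every
// byte outside printable ASCII. Existing %XX escapes consist of ASCII
// characters only, so they are left untouched.
function encodeNonAscii(string $url): string
{
    $encoded = preg_replace_callback(
        '/[^\x21-\x7E]/',                    // no /u flag: match byte by byte
        static function (array $m): string {
            return rawurlencode($m[0]);      // one byte becomes one %XX escape
        },
        $url
    );

    return $encoded ?? $url;                 // fall back to the input on a regex error
}

// "\xC3\xBC" is ü written out as its two UTF-8 bytes.
$raw = "http://www.sainsburys.co.uk/shop/gb/groceries/desserts/g\xC3\xBC-lemon-pots-3x45g";
echo encodeNonAscii($raw), PHP_EOL;
```

With this in place, the literal ü form and the %C3%BC form both produce the same request URL, so any remaining 404 has to come from something other than the encoding.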

It seems to be an issue with the Sainsbury's site itself.

If you do not send a valid cookie, the server returns a 404.

Have you tried reloading?

I tried:

stringtest.php?url=http://www.sainsburys.co.uk/shop/gb/groceries/desserts/gü-chocolate-ganache-pots-3x45g
and it works with a valid cookie.
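One thing worth checking on the PHP side is whether the cookie file is ever written: the cookie jar is only saved when the cURL handle is released, and it cannot be saved if its directory (here /tmp/cookie) does not exist. Below is a minimal sketch of a cookie-aware fetch; `cookieAwareOptions` and `fetchWithCookies` are hypothetical names, and the option values mirror the ones in the question.

```php
<?php
// Hypothetical helper: the cURL options for a request that keeps cookies.
function cookieAwareOptions(string $url, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved here when the handle is released
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are loaded from here before the request
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 5,
    ];
}

// Hypothetical helper: fetch a URL, creating the cookie directory if needed.
function fetchWithCookies(string $url, string $cookieFile)
{
    $dir = dirname($cookieFile);
    if (!is_dir($dir) && !mkdir($dir, 0700, true)) {
        throw new RuntimeException("Cannot create cookie directory: $dir");
    }

    $ch = curl_init();
    curl_setopt_array($ch, cookieAwareOptions($url, $cookieFile));
    $html = curl_exec($ch);   // string on success, false on failure
    unset($ch);               // releasing the handle flushes the cookie jar

    return $html;
}
```

Calling `fetchWithCookies` twice with the same cookie file, first on a landing page and then on the product URL, reuses whatever session cookie the first response set. Whether that is enough to satisfy this particular site is an assumption that has not been tested here.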

If you try:

wget http://www.sainsburys.co.uk/shop/gb/groceries/desserts/g%C3%BC-lemon-pots-3x45g
the response is:

http://www.sainsburys.co.uk/shop/gb/groceries/bakery
Resolving www.sainsburys.co.uk (www.sainsburys.co.uk)... 109.94.142.1
Connecting to www.sainsburys.co.uk (www.sainsburys.co.uk)|109.94.142.1|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.sainsburys.co.uk/webapp/wcs/stores/servlet/gb/groceries/bakery?langId=44&storeId=10151&krypto=xbYM3SJja%2F1mDOxJIVlKl9vZN6zjdlTL4MSiHOKiUMQoum9OkLwoTv6wj27CjUXwqM4%2BsteXag0O%0AQOWiHuS8onFdmoVLWlJyZ7hXaMhcMW9MIMMAsnPdWTPEzSEnOP5a&ddkey=http:AjaxAutoCompleteDisplayView [following]
--2014-10-07 11:56:11-- http://www.sainsburys.co.uk/webapp/wcs/stores/servlet/gb/groceries/bakery?langId=44&storeId=10151&krypto=xbYM3SJja%2F1mDOxJIVlKl9vZN6zjdlTL4MSiHOKiUMQoum9OkLwoTv6wj27CjUXwqM4%2BsteXag0O%0AQOWiHuS8onFdmoVLWlJyZ7hXaMhcMW9MIMMAsnPdWTPEzSEnOP5a&ddkey=http:AjaxAutoCompleteDisplayView
Reusing existing connection to www.sainsburys.co.uk:80.
HTTP request sent, awaiting response... 200 OK
To follow redirects in curl, use the -L flag:

curl -L http://www.sainsburys.co.uk/shop/gb/groceries/desserts/g%C3%BC-lemon-pots-3x45g
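The PHP code in the question already follows redirects through CURLOPT_FOLLOWLOCATION, so the equivalent check there is to look at what `curl_getinfo()` reports after the transfer. The sketch below formats the relevant fields; `summarizeTransfer` is a hypothetical helper, and the sample array uses placeholder values rather than real output from the site.

```php
<?php
// Hypothetical helper: format the parts of curl_getinfo() that show
// whether a request was redirected and where it ended up.
function summarizeTransfer(array $info): string
{
    return sprintf(
        'HTTP %d after %d redirect(s), final URL: %s',
        $info['http_code'] ?? 0,
        $info['redirect_count'] ?? 0,
        $info['url'] ?? '(unknown)'
    );
}

// With a live handle (not executed here):
//   curl_exec($ch);
//   echo summarizeTransfer(curl_getinfo($ch));

// Placeholder values, for illustration only.
echo summarizeTransfer([
    'http_code'      => 200,
    'redirect_count' => 1,
    'url'            => 'http://example.com/final',
]), PHP_EOL;
```

A final URL that differs from the requested one, or a 404 status with no redirects, makes it easier to tell a redirect problem apart from the missing-cookie case described above.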

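Thanks for the reply. When I include the cookie information from the browser request, it responds as expected. It seems strange that a cookie is not otherwise required, but is for URLs like the one in question!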