Java: Can't get the Set-Cookie header from an HTTP response

Tags: java, cookies, http-headers, web-scraping, response-headers

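I am developing a small web content scraper. Part of the code sends an HTTP request and reads the cookie from the response headers so that it can be set on subsequent requests. The code that gets the cookies looks like this: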

    import java.io.IOException;

    import org.apache.http.Header;
    import org.apache.http.HttpResponse;
    import org.apache.http.client.ClientProtocolException;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.HttpClientBuilder;

    // `url` and `cookie` are String variables defined elsewhere in the program.
    HttpClient client = HttpClientBuilder.create().build();
    HttpGet request = new HttpGet(url);

    request.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
    request.setHeader("Accept-Encoding", "gzip,deflate,sdch");

    // Send back the cookie from an earlier response, if there is one.
    if (cookie != null) {
        request.setHeader("Cookie", cookie);
    }

    request.setHeader("Accept-Language", "en-US,en;q=0.8,zh-CN;q=0.6");
    request.setHeader("Cache-Control", "max-age=0");
    request.setHeader("Connection", "keep-alive");
    request.setHeader("Host", "www.booking.com");
    request.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) "
            + "AppleWebKit/537.36 (KHTML, like Gecko) "
            + "Chrome/32.0.1700.76 Safari/537.36");

    try {
        HttpResponse response = client.execute(request);
        int statusCode = response.getStatusLine().getStatusCode();
        System.out.println(statusCode);

        // Print every response header, including Set-Cookie if present.
        Header[] headers = response.getAllHeaders();
        for (Header header : headers) {
            System.out.println("Key : " + header.getName()
                    + " ,Value : " + header.getValue());
        }

        System.out.println("----------------------------------------------------------");
    } catch (ClientProtocolException e) {
        // HTTP protocol error
        e.printStackTrace();
    } catch (IOException e) {
        // Connection problem or aborted request
        e.printStackTrace();
    }
The URL I used for testing is

When I run it on my own computer, the printed result is as follows:

    200
    Key : Server ,Value : nginx
    Key : Date ,Value : Mon, 03 Feb 2014 05:15:41 GMT
    Key : Content-Type ,Value : text/html; charset=UTF-8
    Key : Connection ,Value : keep-alive
    Key : Cache-Control ,Value : private
    Key : Vary ,Value : User-Agent, Accept-Encoding
    Key : Set-Cookie ,Value : bkng=11UmFuZG9tSVYkc2RlIyh9YdMHS7ByVcpJ6zdHwCKMHsY37i1DyVPCutMoSY%2F9OR7ixF74JFUj1%2BJ3pF8ntbVX55kLQJvNnfE6Qco2NDwnHPzomws7z40vIxLRgwBTWU9CTbAN3zZqJGksaPN3GqHpSWJ%2BMIKlI5hQN6ZcJnKsU3rR9KXmRVS4plyPQf4gqmsjR131%2BtuuBiULzmDsKzejJZg%2BFgWWUOWS71bCxUGvJbeBBo1HRmUVmigKDEyHylYplnhKkriMof25dYccWyLQoBjIyUL4QZWr58O5D7fKPHDYWSY9y7k%2Bxfk7irIsyKdu%2B0owjpGp2%2BncNdphtqPZqdpeCyky1ReSjWVQ4QuZemceNGmfZGwxm%2BQxu0%2BkBEsJA5zY%2BoqulR8MJIBKZpFqsuvbeDZ9r5UJzl5c%2Fqk7Vw5YU1I%2FQunbw7PHra7IaGp6%2BmHnH2%2BeyiMDhAjWL769ebuwG2DhrgfB6eI0AGZE%2F6T0uA4j7bxA%2FwUdhog6yOu%2FSeTkPl%2FTAiIetVyKLfT1949ggWKfk1kGzmjnowOlZzPbxr1L%2FAifBjInWZ6DreY1Mr2A3%2BfjFYaHJYnS8VpB%2BZappBpGXBUVfHe%2FQ7lbDwNd6TCCzigpsb17LtvFYsb3JiZ%2BQFF82ILNwWFKz6B1xxEEbCRVoq8N%2FcXXPStyGSwApHZz%2Bew6LNI7Hkd2rjB1w3HenUXprZWR3XiWIWYyhMAbkaFbiQV2LThkl2Dkl%2FA%3D; domain=.booking.com; path=/; expires=Sat, 02-Feb-2019 05:15:41 GMT; HTTPOnly
    Key : X-Recruiting ,Value : Like HTTP headers? Come write ours: booking.com/jobs
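
For reference, here is a minimal sketch of how a Set-Cookie value like the one above could be turned into the value of a Cookie request header. Only the leading name=value pair is sent back. Attributes such as domain, path, expires and HTTPOnly are instructions for the client and are not echoed. The class name, method names and shortened cookie values are hypothetical, and the sketch uses plain string handling rather than HttpClient's own cookie support.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original program.
class CookieHeaderSketch {

    /** Returns the "name=value" part of one Set-Cookie value, dropping its attributes. */
    static String nameValuePair(String setCookieValue) {
        int semicolon = setCookieValue.indexOf(';');
        String pair = semicolon >= 0
                ? setCookieValue.substring(0, semicolon)
                : setCookieValue;
        return pair.trim();
    }

    /** Joins the pairs from several Set-Cookie values into one Cookie header value. */
    static String toCookieHeader(List<String> setCookieValues) {
        List<String> pairs = new ArrayList<>();
        for (String value : setCookieValues) {
            String pair = nameValuePair(value);
            if (!pair.isEmpty()) {
                pairs.add(pair);
            }
        }
        return String.join("; ", pairs);
    }

    public static void main(String[] args) {
        // Shortened, made-up cookie values in the same shape as the header above.
        List<String> received = Arrays.asList(
                "bkng=abc123; domain=.booking.com; path=/; expires=Sat, 02-Feb-2019 05:15:41 GMT; HTTPOnly",
                "lang=en; path=/");
        System.out.println(toCookieHeader(received)); // prints "bkng=abc123; lang=en"
    }
}
```

The resulting string is what would be passed to `request.setHeader("Cookie", ...)` in the code above. If no Set-Cookie header arrives at all, the list is empty and the result is an empty string.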
However, when I uploaded this small program to my server and ran it there, the result was:

    200
    Key : Server ,Value : nginx
    Key : Date ,Value : Mon, 03 Feb 2014 05:14:14 GMT
    Key : Content-Type ,Value : text/html; charset=UTF-8
    Key : Connection ,Value : keep-alive
    Key : Cache-Control ,Value : private
    Key : Vary ,Value : User-Agent, Accept-Encoding
    Key : X-Recruiting ,Value : Like HTTP headers? Come write ours: booking.com/jobs

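The Set-Cookie header is gone, and my subsequent requests for other content pages on the same site (pages that the JavaScript in the first page I requested is supposed to load) all returned 400 errors. I guess this is because the cookie is missing. I don't know why this happens. The differences I know of between my PC and the server are:

  • My PC runs Windows 7 and actually has a Chrome browser installed, while the server runs Linux and has no real browser.
  • The IP addresses are different.

Apart from these, I can't think of anything else.

Any suggestions for solving this problem would be appreciated. Thanks.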
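
To confirm which headers differ between the two runs without comparing the printed output by eye, the header names from each response could be collected and diffed. Below is a minimal sketch using only the JDK. The class and method names are hypothetical, and the name lists are copied from the two outputs above. HTTP header names are case-insensitive, so the comparison ignores case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of the original program.
class HeaderDiffSketch {

    /** Header names present in `expected` but absent from `actual`, ignoring case. */
    static Set<String> missingHeaders(List<String> expected, List<String> actual) {
        Set<String> missing = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        missing.addAll(expected);
        Set<String> seen = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        seen.addAll(actual);
        missing.removeAll(seen);
        return missing;
    }

    public static void main(String[] args) {
        // Header names printed on the PC and on the server, respectively.
        List<String> onPc = Arrays.asList("Server", "Date", "Content-Type", "Connection",
                "Cache-Control", "Vary", "Set-Cookie", "X-Recruiting");
        List<String> onServer = Arrays.asList("Server", "Date", "Content-Type", "Connection",
                "Cache-Control", "Vary", "X-Recruiting");
        System.out.println(missingHeaders(onPc, onServer)); // prints "[Set-Cookie]"
    }
}
```

In the real program the two lists would be built from `header.getName()` in the loop over `response.getAllHeaders()`.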

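Please update your question with the HTTP request that is actually sent. I think there is some difference between what you intend to send and what is actually sent. You can capture the request with tools such as ngrep or fiddler.

@Majid L Because I am using a cloud server, I can't capture the "actual" HTTP request that the server sends out. fiddler would only capture the request sent by the virtual server, and that request is exactly what I included in the question.

@UserName I suppose you meant @npcode, not me :)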