Python Scrapy 404 error: HTTP status code is not handled or not allowed when scraping a web page

python scrapy

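I'm trying to scrape a subreddit with Scrapy, but every time I run the spider I get a 404 error:

2020-01-07 12:21:46 [scrapy.spidermiddleware.httperror] INFO: Ignoring response: HTTP status code is not handled or not allowed

The code I'm currently using: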

import scrapy 
class RedditbotSpider(scrapy.Spider):
    name = 'redditbot'
    allowed_domains = ['www.reddit.com/r/gameofthrones/']
    start_urls = ['http://www.reddit.com/r/gameofthrones//']

    def parse(self, response):
        #Extracting the content using css selectors
        titles = response.css('.title.may-blank::text').extract()
        votes = response.css('.score.unvoted::text').extract()
        times = response.css('time::attr(title)').extract()
        comments = response.css('.comments::text').extract()

        #Give the extracted content row wise
        for item in zip(titles,votes,times,comments):
            #create a dictionary to store the scraped info
            scraped_info = {
                'title' : item[0],
                'vote' : item[1],
                'created_at' : item[2],
                'comments' : item[3],
            }

            #yield or give the scraped info to scrapy
            yield scraped_info

After changing USER_AGENT in settings.py, I tried running the spider again, but I hit the same problem.
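
For reference, a user-agent override in settings.py is a single assignment. The fragment below is only a sketch: the string actually tried is not shown in the post, so the value here is a made-up placeholder.

```python
# settings.py (fragment)
# Hypothetical value for illustration; the string used in the question is not shown.
USER_AGENT = "Mozilla/5.0 (compatible; redditbot-example/0.1)"
```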

Check your URL: http://www.reddit.com/r/gameofthrones//

Are you sending any headers with the request?

No, I'm not. How should I do that?