Python: why can't error messages be logged to the specified file?
Platform: Debian 8 + Python 3.4 + Scrapy 1.3.2. This is my spider, which downloads some URLs from yahoo.com:
    import scrapy

    class TestSpider(scrapy.Spider):
        name = "quote"
        allowed_domains = ["yahoo.com"]
        start_urls = ['url1', 'url2', 'url3', ..., 'url100']

        def parse(self, response):
            filename = response.url.split("=")[1]
            with open('/tmp/' + filename + '.csv', 'wb') as f:
                f.write(response.body)
Running it produces some error messages:
2017-02-19 21:28:27 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response
<404 https://chart.yahoo.com/table.csv?s=GLU>: HTTP status code is not handled or not allowed
Why are error messages like `2017-02-19 21:28:27 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 ...>: HTTP status code is not handled or not allowed` shown on the console but not recorded in /home/log.txt? Following eLRuLL's hint, I added handle_httpstatus_list = [404]:
    import scrapy
    import logging
    from scrapy.utils.log import configure_logging

    configure_logging(install_root_handler=False)
    logging.basicConfig(
        filename='/home/log.txt',
        format='%(levelname)s: %(message)s',
        level=logging.INFO
    )

    class TestSpider(scrapy.Spider):
        handle_httpstatus_list = [404]
        name = "quote"
        allowed_domains = ["yahoo.com"]
        start_urls = ['url1', 'url2', 'url3', ..., 'url100']

        def parse(self, response):
            filename = response.url.split("=")[1]
            with open('/tmp/' + filename + '.csv', 'wb') as f:
                f.write(response.body)
The error messages still cannot be logged to /home/log.txt. Why?

Use the handle_httpstatus_list attribute on the spider to handle the 404 status:
    class TestSpider(scrapy.Spider):
        handle_httpstatus_list = [404]
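If the goal is simply to get Scrapy's log output into a file, an alternative sketch is to rely on Scrapy's documented LOG_FILE and LOG_LEVEL settings instead of wiring up logging.basicConfig by hand (the per-spider custom_settings dict shown here is one way to set them; a settings.py entry works too):

```python
import scrapy

class TestSpider(scrapy.Spider):
    name = "quote"
    allowed_domains = ["yahoo.com"]
    # Scrapy's built-in settings route its log output to a file;
    # with these in place, no manual logging.basicConfig call is needed.
    custom_settings = {
        "LOG_FILE": "/home/log.txt",
        "LOG_LEVEL": "INFO",
    }
```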