Python scrapy script CannotListenError


Good day everyone!

I have a simple parser based on the Scrapy framework. Here is the core code:

#!/usr/bin/python
# -*- coding: utf-8 -*-
import sys, os, logging
from utils import append_project_to_python_path, load_spiders
from scrapy.utils.log import configure_logging
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

PATH = os.path.dirname(os.path.realpath(sys.argv[0]))
append_project_to_python_path()
os.environ['DJANGO_SETTINGS_MODULE'] = 'delta_parser.settings'             #add django settings to the project
os.environ.setdefault('SCRAPY_SETTINGS_MODULE', 'scrapy_parser.settings')  #add path to scrapy settings to the project

# Settings for logging
configure_logging(install_root_handler=False)
logging.basicConfig(
    filename = PATH + '/output/delta_scraper.log',
    filemode = 'w+b',
    format = '%(asctime)s [%(name)s] %(levelname)s: %(message)s',
)

# Scrapy settings and spiders
settings = get_project_settings()
process = CrawlerProcess(settings)
spiders = load_spiders()
map(process.crawl, spiders)      # attaches the spiders to the crawling process (eager on Python 2.7; Python 3 would need list(map(...)))

# Commented block for testing chosen spiders from this script
#from scrapy_parser.spiders.company_revsite_spider import CompanyRevsiteSpider
#process.crawl(CompanyRevsiteSpider)

process.start() # the script will block here until all crawling jobs are finished
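For completeness, load_spiders is roughly equivalent to the following (a simplified sketch using Scrapy's stock SpiderLoader; the real helper may differ in details):

from scrapy.spiderloader import SpiderLoader
from scrapy.utils.project import get_project_settings

def load_spiders():
    # Enumerate every spider name registered under SPIDER_MODULES
    # and resolve each name to its spider class, so that each class
    # can be passed to process.crawl().
    loader = SpiderLoader.from_settings(get_project_settings())
    return [loader.load(name) for name in loader.list()]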
It loads about 30 spider classes from the spiders folder, attaches them to the crawling process, and writes the scraped items to the database through a middleware configured in the Scrapy settings. This simple scheme worked fine for quite a while, but now I am getting errors like these:

    CannotListenError: Couldn't listen on 127.0.0.1:6073: [Errno 98] Address already in use.
    AttributeError: TelnetConsole instance has no attribute 'port'

I haven't added any spiders or changed the settings recently, and no program was using 127.0.0.1:6073 before the script was run. Any help would be greatly appreciated.
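As a stopgap I could probably take the telnet console out of the equation entirely (TELNETCONSOLE_ENABLED is a stock Scrapy setting), although I would still like to understand the root cause. A minimal sketch of that workaround, untested on my side:

settings = get_project_settings()
settings.set('TELNETCONSOLE_ENABLED', False)  # no telnet port per crawler
process = CrawlerProcess(settings)

Also, something like lsof -i :6073 or netstat -tlnp right before a run should show whether a leftover process from an earlier crawl is still holding ports in the default TELNETCONSOLE_PORT range [6023, 6073].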

EDIT

  • Scrapy 1.0.3, Twisted 15.4.0
  • The log starts with this (I don't think the log always used to begin with all the spiders being initialized at once; in fact, I would say that during past crawls they were initialized one after another):
  • ... (everything fine so far)
    2016-04-11 06:20:04,539 [scrapy.telnet] DEBUG: Telnet console listening on 127.0.0.1:6072
    2016-04-11 06:20:04,556 [scrapy.middleware] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState, AutoThrottle
    2016-04-11 06:20:04,559 [scrapy.middleware] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, ChunkedTransferMiddleware, DownloaderStats
    2016-04-11 06:20:04,560 [scrapy.middleware] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
    2016-04-11 06:20:04,561 [scrapy.middleware] INFO: Enabled item pipelines: ProcessItemFields, CsvExportPipeline, DBExportPipeline
    2016-04-11 06:20:04,562 [scrapy.core.engine] INFO: Spider opened
    2016-04-11 06:20:04,563 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
    2016-04-11 06:20:04,574 [scrapy.telnet] DEBUG: Telnet console listening on 127.0.0.1:6073
    2016-04-11 06:20:04,623 [scrapy.utils.signal] ERROR: Error caught on signal handler: >
    Traceback (most recent call last):
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 150, in maybeDeferred
        result = f(*args, **kw)
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/scrapy/xlib/pydispatch/robustapply.py", line 57, in robustApply
        return receiver(*arguments, **named)
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/scrapy/telnet.py", line 56, in start_listening
        self.port = listen_tcp(self.portrange, self.host, self)
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/scrapy/utils/reactor.py", line 14, in listen_tcp
        return reactor.listenTCP(x, factory, interface=host)
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/twisted/internet/posixbase.py", line 478, in listenTCP
        p.startListening()
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/twisted/internet/tcp.py", line 984, in startListening
        raise CannotListenError(self.interface, self.port, le)
    CannotListenError: Couldn't listen on 127.0.0.1:6073: [Errno 98] Address already in use.
    2016-04-11 06:20:04,642 [scrapy.middleware] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState, AutoThrottle
    2016-04-11 06:20:04,645 [scrapy.middleware] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, ChunkedTransferMiddleware, DownloaderStats
    2016-04-11 06:20:04,647 [scrapy.middleware] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
    2016-04-11 06:20:04,648 [scrapy.middleware] INFO: Enabled item pipelines: ProcessItemFields, CsvExportPipeline, DBExportPipeline
    2016-04-11 06:20:04,649 [scrapy.core.engine] INFO: Spider opened
    2016-04-11 06:20:04,650 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
    2016-04-11 06:20:04,665 [scrapy.utils.signal] ERROR: Error caught on signal handler: >
    Traceback (most recent call last):
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 150, in maybeDeferred
        result = f(*args, **kw)
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/scrapy/xlib/pydispatch/robustapply.py", line 57, in robustApply
        return receiver(*arguments, **named)
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/scrapy/telnet.py", line 56, in start_listening
        self.port = listen_tcp(self.portrange, self.host, self)
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/scrapy/utils/reactor.py", line 14, in listen_tcp
        return reactor.listenTCP(x, factory, interface=host)
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/twisted/internet/posixbase.py", line 478, in listenTCP
        p.startListening()
      File "/home/vagrant/.virtualenvs/big_brother/local/lib/python2.7/site-packages/twisted/internet/tcp.py", line 984, in startListening
        raise CannotListenError(self.interface, self.port, le)
    CannotListenError: Couldn't listen on 127.0.0.1:6073: [Errno 98] Address already in use.
    ... (more errors like the ones above)
    ... (the spiders then start crawling pages and returning items, i.e. the parsing itself works)
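If it helps, my reading of how the two errors are connected, judging by the traceback (a paraphrased sketch of scrapy/telnet.py, not the exact 1.0.3 source):

from scrapy.utils.reactor import listen_tcp

class TelnetConsole(object):
    # Paraphrased from the traceback above; the real class is a
    # Twisted ServerFactory living in scrapy/telnet.py.
    def start_listening(self):
        # listen_tcp tries each port in the configured range and raises
        # CannotListenError if all of them are taken, in which case
        # self.port is never assigned...
        self.port = listen_tcp(self.portrange, self.host, self)

    def stop_listening(self):
        # ...so shutdown then fails with "AttributeError: TelnetConsole
        # instance has no attribute 'port'".
        self.port.stopListening()

So the AttributeError looks like a follow-on symptom of the CannotListenError, and the real question is why the whole port range is suddenly occupied.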