Python: Scrapy on Colab times out ("Connection timed out.") when scraping the ganjoor site


I tried to scrape all of the site's links with the Scrapy Python module from Colab, using the command
!scrapy shell https://ganjoor.net
and it fails with a TCP connection timeout (full log below).

[Open In Colab badge]

I have asked about this before as well: I cannot reach the ganjoor site from Colab, as you can see in the Colab notebook above.

Scraping other websites through Colab works fine, though, and when I run the same scraping command on my own computer, ganjoor.net works too.
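
Error 110 is an OS-level TCP connect timeout, meaning the request fails before any HTTP exchange happens, so the problem sits below Scrapy. One way to confirm whether the host is reachable from the Colab VM at all, independently of Scrapy, is a plain socket probe (a minimal sketch; `can_connect` is a hypothetical helper name, not part of any library):

```python
import socket

def can_connect(host: str, port: int = 443, timeout: float = 10.0) -> bool:
    """Return True if a raw TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections, and timeouts
        return False

if __name__ == "__main__":
    # Run this in the Colab notebook and on the local machine and compare.
    print("ganjoor.net reachable:", can_connect("ganjoor.net"))
```

If this returns False on Colab but True on your own machine, the server (or a firewall in front of it) is most likely dropping connections from Google Cloud IP ranges, which no Scrapy setting can fix on its own.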

I would appreciate any help in solving this.

Thanks.

2020-09-21 01:36:32 [scrapy.utils.log] INFO: Scrapy 2.3.0 started (bot: scrapybot)
2020-09-21 01:36:32 [scrapy.utils.log] INFO: Versions: lxml 4.2.6.0, libxml2 2.9.8, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 20.3.0, Python 3.6.9 (default, Jul 17 2020, 12:50:27) - [GCC 8.4.0], pyOpenSSL 19.1.0 (OpenSSL 1.1.1g  21 Apr 2020), cryptography 3.1, Platform Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
2020-09-21 01:36:32 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2020-09-21 01:36:32 [scrapy.crawler] INFO: Overridden settings:
{'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter',
 'LOGSTATS_INTERVAL': 0}
2020-09-21 01:36:32 [scrapy.extensions.telnet] INFO: Telnet Password: 3fba9164de72a60c
2020-09-21 01:36:32 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage']
2020-09-21 01:36:32 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2020-09-21 01:36:32 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-09-21 01:36:32 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2020-09-21 01:36:32 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2020-09-21 01:36:32 [scrapy.core.engine] INFO: Spider opened
2020-09-21 01:37:04 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://ganjoor.net> (failed 1 times): TCP connection timed out: 110: Connection timed out.
2020-09-21 01:37:36 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://ganjoor.net> (failed 2 times): TCP connection timed out: 110: Connection timed out.
2020-09-21 01:38:08 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://ganjoor.net> (failed 3 times): TCP connection timed out: 110: Connection timed out.
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 8, in <module>
    sys.exit(execute())
  File "/usr/local/lib/python3.6/dist-packages/scrapy/cmdline.py", line 145, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/local/lib/python3.6/dist-packages/scrapy/cmdline.py", line 100, in _run_print_help
    func(*a, **kw)
  File "/usr/local/lib/python3.6/dist-packages/scrapy/cmdline.py", line 153, in _run_command
    cmd.run(args, opts)
  File "/usr/local/lib/python3.6/dist-packages/scrapy/commands/shell.py", line 74, in run
    shell.start(url=url, redirect=not opts.no_redirect)
  File "/usr/local/lib/python3.6/dist-packages/scrapy/shell.py", line 43, in start
    self.fetch(url, spider, redirect=redirect)
  File "/usr/local/lib/python3.6/dist-packages/scrapy/shell.py", line 111, in fetch
    reactor, self._schedule, request, spider)
  File "/usr/local/lib/python3.6/dist-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "/usr/local/lib/python3.6/dist-packages/twisted/python/failure.py", line 488, in raiseException
    raise self.value.with_traceback(self.tb)
twisted.internet.error.TCPTimedOutError: TCP connection timed out: 110: Connection timed out.
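
For reference, the three attempts in the log match Scrapy's defaults: `RETRY_TIMES` is 2, so a request is tried 1 + 2 = 3 times before `RetryMiddleware` gives up. The documented settings that control this behaviour are sketched below (the values are illustrative, not a fix for an IP-level block):

```python
# Documented Scrapy settings relevant to the log above.
custom_settings = {
    # Scrapy-level timeout per request. Note: the "110" error in the log is
    # the OS's own TCP connect timeout, which fires independently of this.
    "DOWNLOAD_TIMEOUT": 180,
    # Retries after the first failure; the default of 2 explains the
    # "failed 3 times" line before Scrapy gave up.
    "RETRY_TIMES": 2,
}
```

Raising these only makes Scrapy wait longer between failures; if the Colab VM's packets are being dropped by the server, the connect attempt will still never complete.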