Python: Connection refused when trying to restart a thread for a web scraper
Tags: python, multithreading, python-2.7, webkit

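I am using dryscrape to scrape JavaScript pages, and occasionally an error kills the process. I have tried catching the exceptions as described in the documentation to prevent this, but I have not found a solution: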

        try:
            sess.visit('url')
        except webkit_server.EndOfStreamError:
            continue
        except webkit_server.NoResponseError:
            continue
        except webkit_server.InvalidResponseError:
            continue
        except webkit_server.NoX11Error:
            continue

So I have this setup to restart the thread when it crashes:

        class checker():
            def check_if_thread_is_alive(self):
                a = thread_class()
                a.start()

However, whenever I try to restart the thread, I get this error:

Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "Scrapper.py", line 30, in run
    sess = dryscrape.Session(base_url = 'url')
  File "/usr/local/lib/python2.7/dist-packages/dryscrape/session.py", line 18, in __init__
    self.driver = driver or DefaultDriver()
  File "/usr/local/lib/python2.7/dist-packages/dryscrape/driver/webkit.py", line 30, in __init__
    super(Driver, self).__init__(**kw)
  File "/usr/local/lib/python2.7/dist-packages/webkit_server.py", line 225, in __init__
    self.conn = connection or ServerConnection()
  File "/usr/local/lib/python2.7/dist-packages/webkit_server.py", line 444, in __init__
    self._sock = (server or get_default_server()).connect()
  File "/usr/local/lib/python2.7/dist-packages/webkit_server.py", line 414, in connect
    sock.connect(("127.0.0.1", self._port))
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused

Is there a better way to solve this, or am I missing something?
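
For reference, the restart pattern described in the question can be sketched with the standard library alone. This is only an illustration under stated assumptions: `Worker`, `Checker`, `job` and the `starts` counter are hypothetical names, and `job` stands in for the real scraping loop. The one firm point is that a `threading.Thread` object can be started only once, so a restart has to create a new instance.

```python
import threading


class Worker(threading.Thread):
    """Hypothetical stand-in for the scraper thread."""

    def __init__(self, job):
        threading.Thread.__init__(self)
        self.daemon = True   # do not keep the interpreter alive on exit
        self.job = job       # callable that does the actual work

    def run(self):
        self.job()


class Checker(object):
    """Starts the worker and replaces it once it is no longer alive."""

    def __init__(self, job):
        self.job = job
        self.thread = None
        self.starts = 0      # number of workers started so far

    def check_if_thread_is_alive(self):
        # A Thread can be started only once, so build a fresh instance
        # whenever the previous one has finished or crashed.
        if self.thread is None or not self.thread.is_alive():
            self.thread = Worker(self.job)
            self.thread.start()
            self.starts += 1
        return self.thread
```

Calling `check_if_thread_is_alive()` periodically (for example from the main loop) leaves a healthy worker untouched and replaces a dead one.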

Because you are trying to connect to yourself.

You need to change the target URL. If you do want to connect to yourself, start the service first.

  File "/usr/local/lib/python2.7/dist-packages/webkit_server.py", line 414, in connect
    sock.connect(("127.0.0.1", self._port))
  File "/usr/lib/python2.7/socket.py", line 224, in meth <<<--- you're trying to connect to yourself.
    return getattr(self._sock,name)(*args)

If you want to skip the exception, you can always use a catch-all exception handler like this. It is generally considered very bad practice, but if the error only happens occasionally, it will keep your scraper running:

try:
    sess.visit(url)
except Exception as e:
    # Print the exception for debugging here
    continue
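
A slightly more controlled variant of the catch-all approach is to retry a limited number of times and log each failure instead of silently continuing. The sketch below is an assumption-laden illustration, not part of dryscrape: `visit_with_retry`, `attempts` and `delay` are hypothetical names, and `visit` is any callable that takes a URL (for example `sess.visit`).

```python
import logging
import time

log = logging.getLogger("scraper")


def visit_with_retry(visit, url, attempts=3, delay=0.0):
    """Call visit(url), retrying on any exception.

    Returns True on the first successful call, False if every
    attempt raised an exception.
    """
    for attempt in range(1, attempts + 1):
        try:
            visit(url)
            return True
        except Exception as exc:  # deliberately broad, as discussed above
            log.warning("visit %s failed (attempt %d/%d): %r",
                        url, attempt, attempts, exc)
            time.sleep(delay)     # wait before the next attempt
    return False
```

With this in place, the scraping loop could call `visit_with_retry(sess.visit, url)` and skip the page when it returns `False`.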
Are you starting a local server for testing? From the traceback:

  File "/usr/local/lib/python2.7/dist-packages/webkit_server.py", line 414, in connect
    sock.connect(("127.0.0.1", self._port))
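You are actually connecting to localhost. If you start your own server, check the server logs to find out why it stopped responding to connection requests.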
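
Before digging through logs, it can help to confirm whether anything is listening on the local port at all. The following is a minimal sketch using only the standard `socket` module; `port_is_open` is a hypothetical helper name, and the port to probe has to be taken from your own setup.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return True
    except (socket.error, socket.timeout):
        # Covers "Connection refused" (Errno 111) as well as timeouts.
        return False
    finally:
        sock.close()
```

A `False` result for `127.0.0.1` means nothing is accepting connections on that port, which matches the `[Errno 111] Connection refused` in the traceback.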


I just found another error in the script:

sess.visit('url')
# it should be something like:
url = "http://www.google.com/"
sess.visit(url)
I would bet that 'url' was a real URL string that was changed for privacy reasons. It is clearly a poor substitute that is bound to cause confusion.