Python: sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) SSL SYSCALL error: EOF detected on Ubuntu, but not on Mac OS


I am writing a simple web scraper in python3.7 using asyncio and multiprocessing. The overall architecture looks like this:

for i in range(self.number_processes - len(processes)):
    p = Process(target=AsyncProcessWrapper().run_main_loop)
    time.sleep(0.3)
    p.start()
    start_time = time.time()

where AsyncProcessWrapper is defined as:

class AsyncProcessWrapper:

    def __init__(self):
        resource_database = Database()
        # This is where the async logic takes place, following a producer-consumer pattern.
        # I will hide this logic for simplicity.
        self.main = Main(database=resource_database)

    def run(self):
        asyncio.run(self.main.run_main_loop())
The database connection is established only once and is never closed (this is a scraper, and downloading the data takes a while).

The init of the Database class looks like this:

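import os

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import NullPool


class Database:

    def __init__(self):
        db_url = os.getenv('DatabaseUrl')
        # NullPool: connections are not pooled by the engine.
        self.engine = create_engine(db_url, encoding='utf8', poolclass=NullPool)

        Session = sessionmaker()
        Session.configure(bind=self.engine)
        self.session = Session()

    def create_url_entity(self, urls):
        # Collect the ORM objects first, then write them in one bulk call.
        to_insert = []

        for url in urls:
            url_entity_obj = URLEntity(
                url=url,
                engine_version=self.engine_version
            )
            to_insert.append(url_entity_obj)

        self.session.bulk_save_objects(to_insert)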

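With processes=2 (or any higher number), I get no errors when I run the snippet above on my MacOS machine, which has 2 physical cores. However, when I run it with processes=2 on my Ubuntu 18.04 machine, which has 8 physical cores, I keep getting the following error: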

david@bob:~$ screen -r myproject

    cursor, statement, parameters, context
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
psycopg2.OperationalError: SSL SYSCALL error: EOF detected


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/david/myproject/myproject/engine/core.py", line 25, in run_main_loop
    asyncio.run(self.main.run_main_loop())
  File "/usr/lib/python3.7/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.7/asyncio/base_events.py", line 579, in run_until_complete
    return future.result()
  File "/home/david/myproject/myproject/core/main.py", line 150, in run_main_loop
    self.resource_database.create_markup_record(self.buffer_markup_records)
  File "/home/david/myproject/myproject/resources/db.py", line 281, in create_markup_record
    .join(RawMarkup, isouter=True) \
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3373, in all
    return list(self)
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
    distilled_params,
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
    e, statement, parameters, cursor, context
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/home/david/myproject/venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) SSL SYSCALL error: EOF detected

[SQL: SELECT url.id AS url_id, url.url AS url_url, raw_markup.id AS raw_markup_id 
FROM url LEFT OUTER JOIN raw_markup ON url.id = raw_markup.url_id 
WHERE url.url IN (%(url_1)s, %(url_2)s, %(url_3)s, %(url_4)s, %(url_5)s, %(url_6)s, %(url_7)s, %(url_8)s, %(url_9)s, %(url_10)s, %(url_11)s, %(url_12)s, %(url_13)s, %(url_14)s, %(url_15)s, %(url_16)s, %(url_17)s, %(url_18)s)]
[parameters: {'url_1': 'placeholder_string_1', 'url_2': 'placeholder_string_2', 'url_3': 'placeholder_string_3', ...'url_18': 'placeholder_string_18'}]
(Background on this error at: http://sqlalche.me/e/13/e3q8)
The database on my Ubuntu machine is much larger (the largest table there has 34'744'900 rows, while my local machine has about 1'000'000 rows).

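Additional information:

  • I prototype my application on a Mac (version 10.15) and then upload my code to run on the cluster. The Postgres versions on both machines are now equivalent (Postgres 10).
  • I upgraded the cluster's Postgres to version 12. This did not change anything.
  • I commit more frequently, but this did not change anything.
  • When I spawn more processes (i.e. processes=7), the error occurs more and more often.
  • I use sqlalchemy==1.3
  • Increasing the niceness with nice -n 17 python -m run does not change anything (it feels as if this makes the async workers run faster).
  • I read somewhere that this may be due to libpq behaving strangely on Ubuntu. Any idea whether that could cause it?

Do you know what is causing this error, and what I can try in order to fix it?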


Edit:

I just launched two single (no multiprocessing) async processes on MacOS from bash with the following command:

trap 'kill %1' SIGINT
python -m myproject.core.main | tee 1.log | sed -e 's/^/[Command1] /' & python -m myproject.core.main | tee 2.log | sed -e 's/^/[Command2] /'
It now also just blocks execution completely and returns a Postgres SSL error:

SSL error in data received
protocol: <asyncio.sslproto.SSLProtocol object at 0x115f0b828>
transport: <_SelectorSocketTransport fd=15 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/asyncio/sslproto.py", line 526, in data_received
    ssldata, appdata = self._sslpipe.feed_ssldata(data)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/asyncio/sslproto.py", line 207, in feed_ssldata
    self._sslobj.unwrap()
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/ssl.py", line 767, in unwrap
    return self._sslobj.shutdown()
ssl.SSLError: [SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:2609)
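
Edit 2: According to the SQLAlchemy documentation, when using a connection pool, and by extension an Engine created via create_engine(), it is critical that pooled connections are not shared with a forked process. TCP connections are represented as file descriptors, which usually work across process boundaries, so sharing them causes concurrent access to the file descriptor on behalf of two or more entirely independent Python interpreter states. So perhaps the pool is being shared across processes.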
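
The point about file descriptors crossing process boundaries can be seen without a database. The following is only a minimal sketch under stated assumptions: it uses a local socket pair in place of the Postgres connection, the function name write_from_two_processes is made up for illustration, and it relies on os.fork, so it is Unix only. Both processes write into the same connection and the other end receives a single mixed stream. With an SSL connection, each process would additionally keep its own encryption state, which is a plausible way to end up with errors such as "bad record mac" or an unexpected EOF.

```python
import os
import socket


def write_from_two_processes():
    # One connected socket pair stands in for a single client/server connection.
    server_end, client_end = socket.socketpair()

    pid = os.fork()  # Unix only
    if pid == 0:
        # Child: the inherited descriptor refers to the very same connection.
        client_end.sendall(b"child\n")
        os._exit(0)

    # Parent: wait for the child so the order of the writes is deterministic.
    os.waitpid(pid, 0)
    client_end.sendall(b"parent\n")
    client_end.close()

    # The "server" sees one stream that contains the bytes of both processes.
    received = b""
    while True:
        chunk = server_end.recv(1024)
        if not chunk:
            break
        received += chunk
    server_end.close()
    return received.decode().splitlines()


if __name__ == "__main__":
    print(write_from_two_processes())
```

In a real scraper the two writers are not serialized by a waitpid call, so their bytes can interleave at arbitrary points.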


Edit 3: I get the same error when I run the multiprocessing setup with python3.8.


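Edit 4: When I change my queries to many simple insert statements that are committed very frequently (instead of bulk inserts with infrequent commits), this error occurs much less often.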

Instead of

class AsyncProcessWrapper:

    def __init__(self):
        resource_database = Database()
        self.main = Main(database=resource_database)

    def run(self):
        asyncio.run(self.main.run_main_loop())
you should do:

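import asyncio
import random
import string


class AsyncProcessWrapper:

    def __init__(self):
        # Runs in the parent process: keep it free of database resources.
        self.name = 'PROC:' + ''.join(random.choice(string.ascii_uppercase) for _ in range(4))

    def run_main_loop(self):
        # Runs in the child process: create the engine and session here.
        self.resource_database = Database()
        # Discard any connections the engine may hold before first use.
        self.resource_database.engine.dispose()
        self.main = Main(name=self.name, database=self.resource_database)
        asyncio.run(self.main.run_main_loop())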
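Otherwise the connection pool is created in the parent process, which is what you want to avoid (__init__ is not called by the child process; it still runs in the parent's loop).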
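
To make the difference visible, here is a small sketch that does not touch the database at all. FakeDatabase, EagerWrapper, LazyWrapper and compare_creation_pids are hypothetical names introduced only for this illustration, and the "fork" start method is assumed (Unix only). Each wrapper reports the PID in which its resource was created and the PID in which run_main_loop actually runs.

```python
import multiprocessing as mp
import os


class FakeDatabase:
    """Hypothetical stand-in for Database: remembers which process built it."""

    def __init__(self):
        self.created_in_pid = os.getpid()


class EagerWrapper:
    """Builds the resource in __init__, as in the question."""

    def __init__(self):
        self.db = FakeDatabase()

    def run_main_loop(self, results):
        results.put(("eager", self.db.created_in_pid, os.getpid()))


class LazyWrapper:
    """Builds the resource inside the process target, as in the answer."""

    def __init__(self):
        self.db = None

    def run_main_loop(self, results):
        self.db = FakeDatabase()
        results.put(("lazy", self.db.created_in_pid, os.getpid()))


def compare_creation_pids():
    # Assumption: the "fork" start method, which is only available on Unix.
    ctx = mp.get_context("fork")
    results = ctx.Queue()
    workers = [
        ctx.Process(target=wrapper.run_main_loop, args=(results,))
        for wrapper in (EagerWrapper(), LazyWrapper())
    ]
    for worker in workers:
        worker.start()
    # Read the results before joining so the children can flush and exit.
    collected = {}
    for _ in workers:
        label, created_pid, running_pid = results.get(timeout=10)
        collected[label] = (created_pid, running_pid)
    for worker in workers:
        worker.join()
    return collected


if __name__ == "__main__":
    for label, (created_pid, running_pid) in compare_creation_pids().items():
        print(label, "created in", created_pid, "running in", running_pid)
```

The eager variant reports the parent's PID as the creator, because AsyncProcessWrapper() in Process(target=AsyncProcessWrapper().run_main_loop) is evaluated before the child exists. The lazy variant reports the child's own PID.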


Can you check whether there is anything meaningful in the postgresql log files?
@MichalT I am trying to set up logging... Also, things get somewhat better when I deactivate SSL (SSL=off) in the conf file, but now I get a sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) server closed the connection unexpectedly. This probably means the server terminated abnormally before or while processing the request. error. I will post the logs asap
@MichalT I have now added the postgres error logs :)
I think this question seems related, but I am not sure