Python: how to use aiopg together with aiohttp
I have an application that loops over batches of URLs from a Postgres table: it downloads each URL, runs a processing function on every download, and saves the processed result back to the table. I wrote it with aiopg and aiohttp so that it runs asynchronously. In simplified form it looks like this:
import asyncio
import aiopg
from aiohttp import ClientSession, TCPConnector

BATCH_SIZE = 100

dsn = "dbname=events user={} password={} host={}".format(DB_USER, DB_PASSWORD, DB_HOST)

async def run():
    async with ClientSession(connector=TCPConnector(ssl=False, limit=100)) as session:
        async with aiopg.create_pool(dsn) as pool:
            while True:
                count = await run_batch(session, pool)
                if count == 0:
                    break

async def run_batch(session, db_pool):
    tasks = []
    async for url in get_batch(db_pool):
        task = asyncio.ensure_future(process_url(url, session, db_pool))
        tasks.append(task)
    await asyncio.gather(*tasks)

async def get_batch(db_pool):
    sql = "SELECT id, url FROM db.urls ... LIMIT %s"
    async with db_pool.acquire() as conn:
        async with conn.cursor() as cur:
            await cur.execute(sql, (BATCH_SIZE,))
            async for row in cur:
                yield row

async def process_url(url, session, db_pool):
    async with session.get(url, timeout=15) as response:
        body = await response.read()
        data = await process_body(body)
        await save_data(db_pool, data)

async def process_body(body):
    ...
    return data

async def save_data(db_pool, data):
    sql = "UPDATE db.urls ..."
    async with db_pool.acquire() as conn:
        async with conn.cursor() as cur:
            await cur.execute(sql, (data,))
But something is off. The longer the script runs, the slower it gets, and more and more exceptions are thrown when calling session.get. My guess is that there is something wrong with the way I am using the Postgres connections, but I can't figure out what. Any help would be much appreciated.

Is it because you call save_data instead of save_result?

@dirn No, that is just a typo in the simplified code. I've corrected it, thank you for pointing it out!
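One thing worth noting about the shape of this code (an observation of mine, not something confirmed in the thread): get_batch is an async generator, so the connection it acquires from the pool stays checked out until the generator is fully exhausted, i.e. for as long as the batch's tasks are being created. A minimal sketch of a variant that fetches the whole batch eagerly and returns the connection to the pool before any downloads start; the process_url parameter and the return value of run_batch are my additions for illustration, not part of the question's code:

```python
import asyncio

BATCH_SIZE = 100

async def get_batch(db_pool):
    # Same elided query as in the question.
    sql = "SELECT id, url FROM db.urls ... LIMIT %s"
    async with db_pool.acquire() as conn:
        async with conn.cursor() as cur:
            await cur.execute(sql, (BATCH_SIZE,))
            rows = await cur.fetchall()  # materialize the batch eagerly
    # At this point the connection is already back in the pool,
    # so slow HTTP downloads never pin a Postgres connection.
    return rows

async def run_batch(session, db_pool, process_url):
    rows = await get_batch(db_pool)
    tasks = [asyncio.ensure_future(process_url(url, session, db_pool))
             for _id, url in rows]
    # return_exceptions=True keeps one failed download from
    # cancelling the rest of the batch.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return len(rows), results
```

This also gives run() a real row count to terminate on, and collecting exceptions from gather makes it easier to see whether the growing session.get failures are timeouts or something else.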