Python: maximizing the number of parallel requests (aiohttp)


tl;dr: how do I maximize the number of HTTP requests I can send in parallel?

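I am using the aiohttp library to fetch data from multiple URLs. While testing its performance I noticed a bottleneck somewhere in the process: past a certain point, running more URLs at once simply does not help.

I am using the following code: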

import asyncio
import aiohttp

async def fetch(url, session):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0'}
    try:
        async with session.get(
            url, headers=headers, 
            ssl = False, 
            timeout = aiohttp.ClientTimeout(
                total=None, 
                sock_connect = 10, 
                sock_read = 10
            )
        ) as response:
            content = await response.read()
            return (url, 'OK', content)
    except Exception as e:
        print(e)
        return (url, 'ERROR', str(e))

async def run(url_list):
    tasks = []
    async with aiohttp.ClientSession() as session:
        for url in url_list:
            task = asyncio.ensure_future(fetch(url, session))
            tasks.append(task)
        responses = asyncio.gather(*tasks)
        await responses
    return responses

loop = asyncio.get_event_loop()
asyncio.set_event_loop(loop)
task = asyncio.ensure_future(run(url_list))
loop.run_until_complete(task)
result = task.result().result()

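Running this against a test endpoint with url_list of varying length, I see that adding more URLs to a single run helps only up to about 100 URLs. After that, the total time grows in proportion to the number of URLs (in other words, the time per URL stops decreasing). This suggests that something fails when trying to process them all at once. In addition, with more URLs in "one batch" I occasionally get connection timeout errors.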

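Why is this happening? What exactly limits the speed here?
How can I check the maximum number of parallel requests that can be sent on a given computer? (I mean an exact number, not an approximation found by trial and error as above.)
What can I do to increase the number of requests processed at once?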

I am running this on Windows.

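EDIT in response to a comment:

This is the same data with the limit set to None. There is only a slight improvement in the end, and many connection timeout errors occur when 400 URLs are sent at once. I ended up using limit=200 on my actual data.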

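By default, aiohttp limits the number of simultaneous connections to 100. It does this by setting a default limit on the TCPConnector object used by ClientSession. You can bypass it by creating a custom connector and passing it to the session: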

connector = aiohttp.TCPConnector(limit=None)
async with aiohttp.ClientSession(connector=connector) as session:
    # ...

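Note, however, that you probably don't want to set this number too high: your network capacity, CPU, RAM and the target server all have their own limits, and trying to open a huge number of connections can lead to more failures.

The optimal number can only be found by experimenting on a concrete machine.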
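One way to run such an experiment is to time batches of different sizes and compare the time per request, which is the measurement described in the question. The sketch below is only an illustration: fake_request is a hypothetical stand-in that sleeps instead of making an HTTP call, and the batch sizes are arbitrary. To measure a real setup, replace it with your own fetch coroutine.

```python
import asyncio
import time


async def fake_request(delay: float = 0.01) -> str:
    """Hypothetical stand-in for a real HTTP call; it only sleeps."""
    await asyncio.sleep(delay)
    return "OK"


async def timed_batch(batch_size: int) -> float:
    """Run batch_size requests at once and return seconds per request."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_request() for _ in range(batch_size)))
    return (time.perf_counter() - start) / batch_size


async def sweep(batch_sizes):
    """Time each batch size in turn and collect the per-request cost."""
    return {n: await timed_batch(n) for n in batch_sizes}


per_request = asyncio.run(sweep([1, 10, 100]))
for size, seconds in per_request.items():
    print(f"batch of {size}: {seconds:.5f} s per request")
```

With a real fetch, the batch size at which the per-request time stops dropping (or errors start to appear) is a reasonable candidate for the limit on that machine and target server.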

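Unrelated:

You don't have to create tasks without a reason. Most asyncio APIs accept regular coroutines. For example, the last lines of your code can be changed like this: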

loop = asyncio.get_event_loop()
loop.run_until_complete(run(url_list))

Or, if you are using Python 3.7, even just:

asyncio.run(run(url_list))
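The same idea applies inside run(): awaiting asyncio.gather directly returns a plain list, so no nested .result() calls are needed. A minimal sketch, assuming a simplified fetch that is only a placeholder (it does no network I/O) so the example runs on its own:

```python
import asyncio


async def fetch(url):
    """Placeholder for the real fetch; yields to the loop and returns a tuple."""
    await asyncio.sleep(0)
    return (url, "OK", b"")


async def run(url_list):
    # gather accepts coroutines directly and wraps them in tasks itself;
    # awaiting it returns the results as a list, in input order.
    return await asyncio.gather(*(fetch(url) for url in url_list))


result = asyncio.run(run(["a", "b"]))
print(result)
```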

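It would be very interesting to see the updated graph with the artificial limit removed. Could you edit the question to include it?

@user4815162342 updated

@pieca I'm not sure when aiohttp starts the timeout timer, so instead of limiting connections you may want to leave the limit at None and use a semaphore to limit the number of simultaneous requests. Here is how to do it. It could improve performance and reduce errors.

@MikhailGerasimov thanks for the link, I'll try running it like that.

Thanks! Good to know this.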
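The semaphore approach suggested in the comments could look roughly like the sketch below. This is an assumption about the pattern, not the code behind the commenter's link: fake_fetch is a hypothetical stand-in that sleeps instead of calling session.get, max_concurrent=5 is an arbitrary value, and the stats dictionary exists only to show that the cap holds.

```python
import asyncio


async def fake_fetch(url, stats):
    """Hypothetical stand-in for the real fetch; tracks how many run at once."""
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)  # simulated network wait
    stats["active"] -= 1
    return (url, "OK")


async def bounded_fetch(url, semaphore, stats):
    # At most max_concurrent coroutines get past this point at a time;
    # the rest wait here without having started a request.
    async with semaphore:
        return await fake_fetch(url, stats)


async def run(url_list, max_concurrent=5):
    semaphore = asyncio.Semaphore(max_concurrent)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(bounded_fetch(url, semaphore, stats) for url in url_list)
    )
    return results, stats["peak"]


urls = [f"https://example.invalid/{i}" for i in range(20)]
results, peak = asyncio.run(run(urls))
print(len(results), "results, peak concurrency:", peak)
```

Because the semaphore is acquired before the request begins, any per-request timeout in the real fetch would start counting only once a slot is free, which is the motivation given in the comment.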