Python: why does my consumer work separately from my producers in the queue?

My goal is to call an API asynchronously and write the result of each call to a file (1 call -> 1 file). I thought one way to achieve this would be to use a queue. My intention is for the producers to push each response into the queue as soon as it is ready, and for a consumer to process (write) it to a file as soon as it is available.

Confusion: when I run my code and look at the print statements, I see that the producers finish first, and only then does the consumer start consuming my output. This seems to go against my intention of having the consumer start working as soon as a task is available. I also considered using multiple processes (1 for the consumer, 1 for the producers), but I am not sure whether that would overcomplicate things.

I have created an illustration of the current state:
import aiohttp
import asyncio

async def get_data(session, day):
    async with session.post(url=SOME_URL, json=SOME_FORMAT, headers=HEADERS) as response:
        return await response.text()

async def producer(q, day):
    async with aiohttp.ClientSession() as session:
        result = await get_data(session, day)
        await q.put(result)

async def consumer(q):
    while True:
        outcome = await q.get()
        print("Consumed:", outcome) # assuming I write files here
        q.task_done()

async def main():
    queue = asyncio.Queue()
    days = [day for day in range(20)] # Here I normally use calendar dates instead of range
    producers = [asyncio.create_task(producer(queue, day) for day in days]
    consumer = asyncio.create_task(consumer(queue)
    await asyncio.gather(*producers)
    await queue.join()
    consumer.cancel()

if __name__ == '__main__':
    asyncio.run(main())
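Is my approach correct?

Your code is generally fine (apart from a few syntax errors, which I assume are copy-paste mistakes). It is true that all the producers are created before the consumer starts working, because nothing makes them wait at that point. But once the producers have real work to do, you will see that they only finish after the consumer has started, and the pipeline behaves as intended.

Here is an edited version of your code, plus output showing that it does work: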
import aiohttp
import asyncio

async def get_data(session, day):
    print(f"get data, day {day}")
    async with session.get(url="https://www.google.com") as response:
        res = await response.text()
    print(f"got data, day {day}")
    return res[:100]

async def producer(q, day):
    async with aiohttp.ClientSession() as session:
        result = await get_data(session, day)
        await q.put(result)

async def consumer(q):
    print("Consumer stated")
    while True:
        outcome = await q.get()
        print("Consumed:", outcome) # assuming I write files here
        # Note: this call is missing `await`, so it never actually sleeps;
        # it is what triggers the RuntimeWarning in the output.
        asyncio.sleep(1)
        q.task_done()

async def main():
    queue = asyncio.Queue()
    days = [day for day in range(20)] # Here I normally use calendar dates instead of range
    producers = [asyncio.create_task(producer(queue, day)) for day in days]
    print("main: producer tasks created")
    consumer_task = asyncio.create_task(consumer(queue))
    print("main: consumer task created")
    await asyncio.gather(*producers)
    print("main: gathered producers")
    await queue.join()
    consumer_task.cancel()

if __name__ == '__main__':
    asyncio.run(main())
Output:
main: producer tasks created
main: consumer task created
get data, day 0
get data, day 1
get data, day 2
get data, day 3
...
get data, day 19
Consumer stated
got data, day 1
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
queue_so.py:21: RuntimeWarning: coroutine 'sleep' was never awaited
asyncio.sleep(1)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
got data, day 10
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
got data, day 19
got data, day 11
got data, day 14
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
got data, day 15
got data, day 17
got data, day 6
got data, day 18
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
got data, day 7
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
got data, day 8
got data, day 9
got data, day 2
got data, day 12
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
got data, day 0
got data, day 5
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
got data, day 4
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
got data, day 3
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
got data, day 13
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
got data, day 16
Consumed: <!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content
main: gathered producers
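
If you want to check the interleaving without hitting a real server, here is a minimal sketch that replaces the HTTP call with a simulated delay and writes one file per result. The names fake_api_call, the delay values, and the day_N.txt file names are all made up for illustration, and asyncio.to_thread assumes Python 3.9 or newer. The event list it returns records the order in which results were produced and consumed.

```python
import asyncio
import tempfile
from pathlib import Path


async def fake_api_call(day):
    # Hypothetical stand-in for the real HTTP request: each "day"
    # takes a different amount of time to come back.
    await asyncio.sleep(0.05 * (day + 1))
    return f"payload for day {day}"


async def producer(q, day, events):
    result = await fake_api_call(day)
    events.append(("produced", day))
    await q.put((day, result))


async def consumer(q, out_dir, events):
    while True:
        day, payload = await q.get()
        path = out_dir / f"day_{day}.txt"
        # Path.write_text blocks, so run it in a worker thread to keep
        # the event loop free for the producers that are still waiting.
        await asyncio.to_thread(path.write_text, payload)
        events.append(("consumed", day))
        q.task_done()


async def main(out_dir, n_days=5):
    events = []
    q = asyncio.Queue()
    producers = [
        asyncio.create_task(producer(q, day, events)) for day in range(n_days)
    ]
    consumer_task = asyncio.create_task(consumer(q, out_dir, events))
    await asyncio.gather(*producers)
    await q.join()  # wait until every queued item has been written
    consumer_task.cancel()
    # Let the cancellation finish before the loop shuts down.
    await asyncio.gather(consumer_task, return_exceptions=True)
    return events


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for event in asyncio.run(main(Path(tmp))):
            print(event)
```

Because the simulated calls return at different times, the first results should be consumed while later producers are still waiting, which is the same pattern as in the output above. Passing the day along with the payload is just one way to give each file a distinct name.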