Python 异步下载文件
这会从数据库下载更新的fasta文件(蛋白质序列),我使用Python 异步下载文件,python,aiohttp,python-asyncio,Python,Aiohttp,Python Asyncio,这会从数据库下载更新的fasta文件(蛋白质序列),我使用asyncio比请求工作得更快,但是我不相信下载是异步进行的 import os import aiohttp import aiofiles import asyncio folder = '~/base/fastas/proteomes/' upos = {'UP000005640': 'Human_Homo_sapien', 'UP000002254': 'Dog_Boxer_Canis_Lupus_famili
asyncio
比请求
工作得更快,但是我不相信下载是异步进行的
import os
import aiohttp
import aiofiles
import asyncio
folder = '~/base/fastas/proteomes/'
upos = {'UP000005640': 'Human_Homo_sapien',
'UP000002254': 'Dog_Boxer_Canis_Lupus_familiaris',
'UP000002311': 'Yeast_Saccharomyces_cerevisiae',
'UP000000589': 'Mouse_Mus_musculus',
'UP000006718': 'Monkey_Rhesus_macaque_Macaca_mulatta',
'UP000009130': 'Monkey_Cynomolgus_Macaca_fascicularis',
'UP000002494': 'Rat_Rattus_norvegicus',
'UP000000625': 'Escherichia_coli',
}
#https://www.uniprot.org/uniprot/?query=proteome:UP000005640&format=fasta Example link
startline = r'https://www.uniprot.org/uniprot/?query=proteome:'
endline = r'&format=fasta&include=False' #include is true to include isoforms, make false for only canonical sequences
async def fetch(session, link, folderlocation, name):
async with session.get(link, timeout=0) as response:
try:
file = await aiofiles.open(folderlocation, mode='w')
file.write(await response.text())
await file.close()
print(name, 'ended')
except FileNotFoundError:
loc = ''.join((r'/'.join((folderlocation.split('/')[:-1])), '/'))
command = ' '.join(('mkdir -p', loc))
os.system(command)
file = await aiofiles.open(folderlocation, mode='w')
file.write(await response.text())
await file.close()
print(name, 'ended')
async def rfunc():
async with aiohttp.ClientSession() as session:
for upo, name in upos.items():
print(name, 'started')
link = ''.join((startline, upo, endline))
folderlocation =''.join((folder, name, '.fasta'))
await fetch(session, link, folderlocation, name)
loop = asyncio.get_event_loop()
loop.run_until_complete(rfunc())
我运行此命令的输出:
In [5]: runfile('~/base/Fasta Proteome Updater.py')
Human_Homo_sapien started
Human_Homo_sapien ended
Dog_Boxer_Canis_Lupus_familiaris started
Dog_Boxer_Canis_Lupus_familiaris ended
Yeast_Saccharomyces_cerevisiae started
Yeast_Saccharomyces_cerevisiae ended
Mouse_Mus_musculus started
Mouse_Mus_musculus ended
Monkey_Rhesus_macaque_Macaca_mulatta started
Monkey_Rhesus_macaque_Macaca_mulatta ended
Monkey_Cynomolgus_Macaca_fascicularis started
Monkey_Cynomolgus_Macaca_fascicularis ended
Rat_Rattus_norvegicus started
Rat_Rattus_norvegicus ended
Escherichia_coli started
Escherichia_coli ended
打印输出似乎表示下载一次进行一次,这里有什么问题吗?您正在循环要下载的项目,并等待(
wait
)每个项目完成。要使所有下载同时发生,您需要安排所有下载一次执行,例如使用
那么您的代码可能如下所示:
async def rfunc():
async with aiohttp.ClientSession() as session:
await gather(
*[
fetch(
session,
''.join((startline, upo, endline)),
''.join((folder, name, '.fasta')),
name,
) for upo, name in upos.items()
]
)
loop = asyncio.get_event_loop()
loop.run_until_complete(rfunc())
我不知道答案,但我可以建议您在异常处理中添加一个finally子句,并将'file=await aiofiles.open(folderlocation,mode='w')file.write(await response.text())wait file.close()print(name,'end')放在里面吗