Downloading specific .pdb files from the Protein Data Bank using Python
I have been trying to download .pdb files from the Protein Data Bank. I wrote the code block below to fetch these files, but the files being downloaded turn out to contain web pages instead of structure data.
#Sector C - Processing block:
RefinedPDBCodeList = [] #C1
with open('RefinedPDBCodeList') as inputfile:
    for line in inputfile:
        RefinedPDBCodeList.append(line.strip().split(','))
print(RefinedPDBCodeList[0])
# prints: ['101m.pdb']
import urllib.request
for i in range(0, 1): #S2 - range(0, len(RefinedPDBCodeList)):
    path=urllib.request.urlretrieve('http://www.rcsb.org/pdb/explore/explore.do?structureId=101m', '101m.pdb')
It looks like your base URL is wrong. Try:
urllib.request.urlretrieve('http://files.rcsb.org/download/101M.pdb', '101m.pdb')
Although the old URL redirects to the new one, the URL has since been updated to use HTTPS. It is currently:
urllib.request.urlretrieve('https://files.rcsb.org/download/101M.pdb', '101m.pdb')
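To extend the single download to every entry in the question's list, one option is to derive each URL from the file name. This is only a minimal sketch: it assumes entries look like '101m.pdb', as printed in the question, and the helper names `pdb_url` and `download_all` are hypothetical, not part of any library.

```python
import os
import urllib.request

BASE_URL = "https://files.rcsb.org/download/"  # base URL taken from the answer above


def pdb_url(filename, base_url=BASE_URL):
    """Build the download URL for a list entry such as '101m.pdb' (hypothetical helper)."""
    code = os.path.splitext(filename.strip())[0]  # '101m.pdb' -> '101m'
    # Upper-case the ID to match the '101M.pdb' form used in the answer.
    return base_url + code.upper() + ".pdb"


def download_all(filenames, target_dir="."):
    """Download every listed file into target_dir; return the names that failed."""
    failed = []
    for name in filenames:
        name = name.strip()
        try:
            urllib.request.urlretrieve(pdb_url(name), os.path.join(target_dir, name))
        except OSError as err:  # urllib's URLError and HTTPError derive from OSError
            print(f"{name}: {err}")
            failed.append(name)
    return failed
```

Because the question's loader stores each line as a one-element list, a call might look like `download_all(row[0] for row in RefinedPDBCodeList)`.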
For the full list of URLs for the different downloads that RCSB PDB offers, see the RCSB PDB download documentation.
BioPython provides a retrieval method. However, it relies on the PDB FTP service. If the FTP port is not open, for example because of a firewall, you can use this function instead:
import os
import sys
import urllib.request

def download_pdb(pdbcode, datadir, downloadurl="https://files.rcsb.org/download/"):
    """
    Downloads a PDB file from the Internet and saves it in a data directory.
    :param pdbcode: The standard PDB ID e.g. '3ICB' or '3icb'
    :param datadir: The directory where the downloaded file will be saved
    :param downloadurl: The base PDB download URL, cf.
        `https://www.rcsb.org/pages/download/http#structures` for details
    :return: the full path to the downloaded PDB file or None if something went wrong
    """
    pdbfn = pdbcode + ".pdb"
    url = downloadurl + pdbfn
    outfnm = os.path.join(datadir, pdbfn)
    try:
        urllib.request.urlretrieve(url, outfnm)
        return outfnm
    except Exception as err:
        # Report the failure on stderr and signal it to the caller with None
        print(str(err), file=sys.stderr)
        return None
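The symptom described in the question, a saved file that actually contains a web page, can be caught right after a download. The following is a rough heuristic sketch and not part of the answers above: the helper name `looks_like_pdb` is hypothetical, and the tuple of record names is a small illustrative subset of the PDB format's record types rather than a complete list.

```python
# A few record names that commonly open a line in a PDB file (illustrative subset).
PDB_RECORD_NAMES = ("HEADER", "TITLE", "REMARK", "CRYST1", "MODEL", "ATOM", "HETATM")


def looks_like_pdb(path):
    """Heuristic: True if the first non-blank line starts with a PDB record name."""
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.strip():
                # An HTML page would start with '<' here instead of a record name.
                return line.startswith(PDB_RECORD_NAMES)
    return False  # empty file: nothing that resembles a structure
```

A caller could run this on the path returned by the download step and treat a `False` result as a failed download.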