Python: finding all neighbours for a set of residue numbers (Python, Biopython)

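I have a list of residue numbers saved in centerResidueList = [100, 140, 170, 53], and I am trying to get all the neighbouring residues for this set of residues.
Currently I am using the script below: I process the whole PDB file and generate a list of all pairs within a distance of 10.0 Å, then iterate over that list and check whether the residue number in the all_neighbors list corresponds to a residue number in centerResidueList.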

from Bio.PDB import *

centerResidueList = [100, 140, 170, 53]
neighbours_resi_number = []
structure = PDBParser().get_structure('X', "1xxx.pdb") 
atom_list = Selection.unfold_entities(structure, 'A') 
ns = NeighborSearch(atom_list)
all_neighbors = ns.search_all(10.0, "R") 
for residuepair in all_neighbors:
    resi_number = residuepair[0].id[1]
    if resi_number in centerResidueList:
        resi_number_partner = residuepair[1].id[1]
        neighbours_resi_number.append(resi_number_partner)

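First, how do I create atom_list using only CA atoms?

Second, is residuepair[0].id[1] the right way to get the residue number? (It works, but is there a dedicated method for obtaining it?)

Finally, is there a better solution for achieving this?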

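Using NeighborSearch is definitely the right idea: it builds a k-d tree, which can perform very fast nearest-neighbour lookups.
If you only need to search around a few residues, I would call the search() method on the atoms of those residues (perhaps only their CA atoms, for speed). That is more efficient than calling search_all() and then filtering. I'll answer your two questions first, then give a complete solution at the bottom.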
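
If you do stay with search_all() and filter afterwards, one detail is worth checking. Assuming each residue pair is reported only once and in no guaranteed order, a center residue can sit in either position of the pair, so testing only the first element may miss neighbours. The sketch below uses plain tuples of residue numbers as stand-ins for the residue pairs; the data and the helper name partners_of are made up for illustration.

```python
def partners_of(center_numbers, pairs):
    """Return residue numbers paired with any center residue, checking both sides."""
    centers = set(center_numbers)
    partners = set()
    for first, second in pairs:
        if first in centers:
            partners.add(second)
        if second in centers:
            partners.add(first)
    return partners


# Hypothetical pairs, standing in for (residue, residue) tuples from search_all()
pairs = [(52, 53), (53, 54), (99, 100), (100, 170), (7, 8)]
print(sorted(partners_of([100, 140, 170, 53], pairs)))
```

Checking only the first element of each pair would have found 54 and 170 here, and missed 52, 99 and 100.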

How do I create atom_list using only CA atoms?
You can use the built-in filter(), or a list comprehension (I find the list comprehension more readable).
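
The two approaches, a filter() call and a list comprehension, might look like the sketch below. To keep it runnable without a PDB file, FakeAtom is a hypothetical stand-in that only mimics the name attribute of a Biopython atom; with a real structure, the same two expressions would be applied to structure.get_atoms().

```python
from collections import namedtuple

# Hypothetical stand-in for Bio.PDB atoms: only the `name` attribute is mimicked
FakeAtom = namedtuple("FakeAtom", ["name", "residue_number"])

atoms = [
    FakeAtom("N", 53), FakeAtom("CA", 53), FakeAtom("C", 53),
    FakeAtom("N", 54), FakeAtom("CA", 54), FakeAtom("O", 54),
]

# Option 1: built-in filter() with a predicate
ca_filtered = list(filter(lambda atom: atom.name == "CA", atoms))

# Option 2: list comprehension
ca_comprehension = [atom for atom in atoms if atom.name == "CA"]

print([atom.residue_number for atom in ca_comprehension])
```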

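Second, is residuepair[0].id[1] the right way to get the residue number? (It works, but is there a dedicated method for obtaining it?)
That is definitely the right way to do it. However (and this is an important caveat), note that it will not handle residues that have insertion codes. Why not work with the Residue objects themselves instead?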

My code:
from Bio.PDB import NeighborSearch, PDBParser, Selection


structure = PDBParser().get_structure('X', "1xxx.pdb")

chain = structure[0]['A']  # Supply chain name for "center residues"
center_residues = [chain[resi] for resi in [100, 140, 170, 53]]
center_atoms = Selection.unfold_entities(center_residues, 'A')

atom_list = [atom for atom in structure.get_atoms() if atom.name == 'CA']
ns = NeighborSearch(atom_list)

# Set comprehension (Python 2.7+, use `set()` on a generator/list for < 2.7)
nearby_residues = {res for center_atom in center_atoms
                   for res in ns.search(center_atom.coord, 10, 'R')}

# Print just the residue number (WARNING: does not account for icodes)
print(sorted(res.id[1] for res in nearby_residues))
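
To see why printing only res.id[1] can hide residues, recall that a Biopython residue id is a tuple of (hetero flag, sequence number, insertion code). The sketch below uses plain tuples in that layout, with made-up values, to show two residues that share a sequence number but differ in insertion code, and one possible way to build a label that keeps the code. format_residue_label is a hypothetical helper, not part of Biopython.

```python
def format_residue_label(residue_id):
    """Build a label such as '100A' from a (hetero flag, number, icode) tuple."""
    _hetero_flag, number, icode = residue_id
    return f"{number}{icode.strip()}"


# Made-up ids in Biopython's (hetero flag, sequence number, insertion code) layout
residue_ids = [(" ", 100, " "), (" ", 100, "A"), (" ", 101, " ")]

# Keeping only the sequence number merges residues 100 and 100A
numbers_only = sorted({rid[1] for rid in residue_ids})

# Sorting on (number, icode) and formatting both keeps them distinct
ordered_ids = sorted(residue_ids, key=lambda rid: (rid[1], rid[2]))
labels = [format_residue_label(rid) for rid in ordered_ids]

print(numbers_only)
print(labels)
```

With real search results, the same helper could presumably be applied to res.id for each res in nearby_residues.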

Thanks. It makes more sense now, and it is faster than my previous implementation.