Python: removing heteroatoms from a PDB file

I need to remove the heteroatoms from a PDB file. Here is my code, but it does not work on my test PDB, 1C4R:

for model in structure:
    for chain in model:
        for residue in chain:
            id = residue.id
            if id[0] != ' ':
                chain.detach_child(id)
        if len(chain) == 0:
            model.detach_child(chain.id)

Any suggestions?

Heteroatoms shouldn't be part of the chain. But you can tell whether a residue is a heteroatom:

from Bio.PDB import PDBParser

pdb = PDBParser().get_structure("1C4R", "1C4R.pdb")

for residue in pdb.get_residues():
    tags = residue.get_full_id()

    # tags is a tuple: (Structure ID, Model ID, Chain ID, (Residue ID))
    # Residue ID is itself a tuple: (*Hetero Field*, Residue ID, Insertion Code)

    # The field of interest is the Hetero Field: it is blank (" ") if the
    # residue is not a heteroatom, and carries a flag if it is
    # ("W" for waters, "H_<name>" for other hetero residues).

    if tags[3][0] != " ":
        pass  # The residue is a heteroatom
    else:
        pass  # It is not

You can also get the residue id alone (without the first three fields) via residue.id.

The relevant documentation is biopdb_faq.pdf. The topic is on page 8, "What is a residue id?". To quote:

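This is a bit more complicated, due to the clumsy PDB format. A residue id is a tuple with three elements:

The hetero-flag: this is "H_" plus the name of the hetero-residue (e.g. "H_GLC" in the case of a glucose molecule), or "W" in the case of a water molecule.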
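To make the hetero-flag convention concrete, here is a minimal sketch that classifies residue id tuples by their first element. The function name classify_hetfield and the returned labels are made up for illustration; only the tuple layout and the " ", "W" and "H_" values come from the text above.

```python
def classify_hetfield(residue_id):
    """Classify a residue id tuple by its hetero flag.

    residue_id is assumed to have the layout described above:
    (hetero flag, residue number, insertion code).
    The returned labels are illustrative, not part of Biopython.
    """
    hetfield = residue_id[0]
    if hetfield == " ":
        return "standard"                 # blank flag: not a hetero residue
    if hetfield == "W":
        return "water"                    # water molecule
    if hetfield.startswith("H_"):
        return "hetero:" + hetfield[2:]   # e.g. "H_GLC" -> "hetero:GLC"
    return "unknown"


examples = [(" ", 10, " "), ("W", 301, " "), ("H_GLC", 100, " ")]
labels = [classify_hetfield(rid) for rid in examples]
print(labels)  # ['standard', 'water', 'hetero:GLC']
```

With a parsed structure, the same function could be applied to residue.id for each residue returned by get_residues().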

To fold in the comments and continue, do the following:

from Bio.PDB import PDBParser, PDBIO, Select

class NonHetSelect(Select):
    def accept_residue(self, residue):
        return 1 if residue.id[0] == " " else 0

pdb = PDBParser().get_structure("1C4R", "1C4R.pdb")
io = PDBIO()
io.set_structure(pdb)
io.save("non_het.pdb", NonHetSelect())

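I once used the code below, taken from a post on removing residues. It misses some heteroatoms. I guess this is because the chain changes every time detach_child is called.

for model in structure:
    for chain in model:
        for residue in chain:
            id = residue.id
            if id[0] != ' ':
                chain.detach_child(id)
        if len(chain) == 0:
            model.detach_child(chain.id)

After a small change (just avoiding modification of the iterable while iterating over it), it works fine for me. (I only used structure[0] here.)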

Thanks for your post. That detects the heteroatoms, but I also want to remove them from the chain. Is there a smart way to remove heteroatoms?

That is a new question in itself. Following my answer and the example in biopdb_faq.pdf (pages 5-6, "Can I write PDB files?"), change the selection code from
    if residue.get_name() == "GLY":
to
    if residue.id[0] == " ":

I am testing it. I will let you know once I have finished testing.