Python: How do I save a separate output file for each chain id?

python python-2.7 python-3.x

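I have the code below, which prints a predefined sequence from a PDB file. Now I want to save a separate output file for each chain_id.

Expected output:

If the input file is named 1AHI.PDB and it contains four chain ids (A, B, C, D), I want these output files: 1AHIA.txt, 1AHIB.txt, 1AHIC.txt, 1AHID.txt. The same should apply to every input file. My input directory contains more than 2000 input files.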
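The naming scheme described above can be sketched with `os.path`. This is only an illustration: the helper name `chain_output_path` and the directory names `input` and `output` are assumptions, not part of the original script.

```python
import os


def chain_output_path(pdb_path, chain_id, out_dir):
    """Return e.g. <out_dir>/1AHIA.txt for input 1AHI.PDB and chain 'A'."""
    # Drop the directory and the extension, whatever its letter case.
    stem = os.path.splitext(os.path.basename(pdb_path))[0]
    return os.path.join(out_dir, stem + chain_id + '.txt')


# Hypothetical input path and output directory, for illustration only.
names = [os.path.basename(chain_output_path('input/1AHI.PDB', c, 'output'))
         for c in 'ABCD']
print(names)  # ['1AHIA.txt', '1AHIB.txt', '1AHIC.txt', '1AHID.txt']
```

Using `os.path.splitext` rather than `str.replace` means that `.pdb` and `.PDB` are handled the same way.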

Code:

*Edited after the answer*

Error:

Traceback (most recent call last):
  File "C:/Users/Vishnu/Documents/NAD/NAD/result/test_result_file/Test_10.py", line 31, in test
    suffix), 'w')
OSError: [Errno 22] Invalid argument: 'C:/Users/Vishnu/Documents/NAD/NAD/result/test_result_file/Final_result//C:/Users/Vishnu/Documents/NAD/NAD/result/test_result_file\\1A4ZHETATM15207  C4B NAD A 501      47.266 101.038   7.214  1.00 11.48           C  \n.txt'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Vishnu/Documents/NAD/NAD/result/test_result_file/Test_10.py", line 94, in <module>
    test()
  File "C:/Users/Vishnu/Documents/NAD/NAD/result/test_result_file/Test_10.py", line 40, in test
    out_f.close()
UnboundLocalError: local variable 'out_f' referenced before assignment

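The current code opens one output file per input file. You need one output file per out_chain item, though, and each input file can contain several out_chain items. So the output file has to be opened and closed inside the inner loop that processes the out_chain items. Here is one way to do it: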

import os
from glob import glob

# in_loc, out_loc and filter_file() are defined elsewhere in the script.

def test():
    fnames = glob(in_loc + '*.pdb')

    for each in fnames:
        # Stem of the input file name without directory or extension, e.g. '1AHI'
        ofstem = os.path.splitext(os.path.basename(each))[0]
        suffix = 'txt'

        try:
            # Filter results from the input file
            with open(each, 'r') as in_f:
                out_chain_list = filter_file(in_f)

            chain_id = None
            for each_line in out_chain_list:
                fields = each_line.split()
                if len(fields) > 3:
                    # Assumption: as in filter_file, the chain id is the first
                    # character of the 4th whitespace-separated field
                    chain_id = fields[3][0]
                if chain_id is None:
                    continue

                # Open and append to the output file of the current chain;
                # separator entries such as '\n' go to the same file
                out_name = os.path.join(
                    out_loc, '{}{}.{}'.format(ofstem, chain_id, suffix))
                with open(out_name, 'a') as out_f:
                    out_f.write(each_line)

        except Exception as e:
            # The with blocks have already closed any file that was opened
            print('Exception for file: ', each, '\n', e)
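The UnboundLocalError in the second traceback is what happens when `open()` fails before `out_f` has been assigned and the error handler then calls `out_f.close()`. A minimal sketch of a `with`-based pattern that sidesteps this is shown here; the helper name `write_lines` and the temporary directory are assumptions made for the example.

```python
import os
import tempfile


def write_lines(path, lines):
    """Write lines to path; return True on success, False if writing fails."""
    try:
        # The with block closes the file itself, so the error handler
        # never has to touch a file object that may not exist yet.
        with open(path, 'w') as out_f:
            out_f.writelines(lines)
    except (IOError, OSError) as e:
        print('Could not write', path, '\n', e)
        return False
    return True


tmp_dir = tempfile.mkdtemp()
ok = write_lines(os.path.join(tmp_dir, '1AHIA.txt'), ['line 1\n', 'line 2\n'])
# A path inside a directory that does not exist fails cleanly instead of
# raising UnboundLocalError from the handler.
failed = write_lines(os.path.join(tmp_dir, 'missing_dir', 'x.txt'), ['line\n'])
```

The first traceback has a separate cause: the file name was built from a whole HETATM line, including its trailing newline, which is not a valid file name on Windows.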

You can modify filter_file so that it returns a dictionary with chain_id as the key. Once you have an out_chain_dict of the form {'chain_id': out_chain_list}, you can easily create a separate file for each chain_id:

import os
from glob import glob

def test():
    fnames = glob(in_loc+'*.pdb')

    for each in fnames:
        # This is the path for the newly generated file.
        path_file = each.replace(in_loc, out_loc)

        # This is the input file; filter the results from it.
        with open(each, 'r') as in_f:
            try:
                out_chain_dict = filter_file(in_f)

            except Exception as e:
                print('Exception for file: ', each, '\n', e)
                continue

            for (chain_id, out_chain_list) in out_chain_dict.items():
                # Name of the new file generated from the input file (.txt).
                # splitext handles both '.pdb' and '.PDB'.
                formatted_file = os.path.splitext(path_file)[0] + chain_id + '.txt'

                # A new file to be opened.
                with open(formatted_file, "w") as out_f:
                    for each_line in out_chain_list:
                        out_f.write(each_line)
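The dictionary shape used above, with one list of lines per chain_id, can also be built with `collections.defaultdict`. The following is a rough sketch, not the original filter_file: the name `group_by_chain` is hypothetical, it assumes the serial number is fused to `HETATM` as in the traceback (so the chain id is the first character of the 4th whitespace-separated field), and only the first sample line comes from the question.

```python
from collections import defaultdict


def group_by_chain(lines):
    """Group HETATM lines by chain id (first character of the 4th field)."""
    by_chain = defaultdict(list)  # a missing key starts out as an empty list
    for line in lines:
        fields = line.split()
        if line.startswith('HETATM') and len(fields) > 3:
            by_chain[fields[3][0]].append(line)
    return dict(by_chain)


sample = [
    'HETATM15207  C4B NAD A 501      47.266 101.038   7.214  1.00 11.48           C  \n',
    # The two lines below are invented for illustration.
    'HETATM15208  O4B NAD A 501      46.000 100.000   8.000  1.00 11.00           O  \n',
    'HETATM15300  C4B NAD B 501      10.000  20.000  30.000  1.00 12.00           C  \n',
]
groups = group_by_chain(sample)
for chain_id, chain_lines in sorted(groups.items()):
    print(chain_id, len(chain_lines))
```

With `defaultdict(list)` the try/except KeyError step is no longer needed, because a new chain id gets an empty list automatically.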

Edit

filter_file:

def filter_file(in_f):
    atom_ids = ['C4B', 'O4B', 'C1B', 'C2B', 'C3B']
    chain_ids = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
    order = [0, 1, 4, 3, 2]
    previous_chain_id = None
    chain_list = []
    out_chain_dict = {}  # Change out_chain_list to dict
    for line in in_f:
        if line.startswith('HETATM '):
            line = line.replace('HETATM ', 'HETATM')
        if line.startswith('HETATM'):
            line_list = line.split()
            chain_id = line_list[3][0]
            atom_id = line_list[1]
            if atom_id in atom_ids and chain_id in chain_ids:
                if chain_id != previous_chain_id:
                    c_ls = []
                    if chain_list:
                        c_l = chain_list[-5:]
                        c_l = [c_l[i] for i in order]
                        for i in range(5):
                            c_ls += c_l[:4]
                            c_ls.append('\n')
                            c_l = c_l[-4:] + c_l[:1]
                        # c_ls was built from the lines of the previous chain,
                        # so it is stored under previous_chain_id, not chain_id
                        try:
                            # Add c_ls to an existing key
                            out_chain_dict[previous_chain_id] += c_ls
                        except KeyError:
                            # or create the key if it appears for the first time
                            out_chain_dict[previous_chain_id] = c_ls
                    chain_list.append('\n')
                chain_list.append(line)
                previous_chain_id = chain_id
    c_ls = []
    if chain_list:
        c_l = chain_list[-5:]
        c_l = [c_l[i] for i in order]
        for i in range(5):
            c_ls += c_l[:4]
            c_ls.append('\n')
            c_l = c_l[-4:] + c_l[:1]
        # Here we add the last chain, which corresponds to previous_chain_id
        try:
            out_chain_dict[previous_chain_id] += c_ls
        except KeyError:
            out_chain_dict[previous_chain_id] = c_ls
    return out_chain_dict

In the list of output files, do you mean "1AHIA.txt, 1AHIB.txt, 1AHIC.txt, 1AHID.txt"? You have 1AHIB.txt twice and 1AHID.txt does not appear.

What is wrong with your code?

@Shasha99, there is nothing wrong with the program, but I want to modify it. I have tried, but I don't know how to get the output for each chain id.

Error:

Traceback (most recent call last):
  File "C:/Users/Vishnu/Documents/NAD/NAD/NAD/result/test_result_file/Test_10.py", line 82, in <module>
    test()
  File "C:/Users/Vishnu/Documents/NAD/NAD/result/test_result_file/Test_10.py", line 22, in test
    for (chain_id, out_chain_list) in out_chain_dict.items():
AttributeError: 'list' object has no attribute 'items'

You should modify filter_file so that it returns a dictionary.

I don't understand what should be modified?

In filter_file, instead of out_chain_list += c_ls you should do something like this: try: out_chain_dict[chain_id] += c_ls; except KeyError: out_chain_dict[chain_id] = c_ls. Then each chain_id has its own out_chain_list.

I did that, but I get no output. Can you edit my code? I am new to Python, which is why I can't understand or solve this.
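The loop in filter_file that builds c_ls first reorders five lines and then emits five windows of four, rotating the list left by one each time. A small sketch of that step follows, with atom names standing in for the full HETATM lines; the function name `rotated_windows` is an assumption made for this illustration.

```python
def rotated_windows(items, order, width=4):
    """Reorder items, then emit one `width`-long window per left rotation."""
    current = [items[i] for i in order]
    windows = []
    for _ in range(len(current)):
        windows.append(current[:width])
        # Left-rotate by one: same effect as c_l[-4:] + c_l[:1] for five items.
        current = current[1:] + current[:1]
    return windows


# Atom names stand in for the five HETATM lines of one chain.
atoms = ['C4B', 'O4B', 'C1B', 'C2B', 'C3B']
for window in rotated_windows(atoms, order=[0, 1, 4, 3, 2]):
    print(' '.join(window))
# C4B O4B C3B C2B
# O4B C3B C2B C1B
# C3B C2B C1B C4B
# C2B C1B C4B O4B
# C1B C4B O4B C3B
```

In filter_file each window is followed by a '\n' entry, which is why the per-chain lists contain blank separator lines.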