How do I write the output of a for loop in Python to a CSV file?


python file loops csv


Below is a Python script that checks whether certain words are found in each of a list of files:

experiment=open('potentiation.txt')
lines=experiment.read().splitlines()
receptors=['crystal_1.txt', 'modeller_1.txt', 'moe_1.txt',
           'nci5_modeller0000_1.txt', 'nci5_modeller0001_1.txt',
           'nci5_modeller0002_1.txt', 'nci5_modeller0003_1.txt',
           'nci5_modeller0004_1.txt', 'nci5_modeller0005_1.txt',
           'nci5_modeller0006_1.txt', 'nci5_modeller0007_1.txt',
           'nci5_modeller0008_1.txt', 'nci5_modeller0009_1.txt',
           'nci5_modeller0010_1.txt', 'nci5_modeller0011_1.txt',
           'nci5_moe0000_1.txt', 'nci5_moe0001_1.txt', 'nci5_moe0002_1.txt',
           'nci5_moe0003_1.txt', 'nci5_moe0004_1.txt', 'nci5_moe0005_1.txt',
           'nci5_moe0006_1.txt', 'nci5_moe0007_1.txt', 'nci5_moe0008_1.txt',
           'nci5_moe0009_1.txt', 'nci5_moe0010_1.txt', 'nci5_moe0011_1.txt',
           'nci5_moe0012_1.txt', 'nci5_moe0013_1.txt', 'nci5_moe0014_1.txt']

for ligand in lines:
    for protein in receptors:
        file1=open(protein,"r")
        read1=file1.read()
        find_hit=read1.find(ligand)
        if find_hit == -1:
            print ligand,protein,"Not Found"
        else:
            print ligand,protein, "Found"

Sample output from this code looks like this:

345647 nci5_moe0012_1.txt Not Found
345647 nci5_moe0013_1.txt Not Found
345647 nci5_moe0014_1.txt Found

My question is how to format the output as a CSV file, as in the example below:

Ligand    nci5_moe0012_1.txt    nci5_moe0013_1.txt    nci5_moe0014_1.txt
345647    Not Found             Not Found             Found

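You can keep the results in lists (one list for the ligands, one for the proteins), appending each 'protein' and 'ligand' value to the appropriate list (starting at index 0). After that it is easy to save them to a text file.
To save, open a file for writing and convert each list to a string: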

my_string = " ".join(map(str, lst))

Then write my_string to the file (and do this for each list).
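The list-and-join idea above can be sketched roughly as follows. This is only a sketch written for Python 3: the third list `results`, the helper names `record` and `lists_to_lines`, the sample values, and the file name `results.txt` are all made up for illustration. It joins with "," rather than " " so that each line reads as a CSV row.

```python
# Three parallel lists: item i of each list belongs to the same check.
ligands = []
proteins = []
results = []

def record(ligand, protein, found):
    """Append one search result to the three parallel lists."""
    ligands.append(ligand)
    proteins.append(protein)
    results.append("Found" if found else "Not Found")

# Made-up sample values; in the real script these come from the nested loops.
record("345647", "nci5_moe0013_1.txt", False)
record("345647", "nci5_moe0014_1.txt", True)

def lists_to_lines(*columns):
    """Join the i-th item of every list into one comma-separated line."""
    return [",".join(map(str, row)) for row in zip(*columns)]

lines = lists_to_lines(ligands, proteins, results)

# Hypothetical output file name.
with open("results.txt", "w") as out:
    out.write("\n".join(lines) + "\n")
```

A plain join does no quoting, so it is only safe when no value contains a comma; the csv module handles that case.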

I think this would do it (assuming a tab-delimited output file):

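Update

As I said in a comment, this can be done faster by reading each protein file only once. To do that and still format the output the way you want, the result of checking each ligand against each file has to be stored in a data structure that grows as each file is read and checked against every ligand, and is then written out all at once after everything is done. A simple list of lists is sufficient for this purpose and is what the implementation below uses.

The trade-off is using more memory instead of reading and rereading the protein files over and over. Since disk I/O is usually one of the slowest things on a computer, a slight increase in code complexity will likely yield a large performance gain.

Here is the code showing this alternative version: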

import csv

receptors = ['crystal_1', 'modeller_1', 'moe_1',
             'nci5_modeller0000_1', 'nci5_modeller0001_1',
             'nci5_modeller0002_1', 'nci5_modeller0003_1',
             'nci5_modeller0004_1', 'nci5_modeller0005_1',
             'nci5_modeller0006_1', 'nci5_modeller0007_1',
             'nci5_modeller0008_1', 'nci5_modeller0009_1',
             'nci5_modeller0010_1', 'nci5_modeller0011_1',
             'nci5_moe0000_1', 'nci5_moe0001_1', 'nci5_moe0002_1',
             'nci5_moe0003_1', 'nci5_moe0004_1', 'nci5_moe0005_1',
             'nci5_moe0006_1', 'nci5_moe0007_1', 'nci5_moe0008_1',
             'nci5_moe0009_1', 'nci5_moe0010_1', 'nci5_moe0011_1',
             'nci5_moe0012_1', 'nci5_moe0013_1', 'nci5_moe0014_1']

# initialize list of lists holding each ligand and its presence in each receptor
with open('potentiation.txt') as experiment:
    ligands = [[ligand] for ligand in (line.rstrip() for line in experiment)]

for protein in receptors:
    with open(protein + '.txt') as protein_file:
        protein_file_data = protein_file.read()
        for row in ligands:
            # determine if this ligand (row[0]) appears in protein data
            row.append('Found' if row[0] in protein_file_data else 'Not Found')

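# Python 2: open the file in binary mode for the csv module.
# On Python 3, use open('output.csv', 'w', newline='') instead.
with open('output.csv', 'wb') as outfile: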
    csv_writer = csv.writer(outfile, delimiter='\t')
    csv_writer.writerow(['Ligand'] + receptors)  # header row
    csv_writer.writerows(ligands)

print('output.csv file written')
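For readers on Python 3, the same read-each-file-once idea might look like the sketch below. It is not the answer's code: the helpers `build_rows` and `write_table` and the in-memory `protein_texts` dictionary (standing in for the contents of the real receptor files) are assumptions made so the example runs on its own. It also passes `lineterminator='\n'`, because `csv.writer` ends rows with `'\r\n'` by default, and that carriage return is what some editors display as ^M.

```python
import csv
import io

def build_rows(ligands, protein_texts):
    """Return one row per ligand: [ligand, status for each protein, ...].

    protein_texts maps a receptor name to the full text of its file.
    """
    rows = [[ligand] for ligand in ligands]
    for text in protein_texts.values():
        for row in rows:
            # row[0] is the ligand; substring test against the file text
            row.append('Found' if row[0] in text else 'Not Found')
    return rows

def write_table(outfile, receptor_names, rows):
    """Write a header row and the result rows as tab-delimited text."""
    writer = csv.writer(outfile, delimiter='\t', lineterminator='\n')
    writer.writerow(['Ligand'] + list(receptor_names))
    writer.writerows(rows)

# Made-up stand-ins for the contents of two receptor files.
protein_texts = {
    'nci5_moe0013_1': 'id 111111\nid 222222\n',
    'nci5_moe0014_1': 'id 345647\nid 222222\n',
}
rows = build_rows(['345647', '222222'], protein_texts)

buffer = io.StringIO()
write_table(buffer, protein_texts.keys(), rows)
print(buffer.getvalue())

# Writing to a real file on Python 3 uses text mode with newline='':
# with open('output.csv', 'w', newline='') as outfile:
#     write_table(outfile, protein_texts.keys(), rows)
```

Separating the row building from the writing keeps the search logic testable without touching the disk.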

Or you could use a dictionary (the key is the ligand, the value is a tuple (file, Found/Not Found)).
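A rough sketch of that dictionary layout, assuming Python 3 and that each ligand maps to a list of (file, status) tuples, one per file. The function name `scan` and the sample texts are hypothetical.

```python
from collections import defaultdict

def scan(ligands, protein_texts):
    """Map each ligand to a list of (file name, status) tuples."""
    results = defaultdict(list)
    for file_name, text in protein_texts.items():
        for ligand in ligands:
            status = 'Found' if ligand in text else 'Not Found'
            results[ligand].append((file_name, status))
    return dict(results)

# Made-up stand-ins for the contents of two receptor files.
protein_texts = {
    'nci5_moe0013_1.txt': 'id 111111\n',
    'nci5_moe0014_1.txt': 'id 345647\n',
}
hits = scan(['345647'], protein_texts)

# Flatten one dictionary entry into a CSV-style row: ligand, then statuses.
row = ['345647'] + [status for _, status in hits['345647']]
```

Keeping the file name next to each status makes the result self-describing, at the cost of repeating the names for every ligand.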

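Thanks for your answer. I'm very new to Python. Could you explain how I would write the two different lists to a text file and include the output data (Found or Not Found)?

Is that easier to understand? You can use "," in the join method (more CSV-like).

OK, one more question: how do I save the two lists as one text file?

Here, it is not a list but a string!

Thanks! When I use this code I get the following error message: csv_writer([ligand, protein, "Found" if found else "Not Found"]) TypeError: '_csv.writer' object is not callable. Any suggestions?

Thanks! One more question. What does ^M mean? It appears in the output CSV after every protein file. Is there a way to get rid of it?

It's a carriage return. My latest update might eliminate it. If not, it may be because you're using Python 3 but didn't say so in your question (and let me know).

Adam: After rereading your question, I realized my answer only converted the loop output to CSV format but didn't arrange it the way you wanted. My latest update should correct that. Thanks for catching it.

Actually, there is one more problem with the script. It is meant to find whether a given ligand is found in the various protein files. However, the script's output currently shows every ligand as "Not Found" for every protein file. That is incorrect, because there should be some "Found" and some "Not Found". I think a simple conditional expression should work. What is the best way to introduce it into the script?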
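The "object is not callable" error quoted in the comments is what Python raises when a csv writer object is called like a function. The sketch below reproduces that on Python 3 with made-up values and shows the `writerow` method call that works; whether this matches the earlier version of the answer exactly is an assumption.

```python
import csv
import io

buffer = io.StringIO()
csv_writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')

message = None
try:
    # Wrong: the writer object itself cannot be called.
    csv_writer(['345647', 'nci5_moe0014_1.txt', 'Found'])
except TypeError as error:
    message = str(error)

# Right: rows are written through the writerow() method.
csv_writer.writerow(['345647', 'nci5_moe0014_1.txt', 'Found'])
```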
