Python: looping a task through all input files
I am trying to count all the As, Bs and Cs in all the .txt files I provide, and create a .csv file that lists the counts of those letters file by file. The code here does everything I want, but it only uses the last file I provide instead of all of them. What am I doing wrong?
import glob
import csv
#This will print out all files loaded in the same directory and print them out
for filename in glob.glob('*.txt*'):
    print(filename)
#A B and C
substringA = "A"
Head1 = (open(filename, 'r').read().count(substringA))
substringB = "B"
Head2 = (open(filename, 'r').read().count(substringB))
substringC = "C"
Head3 = (open(filename, 'r').read().count(substringC))
header = ("File", "A Counts" ,"B Counts" ,"C Counts")
analyzed = (filename, Head1, Head2, Head3)
#This will write a file named Analyzed.csv
with open('Analyzed.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(header)
    writer.writerow(analyzed)
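The indentation is missing. Indent the counting code so it runs inside the for loop, and open Analyzed.csv in append mode ('a'):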
import glob
import csv
#This will print out all files loaded in the same directory and print them out
for filename in glob.glob('*.txt*'):
    print(filename)
    #A B and C
    substringA = "A"
    Head1 = (open(filename, 'r').read().count(substringA))
    substringB = "B"
    Head2 = (open(filename, 'r').read().count(substringB))
    substringC = "C"
    Head3 = (open(filename, 'r').read().count(substringC))
    header = ("File", "A Counts" ,"B Counts" ,"C Counts")
    analyzed = (filename, Head1, Head2, Head3)
    #This will append to a file named Analyzed.csv
    with open('Analyzed.csv', 'a') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(header)
        writer.writerow(analyzed)
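To see why write mode inside a loop keeps only the last file, here is a small sketch that contrasts 'w' and 'a'. It assumes Python 3, and the helper names write_rows and read_rows, the demo file names, and the sample rows are made up for illustration; they are not part of the original code.

```python
import csv
import os
import tempfile

def write_rows(path, rows, mode):
    """Reopen path once per row with the given mode (hypothetical helper)."""
    for row in rows:
        with open(path, mode, newline='') as csvfile:
            csv.writer(csvfile).writerow(row)

def read_rows(path):
    """Read the CSV back as a list of rows of strings."""
    with open(path, newline='') as csvfile:
        return list(csv.reader(csvfile))

# Illustrative data, written to a throwaway temporary directory
rows = [("a.txt", 1), ("b.txt", 2), ("c.txt", 3)]
tmpdir = tempfile.mkdtemp()
w_path = os.path.join(tmpdir, "demo_w.csv")
a_path = os.path.join(tmpdir, "demo_a.csv")

write_rows(w_path, rows, 'w')  # each open truncates the file first
write_rows(a_path, rows, 'a')  # each open adds to the end

w_result = read_rows(w_path)
a_result = read_rows(a_path)
print(w_result)  # only the last row survives
print(a_result)  # all three rows are present
```

Append mode also keeps whatever earlier runs left in the file, which is why it needs some care with the header row.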
Edit: removed the unsupported newline='' argument.

You can try the following:
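import csv
import glob
from itertools import chain
from collections import Counter

# Open the output once, before the loop, so earlier files are not overwritten
with open('Analyzed.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for filename in glob.glob('*.txt*'):
        # Flatten the file into a stream of characters and tally them
        with open(filename) as f:
            data = chain.from_iterable(line.strip("\n") for line in f)
            the_count = Counter(data)
        # writerow expects a sequence of fields; a bare string would be split into characters
        writer.writerow([filename])
        writer.writerow(["A count: {}".format(the_count["A"])])
        writer.writerow(["B count: {}".format(the_count["B"])])
        writer.writerow(["C count: {}".format(the_count["C"])])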
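You also need to make one other small change besides the indentation: open the file in append mode rather than write mode. Note that append mode does not overwrite anything that already exists, so I added the part at the top to clear whatever is already in the CSV.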
import glob
import csv
#This will delete anything in Analyzed.csv if it exists and replace it with the header
with open('Analyzed.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    header = ("File", "A Counts" ,"B Counts" ,"C Counts")
    writer.writerow(header)
for filename in glob.glob('*.txt*'):
    print(filename)
    #A B and C
    substringA = "A"
    Head1 = (open(filename, 'r').read().count(substringA))
    substringB = "B"
    Head2 = (open(filename, 'r').read().count(substringB))
    substringC = "C"
    Head3 = (open(filename, 'r').read().count(substringC))
    header = ("File", "A Counts" ,"B Counts" ,"C Counts")
    analyzed = (filename, Head1, Head2, Head3)
    #This will append one row per file to Analyzed.csv
    with open('Analyzed.csv', 'a', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(analyzed)
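The above is my solution that keeps as much of your code unchanged as possible. Ideally, though, you would open the output file only once, at the start. Here is how you would do that: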
import glob
import csv
with open('Analyzed.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    header = ("File", "A Counts" ,"B Counts" ,"C Counts")
    writer.writerow(header)
    # The loop stays inside the with block so the writer is still open
    for filename in glob.glob('*.txt*'):
        print(filename)
        #A B and C
        substringA = "A"
        Head1 = (open(filename, 'r').read().count(substringA))
        substringB = "B"
        Head2 = (open(filename, 'r').read().count(substringB))
        substringC = "C"
        Head3 = (open(filename, 'r').read().count(substringC))
        analyzed = (filename, Head1, Head2, Head3)
        writer.writerow(analyzed)
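The version above still opens and reads every input file three times and never closes those handles explicitly. A minimal sketch that reads each file once might look like the following. It assumes Python 3; count_letters, analyze, and the demo files are hypothetical names and data introduced only for this example.

```python
import csv
import glob
import os
import tempfile

def count_letters(text, letters="ABC"):
    """Return how often each letter occurs in text, in the order given."""
    return [text.count(letter) for letter in letters]

def analyze(pattern, out_path):
    """Write a header plus one row of A/B/C counts per file matching pattern."""
    with open(out_path, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(("File", "A Counts", "B Counts", "C Counts"))
        for filename in sorted(glob.glob(pattern)):
            # Read the file once and close it before counting
            with open(filename, 'r') as f:
                text = f.read()
            writer.writerow([filename] + count_letters(text))

# Small demo in a temporary directory (illustrative data)
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "one.txt"), "w") as f:
    f.write("AAB\nC")
with open(os.path.join(demo_dir, "two.txt"), "w") as f:
    f.write("BCC")

out_path = os.path.join(demo_dir, "Analyzed.csv")
analyze(os.path.join(demo_dir, "*.txt*"), out_path)

with open(out_path, newline='') as csvfile:
    result_rows = list(csv.reader(csvfile))
print(result_rows)
```

Sorting the glob results is a design choice made here so the row order is predictable; glob itself does not promise any particular order.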
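Is the code that counts A, B and C inside the for loop or outside it? Just move the counting code 4 spaces to the right and it will be inside the for loop :)

I think that is exactly my problem. I don't know how to make the counting code loop over all the files.

The Analyzed output file gets overwritten. You have to open the output file before the for loop, or open it in append mode (but that will not delete data from previous runs).

I changed it to open in append mode 'a', thank you very much. That works. How do I avoid it writing the header multiple times? writer.writerow(header) should only happen the first time.

Oops, I didn't notice that part. I'll add a fix.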
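If you would rather keep append mode, one possible way to write the header only once is to check whether the output file already has any content before appending. This is a sketch under the assumption of Python 3; append_row, HEADER, and the demo rows are hypothetical and not taken from the answers above.

```python
import csv
import os
import tempfile

HEADER = ("File", "A Counts", "B Counts", "C Counts")

def append_row(path, row, header=HEADER):
    """Append row to the CSV at path, writing header first only if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, 'a', newline='') as csvfile:
        writer = csv.writer(csvfile)
        if needs_header:
            writer.writerow(header)
        writer.writerow(row)

# Demo with made-up counts, written to a temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), "Analyzed.csv")
append_row(demo_path, ("a.txt", 3, 1, 0))
append_row(demo_path, ("b.txt", 0, 2, 5))

with open(demo_path, newline='') as csvfile:
    demo_rows = list(csv.reader(csvfile))
print(demo_rows)
```

Unlike opening the file once in write mode, this keeps rows from earlier runs, so it only fits if accumulating results across runs is what you want.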