Python: how to read two files line by line and edit each line after processing
I have the code below. If a line read from file 1 meets my condition, I want to edit/update file 1. Otherwise, I want to edit the line in file 2 that meets another condition:
with gzip.open('/my/file1.txt.gz', 'r') as f:
    for line in csv.reader(f, delimiter="\t"):
        if (str(line[3]) == "C"):
            # edit/update the line from file 1
        else:
            with gzip.open('/my/file2.txt.gz', 'r') as f2:
                for line2 in csv.reader(f2, delimiter="\t"):
                    if line2[0] == line[0]:
                        # edit/update the line2 from file 2
Is there a way to do this? Thanks in advance.
Collect the keys from file 1 first, then go through file 2 only once:
keys = set()
with gzip.open('/my/file1.txt.gz', 'r') as src_file:
    for line in csv.reader(src_file, delimiter="\t"):
        if (str(line[3]) == "C"):
            # edit/update the line from file 1
        else:
            # remember the key so file 2 only has to be scanned once
            keys.add(line[0])
with gzip.open('/my/file2.txt.gz', 'r') as target_file:
    for line in csv.reader(target_file, delimiter="\t"):
        if line[0] in keys:
            # edit/update the line from file 2
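A minimal sketch of how the two passes above could actually write their changes, assuming the files are tab-separated and that each edit can be expressed as a function that takes a row and returns a row. The names `rewrite_gzip_tsv`, `update_both`, `edit1` and `edit2` are hypothetical and not part of the original code. Since a gzip file cannot be patched in place, each pass streams into a temporary file and swaps it in with `os.replace`:

```python
import csv
import gzip
import os
import tempfile


def rewrite_gzip_tsv(path, transform):
    """Stream every row of a gzipped TSV through transform(row), then replace the file."""
    dir_name = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".gz", dir=dir_name)
    os.close(fd)
    try:
        with gzip.open(path, "rt", newline="") as src, \
                gzip.open(tmp_path, "wt", newline="") as dst:
            # Assumption: the files use "\n" line endings.
            writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
            for row in csv.reader(src, delimiter="\t"):
                writer.writerow(transform(row))
        # Swap in the new file only after it has been written completely.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise


def update_both(path1, path2, edit1, edit2):
    """Edit 'C' rows of file 1; edit rows of file 2 whose key matches a non-'C' row."""
    keys = set()

    def first_pass(row):
        if row[3] == "C":
            return edit1(row)
        keys.add(row[0])  # remember the key for the pass over file 2
        return row

    rewrite_gzip_tsv(path1, first_pass)

    def second_pass(row):
        return edit2(row) if row[0] in keys else row

    rewrite_gzip_tsv(path2, second_pass)
```

Each file is read and written exactly once, and only the set of keys is held in memory, which may matter if the files are large.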
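Some problems:
You are trying to update a file that was opened read-only.
You seem to want to update the file while you are reading it. For a plain text file this is possible as long as the replacement string has exactly the same length as the string it replaces. But this is a compressed file, and I think you would corrupt it.
You re-read the second file every time you want to modify it.
If possible, I would suggest reading everything into memory, making the changes, and then writing it all back out:
import csv, gzip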
from collections import defaultdict

def read_gzip(fname):
    # text mode with newline='' is what the csv module expects on Python 3
    with gzip.open(fname, 'rt', newline='') as inf:
        incsv = csv.reader(inf, delimiter='\t')
        return list(incsv)

def write_gzip(fname, data):
    with gzip.open(fname, 'wt', newline='') as outf:
        outcsv = csv.writer(outf, delimiter='\t')
        outcsv.writerows(data)

def main():
    f1, f1_dirty = read_gzip('/my/file1.txt.gz'), False
    f2, f2_dirty = read_gzip('/my/file2.txt.gz'), False

    # make row index on f2 (the row lists are shared with f2, so edits show up there)
    idx = defaultdict(list)
    for row in f2:
        idx[row[0]].append(row)

    # process files
    for row in f1:
        if row[3] == 'C':
            # modify row data
            f1_dirty = True
        else:
            for other in idx[row[0]]:
                # modify other
                f2_dirty = True

    # write back only the files that were changed
    if f1_dirty: write_gzip('/my/file1.txt.gz', f1)
    if f2_dirty: write_gzip('/my/file2.txt.gz', f2)

if __name__=="__main__":
    main()
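For comparison, the same-length replacement mentioned above for plain text files might look like the sketch below. `patch_in_place` is a hypothetical helper written for illustration, and it reads the whole file to find the match, so it is only meant for small files. It works on uncompressed files only: in a gzip file the stored bytes do not correspond position by position to the text, so overwriting them would corrupt the stream.

```python
def patch_in_place(path, old, new):
    """Overwrite the first occurrence of old with new (bytes of equal length)."""
    if len(old) != len(new):
        raise ValueError("replacement must have the same byte length")
    with open(path, "r+b") as f:  # read/write without truncating the file
        data = f.read()
        pos = data.find(old)
        if pos == -1:
            return False  # nothing to replace
        f.seek(pos)
        f.write(new)  # same length, so the following bytes do not move
    return True
```

If the replacement were longer or shorter, everything after it would have to be shifted, which is why rewriting the whole file is usually the simpler choice.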