Python: concatenate all files whose names map to values of the same dictionary key
I have a dictionary of clusters of motifs:
dico_cluster={'cluster_1': ['CUX2', 'CUX1'], 'cluster_2': ['RFX3', 'RFX2'],'cluster_3': ['REST']}
Then I have these files in my folder:
"/path/to/test/files/CUX1.txt"
"/path/to/test/files/CUX2.txt"
"/path/to/test/files/RFX3.txt"
"/path/to/test/files/RFX2.txt"
"/path/to/test/files/REST.txt"
"/path/to/test/files/ZEB.txt"
"/path/to/test/files/TEST.txt"
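I'm trying to concatenate the files that belong to the same cluster. The output file name should be the motif names of the cluster joined with an underscore "_".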
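To make the naming rule concrete, here is a small sketch of the output names it would produce for the dictionary above. `output_name` is a hypothetical helper used only for illustration.

```python
dico_cluster = {
    'cluster_1': ['CUX2', 'CUX1'],
    'cluster_2': ['RFX3', 'RFX2'],
    'cluster_3': ['REST'],
}


def output_name(motifs):
    """Join the motif names of one cluster with "_" and add the extension."""
    return "_".join(motifs) + ".txt"


# One expected output file name per cluster.
expected_names = {cluster: output_name(motifs)
                  for cluster, motifs in dico_cluster.items()}
print(expected_names)
# {'cluster_1': 'CUX2_CUX1.txt', 'cluster_2': 'RFX3_RFX2.txt', 'cluster_3': 'REST.txt'}
```

Files such as ZEB.txt and TEST.txt are in no cluster, so they should not end up in any output.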
I tried this:
filenames = glob.glob('/path/to/test/files/*.txt')
for clee in dico_cluster.keys():
    fname='_'.join(dico_cluster[clee])
    outfilename ='/path/to/test/outfiles/'+ fname + ".txt"
    for file in filenames:
        tf_file=file.split('/')[-1].split('.')[0]
        if tf_file in dico_cluster[clee]:
            with open(outfilename, 'wb') as outfile:
                for filename in filenames:
                    if filename == outfilename:
                        # don't want to copy the output into the output
                        continue
                    with open(filename, 'rb') as readfile:
                        shutil.copyfileobj(readfile, outfile)
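But it doesn't work: I end up concatenating all of the files into every output.
I want to concatenate only the files that belong to the same cluster.

I'd suggest using the os package, which is easier to use. If I understand your question correctly, I would load the full contents of the matching files before writing: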
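import os

# Reuses dico_cluster and filenames (the glob.glob list) from the question.
for clee in dico_cluster:
    # Drop duplicate names while keeping their order, so the output name is stable.
    my_clusters = list(dict.fromkeys(dico_cluster[clee]))
    fname = "_".join(my_clusters)
    data = []
    outfilename = os.path.join("/path/to/test/outfiles", fname + ".txt")
    for file in filenames:
        # "/path/to/test/files/CUX1.txt" -> "CUX1"
        tf_file = os.path.basename(file).split(".")[0]
        if tf_file in my_clusters:
            # Load the whole file before writing anything.
            with open(file, "rb") as f1:
                data.extend(f1.readlines())
    # Write the cluster's output once, after all of its files have been read.
    with open(outfilename, "wb") as _output_file:
        for elm in data:
            _output_file.write(elm)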
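
As a variation on the same idea, the sketch below builds each input path directly from the names in the dictionary and streams the files with shutil.copyfileobj instead of holding their contents in memory. `concat_clusters` and its parameters are hypothetical names, and the sketch assumes every input file is called `<name>.txt` inside a single input folder. Names without a matching file are skipped.

```python
import os
import shutil


def concat_clusters(clusters, in_dir, out_dir):
    """Write one output file per cluster and return {cluster: output path}."""
    os.makedirs(out_dir, exist_ok=True)
    written = {}
    for cluster, names in clusters.items():
        # Candidate input paths, in the order the names appear in the cluster.
        paths = [os.path.join(in_dir, name + ".txt") for name in names]
        # Keep only the files that actually exist on disk.
        paths = [p for p in paths if os.path.isfile(p)]
        if not paths:
            continue  # nothing to concatenate for this cluster
        out_path = os.path.join(out_dir, "_".join(names) + ".txt")
        with open(out_path, "wb") as outfile:
            for path in paths:
                with open(path, "rb") as readfile:
                    # Copy in chunks rather than reading the whole file.
                    shutil.copyfileobj(readfile, outfile)
        written[cluster] = out_path
    return written
```

With the paths from the question, a call would look like concat_clusters(dico_cluster, "/path/to/test/files", "/path/to/test/outfiles"). Because the input paths come from the dictionary, files such as ZEB.txt that belong to no cluster are never opened, and the contents appear in the same order as the names in the output file name.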