What is the simplest way to merge two txt files in Python?
I want to merge two .txt files into a single file. Each .txt file is a list of words. Example .txt files:
File 1:
A
AND
APRIL
AUGUST
File 2:
A
AND
APOSTROPHE
AREA
I want to merge these files into one file that contains only a single entry for each word that occurs.
The final file should look like this:
A
AND
APOSTROPHE
APRIL
AREA
AUGUST
I ran into this problem when I tried to combine the files by appending them like this:
filenames = ['data/train/words.txt', 'data/test/words.txt']
with open('data/local/words.txt', 'w') as outfile:
    for fname in filenames:
        with open(fname) as infile:
            outfile.write(infile.read())
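How can this be done easily?
I would use sets, since they do not allow duplicates.
The | operator is the set union operator; it combines two sets. Sets are unordered, so at the end you have to convert the result back to a list and sort it.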
# Read both word lists; "with" closes the files automatically
with open("file1.txt") as file1, open("file2.txt") as file2:
    words = set(file1.read().splitlines())  # Create a set
    words = words | set(file2.read().splitlines())  # Combine with other word list
# splitlines() avoids the empty entry that split("\n") yields after a trailing newline
with open("fileOUT.txt", "w") as out:
    out.write("\n".join(sorted(words)))  # sorted() already returns a list
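To see the union step in isolation, here is a minimal in-memory sketch that uses the word lists from the question instead of real files. The variable names are illustrative only and do not come from the answer above.

```python
# Word lists copied from the question's two example files.
file1_words = ["A", "AND", "APRIL", "AUGUST"]
file2_words = ["A", "AND", "APOSTROPHE", "AREA"]

# | builds the union of both sets, so each word is kept exactly once.
merged = set(file1_words) | set(file2_words)

# Sets have no defined order, so sort before printing or writing.
result = sorted(merged)
print(result)
```

The sorted list matches the desired output file from the question, one word per entry.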
Read both files into sets and write back the union of the two:
def read_file(fname):
    with open(fname) as fobj:
        return set(entry.strip() for entry in fobj)

data1 = read_file('myfile1.txt')
data2 = read_file('myfile2.txt')
merged = data1.union(data2)
with open('merged.txt', 'w') as fout:
    for word in sorted(merged):
        fout.write('{}\n'.format(word))
Contents of merged.txt:
A
AND
APOSTROPHE
APRIL
AREA
AUGUST
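Read all the words into a set (which removes duplicates automatically), then write that set to the output file. Since sets are unordered, we need to sort the set ourselves before writing its contents to the file.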
# Add all words from the files
filenames = ['data/train/words.txt', 'data/test/words.txt']
words = set()
for fname in filenames:
    with open(fname) as infile:
        # Strip newlines so 'AUGUST' and 'AUGUST\n' count as the same word
        words |= set(line.strip() for line in infile if line.strip())
# Sort the words
words = sorted(words)  # Now words is a list, not a set!
# Write the result to a file
with open('data/local/words.txt', 'w') as outfile:
    outfile.writelines(word + '\n' for word in words)
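The same idea can be wrapped in a reusable function that accepts any number of input files. This is only a sketch: merge_word_files is a hypothetical helper name, and the temporary files below stand in for the question's real paths.

```python
import os
import tempfile


def merge_word_files(paths, out_path):
    """Merge word-list files into one sorted file without duplicates.

    Hypothetical helper for illustration, not taken from the answers.
    """
    words = set()
    for path in paths:
        with open(path, encoding="utf-8") as infile:
            for line in infile:
                word = line.strip()
                if word:  # skip blank lines
                    words.add(word)
    ordered = sorted(words)
    with open(out_path, "w", encoding="utf-8") as outfile:
        for word in ordered:
            outfile.write(word + "\n")
    return ordered


# Usage with temporary files holding the question's example data.
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "file1.txt")
    second = os.path.join(tmp, "file2.txt")
    target = os.path.join(tmp, "merged.txt")
    with open(first, "w", encoding="utf-8") as f:
        f.write("A\nAND\nAPRIL\nAUGUST\n")
    with open(second, "w", encoding="utf-8") as f:
        f.write("A\nAND\nAPOSTROPHE\nAREA")  # no trailing newline on purpose
    merged_words = merge_word_files([first, second], target)
    with open(target, encoding="utf-8") as f:
        merged_text = f.read()

print(merged_words)
```

Stripping each line before adding it to the set means a file that lacks a final newline still merges cleanly with one that has it.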
Why does this add an extra newline at the top?
When I run it, it doesn't.
Mine isn't in alphabetical order?
Added sorting of the output.
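One plausible explanation for the extra blank line reported in the comments, assuming the input file ends with a newline: str.split("\n") then produces a trailing empty string, and that empty string sorts ahead of every word. The sample text below is made up for illustration.

```python
text = "A\nAND\n"  # file content that ends with a newline

with_split = text.split("\n")
with_splitlines = text.splitlines()

print(with_split)       # ['A', 'AND', '']
print(with_splitlines)  # ['A', 'AND']

# The empty string sorts before every word, so it is written first
# and shows up as a blank line at the top of the output.
joined = "\n".join(sorted(set(with_split)))
print(repr(joined))     # '\nA\nAND'
```

Whether the blank line appears therefore depends on how the input files end, which may be why the two commenters saw different results.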