What is the simplest way to merge two txt files in Python?

My current situation is that I want to merge two .txt files into one file. Each .txt file is a list of words.

Example .txt files:

File 1:

A
AND
APRIL
AUGUST

File 2:

A
AND
APOSTROPHE
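AREA

I want to merge these files into a single file that contains only one entry for each word that occurs.

The resulting file should look like this: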

A
AND
APOSTROPHE
APRIL
AREA
AUGUST

I ran into this problem when I tried to combine the files by simply appending one to the other, which keeps the duplicate words:

filenames = ['data/train/words.txt', 'data/test/words.txt']
with open('data/local/words.txt', 'w') as outfile:
    for fname in filenames:
        with open(fname) as infile:
            outfile.write(infile.read())

How can this be done easily?

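I would use sets, because they do not allow duplicates. The | operator is the set union operator, which combines two sets. Sets are unordered, so at the end the result has to be turned back into a list and sorted (sorted() does both in one step).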

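# Read both word lists; the with-block closes the files automatically
with open("file1.txt") as file1, open("file2.txt") as file2:
    # splitlines() avoids the empty trailing entry that split("\n")
    # produces when a file ends with a newline
    words = set(file1.read().splitlines())  # Create a set
    words = words | set(file2.read().splitlines())  # Combine with other word list

# sorted() returns a sorted list of the set's elements
with open("fileOUT.txt", "w") as out:
    out.write("\n".join(sorted(words)))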
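
As a quick check of the union-and-sort idea, here is a minimal sketch that uses the sample words from the question as in-memory sets instead of files. The variable names are illustrative only and do not come from the answer.

```python
# Sample word lists from the question, held in memory (illustrative names)
words1 = {"A", "AND", "APRIL", "AUGUST"}
words2 = {"A", "AND", "APOSTROPHE", "AREA"}

# | builds the union, so each word appears only once
merged = words1 | words2

# sorted() accepts any iterable and returns a new sorted list
result = sorted(merged)
print(result)
# prints ['A', 'AND', 'APOSTROPHE', 'APRIL', 'AREA', 'AUGUST']
```

The printed list matches the desired output file from the question, one word per entry.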

Read both files into sets and write back the union of the two:

def read_file(fname):
    with open(fname) as fobj:
        return set(entry.strip() for entry in fobj)

data1 = read_file('myfile1.txt')
data2 = read_file('myfile2.txt')

merged = data1.union(data2) 

with open('merged.txt', 'w') as fout:
    for word in sorted(merged):
        fout.write('{}\n'.format(word))

Contents of merged.txt:

A
AND
APOSTROPHE
APRIL
AREA
AUGUST

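Read all the words into a set (which removes duplicates automatically), then write that set to the output file. Because sets are unordered, we need to sort the contents ourselves before writing them to the file.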

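# Add all words from the files
filenames = ['data/train/words.txt', 'data/test/words.txt']
words = set()
for fname in filenames:
    with open(fname) as infile:
        # Strip line endings so a word on a final line without a trailing
        # newline still matches the same word elsewhere; skip blank lines
        words |= {line.strip() for line in infile if line.strip()}

# Sort the words
words = sorted(words)  # Now words is a list, not a set!

# Write the result to a file, one word per line
with open('data/local/words.txt', 'w') as outfile:
    outfile.writelines(word + '\n' for word in words)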

Why does this add an extra newline at the top?
When I run it, it doesn't.
Mine isn't in alphabetical order?
Added sorting of the output.
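
On the extra blank line asked about above: one plausible cause (an assumption, since the commenter's input file is not shown) is a file that ends with a newline being split with split("\n"). That leaves an empty string in the set, and the empty string sorts before every word. A minimal sketch with made-up in-memory text:

```python
# Hypothetical file contents that end with a newline
text = "A\nAND\nAPRIL\nAUGUST\n"

with_split = text.split("\n")        # keeps a trailing empty string
with_splitlines = text.splitlines()  # no trailing empty string

print(sorted(set(with_split)))
print(sorted(set(with_splitlines)))
```

Joining the first result with "\n" produces output that starts with an empty line, while the second does not. A file without a final newline would not show the difference, which may be why the behavior could not be reproduced by everyone.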