Python: read multiple source txt files and copy lines that match a condition into one output file

python python-2.7

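My goal is to read multiple (small) txt source files from a folder and copy the lines selected by a condition into a single output txt file. I can do this with one source file, but when I try to read multiple files and do the same thing, I get no output (the output file is empty).

Based on my research on SO, I wrote the following code (which produces no output):

Thank you for your help.

@abe and @ppperry: I would particularly like to thank you for your earlier input.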

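Problems with your code:

1. You have two duplicate variables, files and list_of_files, but only use the latter.
2. Every time you open a file you overwrite the variable data_list, which discards the contents of the files read before it.
3. When searching a file for matching lines, you use the variable fileName instead of data_list.

What could be simplified:

4. Using the re module just to determine whether one string starts with another is overkill. You can use line.startswith(letter) instead.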
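To illustrate the last point, here is a minimal sketch comparing the two approaches. The sample lines are made up for the illustration; only the criteria values come from the question. It relies on the fact that str.startswith accepts a tuple of prefixes.

```python
import re

criteria = ('AB', 'CD', 'EF')  # prefixes taken from the question

# Hypothetical sample lines, invented for this illustration only.
lines = ['AB first\n', 'xxCD middle\n', 'EF last\n', 'ZZ none\n']

# re.search looks for the pattern anywhere in the line,
# so 'xxCD middle' is matched even though it does not start with 'CD'.
found_anywhere = [l for l in lines if any(re.search(c, l) for c in criteria)]

# str.startswith with a tuple matches only at the beginning of the line
# and checks all prefixes in a single call.
found_at_start = [l for l in lines if l.startswith(criteria)]

print(found_anywhere)
print(found_at_start)
```

If "starts with" is really what is wanted, the second form is both simpler and stricter; re.match would also anchor at the start, but it is not needed for fixed prefixes.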

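Errors:

Line 14 should look for lines in data_list, not in fileName.

"I can do this with a single source file, but when I try to read multiple files and do the same, I get no output (empty)." Lines 14 through 17 should be indented; otherwise they run only once, after the for loop over list_of_files has finished, instead of once for every file.

You never even use lines 4 and 5, so why include them? They have no effect.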
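The effect of the indentation can be shown without touching the file system. The sketch below is an illustration under assumptions: the dictionary named sources stands in for the files on disk, and its file names and contents are invented.

```python
criteria = ('AB', 'CD', 'EF')

# Hypothetical stand-in for the source files: file name -> list of lines.
sources = {
    'Output1.txt': ['AB one\n', 'ZZ skip\n'],
    'Output2.txt': ['CD two\n'],
}

# Search placed AFTER the loop: data_list only holds whatever the
# final iteration left behind, so earlier files are never examined.
for name in sorted(sources):
    data_list = sources[name]
matches_outside = [line for line in data_list if line.startswith(criteria)]

# Search placed INSIDE the loop: it runs once per file.
matches_inside = []
for name in sorted(sources):
    data_list = sources[name]
    for line in data_list:
        if line.startswith(criteria):
            matches_inside.append(line)

print(matches_outside)
print(matches_inside)
```

Only the nested version sees the matching line from Output1.txt.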

Here is the fixed code, with comments:

import glob

#path = r'C:\Doc\version 1\Output*.txt'   # read all source files with this name format
#files = glob.glob(path)

criteria = ('AB', 'CD', 'EF')   # select lines that start with any of these prefixes

list_of_files = glob.glob('./Output*.txt')

with open("P_out.txt", "a") as f_out:  # use "a" so you keep the data already in P_out.txt
    for fileName in list_of_files:
        with open(fileName, "r") as f_in:  # close each source file once it has been read
            data_list = f_in.readlines()
        # Keeping the search indented inside this loop lets it run for every file.
        for line in data_list:  # search data_list, not fileName
            if line.startswith(criteria):  # str.startswith accepts a tuple of prefixes
                # Normalize the line ending so the text does not get concatenated
                # when moving from file to file, without adding blank lines.
                f_out.write(line.rstrip('\n') + '\n')

# Really? I promise 'with' will not lie to you.
#f_out.close()  # not needed: the 'with' block already closes the file

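To whoever downvoted: why not leave a comment to justify your judgment? Everything the OP asked for has been addressed. If you think the answer can be improved further, please suggest an edit. Thanks.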

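I don't think 4 is a problem, but as you said, it is "overkill."

Read the answer, @ppperry.

My output is still empty. Could the problem be that the lines only show up in Notepad++, whereas in Notepad there are no lines, just one long sequence of characters?

Did you copy and paste my code? When I run it on my machine, I get the correct output. Are you sure your Output*.txt files contain lines that start with AB, CD, or EF?

Yes, the lines in the source files start with [AB, CD, EF], but there is no output. As I mentioned in my previous comment, my source files have lines in Notepad++, but no lines in Notepad.

Don't use Notepad for programming. It is terrible with encodings. Use anything but Notepad.

@user1739581 I take it you have seen that my solution works. Great.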
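One possible explanation for the Notepad versus Notepad++ difference, which the thread does not confirm, is that the source files use a line ending other than the Windows \r\n. The sketch below assumes a file that uses bare \r as its line separator; the sample text is invented. It shows that splitting on '\n' alone leaves the whole file as one chunk, while str.splitlines recognizes \r, \n and \r\n.

```python
criteria = ('AB', 'CD', 'EF')

# Hypothetical file content that uses a bare carriage return as line ending.
raw_text = 'AB one\rZZ skip\rCD two\r'

# Splitting on '\n' only finds no line breaks at all.
naive_lines = raw_text.split('\n')

# splitlines() treats \r, \n and \r\n as line boundaries
# and drops the line ending from each piece.
robust_lines = raw_text.splitlines()

matches = [line for line in robust_lines if line.startswith(criteria)]

print(len(naive_lines))
print(robust_lines)
print(matches)
```

In Python 2.7, opening the file with mode 'rU' enables universal newlines and has a similar effect when iterating over lines; in Python 3, text mode does this by default.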
