Merging files in a directory with Python

python file merge

I have thousands of files in a directory whose names follow the pattern YYYYMMDDHHMM:

201801010000.txt
201801010001.txt
201801010002.txt

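I only want to keep hourly resolution, so for every hour of every day I need to merge 60 files into one. I don't know how to search the file names to get the 60 files I want. This is what I wrote: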

def concat_files(path):
    file_list = os.listdir(path)
    with open(datetime.datetime.now(), "w") as outfile:
        for filename in sorted(file_list):
            with open(filename, "r") as infile:
                outfile.write(infile.read())

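How should I name the output files so that the date is kept? Right now I use datetime, but that overrides the current file name. With my code, all the files get merged into a single file, whereas each group of 60 should be merged into a different file.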

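You can use glob to get only the files you need. It lets you pass in a pattern to match when searching for files. In the last line below, it will only find files that start with '2018010100', are followed by exactly two characters, and end with '.txt'.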

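import os
from glob import glob

def concat_files(dir_path, file_pattern, out_path):
    # Collect only the files whose names match the pattern.
    file_list = glob(os.path.join(dir_path, file_pattern))
    # out_path names the merged file explicitly; open() needs a path, not a datetime object.
    with open(out_path, "w") as outfile:
        for filename in sorted(file_list):
            with open(filename, "r") as infile:
                outfile.write(infile.read())

# Merge the minute files of 2018-01-01, hour 00, into one hourly file.
concat_files('C:/path/to/directory', '2018010100??.txt', '2018010100.txt')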
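
To cover every hour rather than a single hard-coded one, the same glob idea can be applied once per hour prefix. The following is only a sketch under some assumptions: the function name merge_all_hours and the out_dir parameter are made up for illustration, and every minute file is assumed to be named with exactly 12 characters before '.txt'.

```python
import os
from glob import glob


def merge_all_hours(dir_path, out_dir):
    """Merge minute files named YYYYMMDDHHMM.txt into hourly files YYYYMMDDHH.txt."""
    os.makedirs(out_dir, exist_ok=True)
    # Twelve single-character wildcards match the 12-character minute timestamps.
    minute_files = glob(os.path.join(dir_path, "????????????.txt"))
    # The first 10 characters of each name (YYYYMMDDHH) identify the hour.
    hours = sorted({os.path.basename(f)[:10] for f in minute_files})
    for hour in hours:
        out_path = os.path.join(out_dir, hour + ".txt")
        with open(out_path, "w") as outfile:
            # One glob per hour: the hour prefix followed by two minute characters.
            for filename in sorted(glob(os.path.join(dir_path, hour + "??.txt"))):
                with open(filename, "r") as infile:
                    outfile.write(infile.read())
    return hours
```

Writing the hourly files into a separate out_dir keeps them from being mixed up with the minute files on a later run, and opening them with "w" means a rerun replaces the output instead of appending to it.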

You weren't far off; you just need to swap your logic around:

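import os

# path is the directory containing the minute files, as in the question.
file_list = os.listdir(path)
for filename in sorted(file_list):
    # Drop the trailing 'MM.txt' (6 characters) to get the hourly name YYYYMMDDHH.
    out_filename = filename[:-6] + '.txt'
    # Append mode: every minute file of the same hour ends up in the same output file.
    with open(out_filename, 'a') as outfile:
        with open(os.path.join(path, filename), 'r') as infile:
            outfile.write(infile.read())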
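
To see what the slice does to the names, here is a small standalone check. The three file names are sample values chosen for illustration, not taken from a real directory.

```python
names = ["201801010000.txt", "201801010001.txt", "201801010100.txt"]

# name[:-6] removes the last six characters: two minute digits plus ".txt".
hourly_names = [name[:-6] + ".txt" for name in names]
print(hourly_names)
# ['2018010100.txt', '2018010100.txt', '2018010101.txt']
```

The first two names map to the same hourly file, which is why the output is opened in append mode. Note that with 'a', running the loop a second time appends the same content again, so the hourly files should be removed before a rerun.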

Try this:

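import os

# path is the directory containing the minute files, as in the question.
file_list = os.listdir(path)
# Each distinct prefix (name minus the trailing 'MM.txt') is one hour: YYYYMMDDHH.
for f in {f[:-6] for f in file_list}:
    if not f:
        continue  # name too short to contain an hour prefix
    with open(f + '.txt', 'a') as outfile:
        # Append this hour's minute files in sorted (chronological) order.
        for file in sorted(s for s in file_list if s.startswith(f)):
            with open(os.path.join(path, file), 'r') as infile:
                outfile.write(infile.read())
            # os.remove(os.path.join(path, file))  # optional: delete the minute file once merged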

If the file names are already in YYYYMMDDHHMM format, can't you just drop the last two characters before the .txt extension? I think a combination of groupby and datetime.strptime would solve this easily. Can you elaborate on the input and output?

Welcome to Stack Overflow! While answers are appreciated, it is also necessary to explain what your code does as a solution. Please add the relevant explanation to your answer.
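
The comment's suggestion of grouping the names and parsing them as timestamps was not spelled out in the thread. One possible reading, using itertools.groupby and datetime.strptime, is sketched below. The helper names hour_of and merge_by_hour and the separate out_dir are assumptions for illustration, and the sketch assumes every '.txt' file in src_dir is named YYYYMMDDHHMM.txt (any other name would make strptime raise ValueError).

```python
import os
from datetime import datetime
from itertools import groupby


def hour_of(filename):
    """Parse a name like 201801010059.txt and return its timestamp truncated to the hour."""
    stamp = datetime.strptime(os.path.splitext(filename)[0], "%Y%m%d%H%M")
    return stamp.replace(minute=0)


def merge_by_hour(src_dir, out_dir):
    """Write one YYYYMMDDHH.txt file into out_dir for each hour found in src_dir."""
    os.makedirs(out_dir, exist_ok=True)
    # groupby only groups adjacent items, so the names must be sorted first;
    # for fixed-width timestamps, lexicographic order is chronological order.
    names = sorted(n for n in os.listdir(src_dir) if n.endswith(".txt"))
    written = []
    for hour, group in groupby(names, key=hour_of):
        out_name = hour.strftime("%Y%m%d%H") + ".txt"
        with open(os.path.join(out_dir, out_name), "w") as outfile:
            for name in group:
                with open(os.path.join(src_dir, name), "r") as infile:
                    outfile.write(infile.read())
        written.append(out_name)
    return written
```

Compared with slicing the name, parsing with strptime also validates that each name is a real date and time, at the cost of failing loudly on anything that does not fit the format.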