Processing multiple files in Python

python regex file process

I have a question about files. How can I process 6 files with a single loop? The file names are:

alj.csv.1, alj.csv.2, alj.csv.3, alj.csv.4

If you are wondering why the extension is a number, that is how the converter names its output, but it doesn't matter. My code sample:

    def alj1():
        try:
            import glob
            from datetime import datetime
            import datetime
            from datetime import timedelta, date
            logging.basicConfig(level=logging.DEBUG, filename='{}'.format(error_log))
            csv1 = '/opt/transcode/data/epg/output/CSV/ALJAZEERA'
            try:
                # Regex match
                for name in os.listdir(csv1):
                    if name.startswith('alje'):
                        h1 = name
                        print h1

                        output_file = open('/opt/transcode/data/epg/output/CSV/ALJAZEERA/alj_pre.csv','w')
                        input_file  = open(r'/opt/transcode/data/epg/output/CSV/ALJAZEERA/','rb')
                        for line in input_file:
                            if re.search(r'^\".+?\",,,,,',line):
                                line_date = re.findall(r'\d{2}.\d{2}.\d{4}',line)[0]
                                #print line_date

                            if re.search(r'\d{2}:\d{2}:\d{2}.*', line):
                                line_all = re.findall(r'\d{2}:\d{2}:\d{2}.*',line)[0]
                                #print line_all
                                output_file.write(line_date+','+line_all+'\n')
                        output_file.close()

How can I process all the files whose extension is a number, as in the example above?

The loop now looks like this:

    csv1 = '/opt/transcode/data/epg/output/CSV/ALJAZEERA/alj.csv.[0-9]'
    files = glob.iglob(csv1)
    try:
        # Regex match
        for name in files:
            h1 = name
            print h1

            output_file = open('/opt/transcode/data/epg/output/CSV/ALJAZEERA/alj_pre.csv','w')
            with open(h1,'r') as file:
                for line in file:
                    if re.search(r'^\".+?\",,,,,',line):
                        line_date = re.findall(r'\d{2}.\d{2}.\d{4}',line)[0]
                        #print line_date
                    if re.search(r'\d{2}:\d{2}:\d{2}.*', line):
                        line_all = re.findall(r'\d{2}:\d{2}:\d{2}.*',line)[0]
                        #print line_all
                        output_file.write(line_date+','+line_all+'\n')
                output_file.close()
    except:
        logging.exception("Failed to process ALJAZEERA %s"% current_time)

You imported glob but never used it:

for filename in glob.iglob('alj.csv.[0-9]'):
    # Do something with `filename`
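
One thing to watch with that pattern: `[0-9]` matches exactly one digit, so a part named `alj.csv.10` would be skipped, and glob does not promise any particular order. Below is a minimal sketch of one way around both points, written for Python 3. The helper name `numbered_parts` and its `base` parameter are made up for illustration and are not part of the question's code.

```python
import glob
import os


def numbered_parts(directory, base='alj.csv'):
    """Return paths named <base>.<digits>, ordered by the numeric suffix."""
    # Match every suffix first, then keep only the purely numeric ones.
    # glob.escape() protects any special characters in the directory name.
    pattern = os.path.join(glob.escape(directory), base + '.*')
    candidates = glob.glob(pattern)
    numbered = [p for p in candidates if p.rsplit('.', 1)[1].isdecimal()]
    # Sort numerically so that .10 comes after .2, not between .1 and .2.
    return sorted(numbered, key=lambda p: int(p.rsplit('.', 1)[1]))
```

If the converter never produces more than nine parts, the plain `alj.csv.[0-9]` pattern wrapped in `sorted(...)` is probably enough.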

Also, move the logic into simpler functions. You will find the code easier to work with.
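
As a rough sketch of what "simpler functions" could look like, the line parsing can be pulled out into a generator that knows nothing about files or paths. The regular expressions are the ones from the question; the function name `extract_rows` is hypothetical, and the code is written for Python 3. Unlike the original, it skips time lines that appear before any date header rather than failing on an unset variable.

```python
import re

# Patterns copied from the question's code.
HEADER_RE = re.compile(r'^".+?",,,,,')
DATE_RE = re.compile(r'\d{2}.\d{2}.\d{4}')
TIME_RE = re.compile(r'\d{2}:\d{2}:\d{2}.*')


def extract_rows(lines):
    """Yield 'date,time-and-rest' strings for each time-stamped line."""
    current_date = None
    for line in lines:
        # A header line carries the date that applies to the lines after it.
        if HEADER_RE.search(line):
            found = DATE_RE.search(line)
            if found:
                current_date = found.group(0)
        timed = TIME_RE.search(line)
        # Ignore time lines that come before the first date header.
        if timed and current_date is not None:
            yield current_date + ',' + timed.group(0)
```

Because it takes any iterable of strings, it can be tried on a small list first, for example `list(extract_rows(['"Mon 01.02.2015",,,,,\n', '06:00:00,News\n']))`, where the sample lines are invented and only guess at the real file layout.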

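I changed the loop as shown in the question. But when I want to process all the files, this loop only gives me the first file in the directory.

@Fox_01: It will give you all the files. Either your file names are not in that format, or your code exits the interpreter.

Yes, it gives the names of all the files in the directory. But when I try to loop over all of them, it only loops over one of the files.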
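
A possible explanation, assuming the loop runs exactly as posted in the question: `alj_pre.csv` is reopened with mode `'w'` inside the loop, and `'w'` truncates the file each time, so only the rows from the last file visited survive. It can then look as if just one file was processed. Another possibility is that an exception in a later file (for instance `line_date` not being set yet) ends the loop early, because the `try`/`except` wraps the whole loop. The sketch below, for Python 3, opens the output once before the loop. The name `merge_time_lines` is hypothetical and the date handling is left out to keep the example short.

```python
import glob
import re

TIME_RE = re.compile(r'\d{2}:\d{2}:\d{2}.*')


def merge_time_lines(pattern, output_path):
    """Write the time-stamped lines of every matching file to one output file."""
    paths = sorted(glob.glob(pattern))
    # Open the output a single time, before the loop, so that later
    # input files add to it instead of truncating what was written.
    with open(output_path, 'w') as out:
        for path in paths:
            with open(path) as src:
                for line in src:
                    match = TIME_RE.search(line)
                    if match:
                        out.write(match.group(0) + '\n')
    return paths
```

Returning the list of paths makes it easy to log which files were actually read, which helps to tell the two explanations apart.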