Looping through 100 text files in Python

My Python code looks like this:

#Loading libraries
import re
import pandas as pd
import numpy as np
import datetime

#Creating an empty dataframe
columns = ['A']
df_ = pd.DataFrame(columns=columns)
df_ = df_.fillna(0)

#Reading the data line by line
with open('serverLogs.log-2020-04-30-01') as f:
    lines = f.readlines()
    #print(lines)
    for line in lines:
        parts  = line.split('OD_MAKER_DATE=') 
        df_ = df_.append(parts)
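I have many text files whose names end in a number that runs from 01 to 100, i.e. "serverLogs.log-2020-04-30-01", "serverLogs.log-2020-04-30-02" … "serverLogs.log-2020-04-30-100".

How can I add a for loop at the start of my existing code that loops over all 100 files and appends their lines to the dataframe df_, instead of loading one file at a time? I am not very familiar with Python.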

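I'm not sure this is the most efficient way to read files in a loop. As I understand it, though, the first 9 file numbers need a leading 0. This code may solve the problem of generating the required names: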

# Files are numbered 1 to 100; %02d zero-pads 1-9 to 01-09
for idx in range(1, 101):
  fname = "serverLogs.log-2020-04-30-%02d" % idx
  with open(fname) as f:
    ...

file_count = 100 # can change it to any value
base_name = 'serverLogs.log-2020-04-30-{}'

for i in range(file_count):
    file_name = base_name.format("%.2d" % (i+1))
Then you can read the data from each file inside the loop and append it the same way you do now:

#Reading the data line by line
with open(file_name) as f:
    lines = f.readlines()

    for line in lines:
        parts  = line.split('OD_MAKER_DATE=') 
        df_ = df_.append(parts)
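
The zero-padded names can also be built with an f-string format spec instead of the `%` operator. This is a minimal sketch, assuming the file prefix given in the question; the helper name `log_file_names` and its parameters are hypothetical and only serve the illustration:

```python
def log_file_names(prefix="serverLogs.log-2020-04-30-", first=1, last=100):
    """Return the expected log file names, numbered first..last inclusive."""
    # {n:02d} pads 1..9 to 01..09 and leaves 10..100 unchanged.
    return [f"{prefix}{n:02d}" for n in range(first, last + 1)]


names = log_file_names()
print(names[0], names[9], names[-1])
```

Keeping the name generation in one place makes it easy to check the first and last names before opening any file.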

You can use string formatting and loop over the numbers 1-100 to read all 100 files:

import re
import pandas as pd
import numpy as np
import datetime


columns = ['A']
df_ = pd.DataFrame(columns=columns)
df_ = df_.fillna(0)

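# Files are numbered 01 to 100, so start the range at 1
for i in range(1, 101):
    with open('serverLogs.log-2020-04-30-{}'.format("%.2d" % i)) as f:
        lines = f.readlines()
        #print(lines)

        # Process each file's lines inside the loop, not only the last file's
        for line in lines:
            parts  = line.split('OD_MAKER_DATE=')
            df_ = df_.append(parts)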

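Hi curlycharcoal, your code works fine for file numbers 10 to 100, but for file numbers 1 to 9 I get a file-not-found error because the leading zero is missing (i.e. 01, 02 … 09). How can I add the missing zeros?

@RianeRoseKinuthia Instead of %d, use %02d.

Cool. However, the first file name this produces will have the suffix 00. You can change it to the following code: for idx in range(100): fname = ("serverLogs.log-2020-04-30-%02d" % (idx+1))

@Anshul Jain, after I made the change the code now runs. No errors are raised.

When I tried this I got an error, because in your code the files start from 00 instead of 01. To get rid of the error I made a small change in the for loop, from "for i in range(101)" to "for i in range(1, 101)".

This can be used directly without handling the 1-9 case separately, since you need 01, 02, 03….

I get an error. The files start from 00, not from 01.
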
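
Several of the comments above report file-not-found errors caused by a wrong name. One way to catch that before the read loop starts is to list which expected names are absent. This is a rough sketch using the standard-library `pathlib`; the helper name `missing_files` and the `directory` parameter are hypothetical, and the files are assumed to sit in the current directory:

```python
from pathlib import Path


def missing_files(names, directory="."):
    """Return the entries of `names` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]


# Expected names 01..100, assuming the prefix from the question.
expected = [f"serverLogs.log-2020-04-30-{n:02d}" for n in range(1, 101)]
absent = missing_files(expected)
if absent:
    print(f"{len(absent)} file(s) missing, first: {absent[0]}")
```

A name ending in 00 or 101 would show up in `absent` right away instead of raising an error halfway through the loop.
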
#Loading libraries
import re
import pandas as pd
import numpy as np
import datetime

#Creating an empty dataframe
columns = ['A']
df_ = pd.DataFrame(columns=columns)
df_ = df_.fillna(0)

#Reading the data line by line
base_name = 'serverLogs.log-2020-04-30-{}'
for i in range(100):
    # Format from the unchanged template each time; i+1 gives 01 to 100
    file_name = base_name.format("%.2d" % (i+1))
    with open(file_name) as f:
        lines = f.readlines()
        #print(lines)
        for line in lines:
            parts  = line.split('OD_MAKER_DATE=')
            df_ = df_.append(parts)
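
The loops above call `df_.append` once per line. `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, and each call copies the whole frame. A possible alternative is to collect the pieces in a plain list and build the DataFrame once at the end. This is a sketch, not the original poster's method: the helper names `read_parts` and `parts_to_frame` are hypothetical, and storing every piece as one row of column `A` is an assumption about the intended layout.

```python
def read_parts(paths, sep="OD_MAKER_DATE="):
    """Split every line of every file on `sep` and return all pieces in order."""
    rows = []
    for path in paths:
        with open(path) as f:
            for line in f:
                # Pieces keep the trailing newline, as in the original loop.
                rows.extend(line.split(sep))
    return rows


def parts_to_frame(rows, column="A"):
    """Build the DataFrame once from the collected pieces."""
    import pandas as pd  # imported here so read_parts works without pandas

    return pd.DataFrame({column: rows})
```

With the 100 file names in a list called `paths`, `df_ = parts_to_frame(read_parts(paths))` would replace the per-line appends.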