Python data structure advice

python data-structures

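As part of my data structures course, my teacher gave me a bonus exercise that is fairly difficult and challenging. I have tried to work out which data structure I need to solve it, but I have no ideas. I also want to write the code for the exercise myself, to improve my Python skills.

About the exercise: 1. I have a text file containing logs like the following: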

M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system, and loading
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Open Connection”
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Close Connection, and reboot”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in

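There are two types of logs: M is a master log and S is a slave log. I need a data structure that can split each line and retrieve a specific column. For example, the M-1 columns are: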

M, 1, Datetime, Error Level, DeviceId, UserId, Message

But the S-1 columns are:

S, 1, Datetime, Error Level, DeviceId, Action, Message

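Note: as you can see, S, 1 has Action but no UserId.

Finally, I need to be able to enter on the command line which columns I want written to standard output, together with a condition (e.g. Error Level > 50).

What I thought of was a dictionary, but that way I would not be able to support an unlimited number of versions (if it is possible, please explain how).

Thanks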

Does this help:

logfileasstring = """
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system, and loading
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Open Connection”
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Close Connection, and reboot”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system”
S, 1, 14/08/2019 11:40, 6, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, User logged in, User username logged in"""
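# Split each line into at most 7 fields; maxsplit=6 keeps commas inside the message intact.
listoflist = [[v.strip() for v in r.split(",", maxsplit=6)]
              for r in logfileasstring.splitlines(keepends=False)
              if r]

# Group the rows by (type, version) and drop those two leading fields.
grouped = {("M", "1"): [], ("S", "1"): []}
for row in listoflist:
    datasets_for = grouped[row[0], row[1]]
    datasets_for.append(row[2:])


# Indices of the columns to print; must be set by the script
fields = [0, 1, 2]
for k in grouped:
    print(k, "::")
    for row in grouped[k]:
        print("  -", [row[f] for f in fields])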
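If versions such as M-2 or S-3 can show up later, the groups do not have to be declared up front. Here is a minimal sketch of the same grouping using `collections.defaultdict`. The `group_rows` helper and the `("M", "2")` sample line are made up for illustration and are not part of the exercise data.

```python
from collections import defaultdict

# Sample lines; the ("M", "2") record is invented purely for illustration.
sample = """\
M, 1, 14/08/2019 11:40, 100, xxxx, username, Close Connection, and reboot
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
M, 2, 14/08/2019 11:42, 7, xxxx, username, Some message
"""


def group_rows(text, n_cols=7):
    """Group parsed rows by their (type, version) prefix."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        # maxsplit keeps commas inside the trailing message intact.
        parts = [p.strip() for p in line.split(",", maxsplit=n_cols - 1)]
        # A new (type, version) key gets an empty list automatically.
        grouped[parts[0], parts[1]].append(parts[2:])
    return grouped


grouped = group_rows(sample)
for key, rows in grouped.items():
    print(key, len(rows))
```

This assumes every version has seven columns. If the column count differs per version, it would have to be looked up per key instead of being passed once.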

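I would probably use the namedtuple class from the collections package to hold each parsed item, because it lets you access each field both by index number and by name. In addition, new namedtuple classes can be created dynamically fairly easily by passing a list of column names.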

from collections import namedtuple

Master = namedtuple('Master', ['Type', 'N', 'Datetime', 'ErrorLevel', 'DeviceId', 'UserName', 'Message'])
Slave = namedtuple('Slave', ['Type', 'N', 'Datetime', 'ErrorLevel', 'DeviceId', 'Action', 'Message'])

n_cols = 7

logfileasstring = """
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system, and loading
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Open Connection”
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Close Connection, and reboot”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in"""


master_list = []
slave_list = []

for r in logfileasstring.splitlines(False):
    if not r:
        continue
    values = [value.strip() for value in r.split(',', n_cols - 1)]
    if r[0] == 'M':
        master_list.append(Master(*values))
    else:
        slave_list.append(Slave(*values))


print(master_list[0][6]) # by index
print(master_list[0].Message) # by column name if name known in advance
column_name = 'Message'
print(getattr(master_list[0], column_name)) # by column name if name not known in advance
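To illustrate the point about creating namedtuple classes dynamically, here is a rough sketch that builds one class per (type, version) from a table of column names. `SCHEMAS`, `RECORD_TYPES` and `parse_line` are hypothetical names, and the column names without spaces are an assumption, since namedtuple fields must be valid identifiers. Supporting another version would then mean adding one more entry to `SCHEMAS`.

```python
from collections import namedtuple

# Column layouts keyed by (type, version). The M-1 and S-1 layouts follow the
# question; a further version would be one more entry in this dictionary.
SCHEMAS = {
    ("M", "1"): ["Type", "N", "Datetime", "ErrorLevel", "DeviceId", "UserId", "Message"],
    ("S", "1"): ["Type", "N", "Datetime", "ErrorLevel", "DeviceId", "Action", "Message"],
}

# Build one namedtuple class per layout, named e.g. "M1" and "S1".
RECORD_TYPES = {
    key: namedtuple(key[0] + key[1], columns) for key, columns in SCHEMAS.items()
}


def parse_line(line):
    """Turn one log line into the namedtuple matching its (type, version)."""
    # Peek at the first two fields to find the right layout.
    kind, version, _ = [p.strip() for p in line.split(",", 2)]
    record_type = RECORD_TYPES[kind, version]
    n_cols = len(record_type._fields)
    # Split again with the layout's column count so the message stays whole.
    values = [p.strip() for p in line.split(",", n_cols - 1)]
    return record_type(*values)


record = parse_line("S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in")
print(record.Action)
# Select columns by name when the names are only known at run time.
print([getattr(record, name) for name in ("Datetime", "ErrorLevel")])
```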

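Create two different classes, one for Master and one for Slave, and store the instances as lists in a dictionary, keyed like this:

{'Master': [list of Master instances...], 'Slave': [list of Slave instances...]}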
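One possible reading of this suggestion, sketched with `dataclasses`. The class names, field names and the `load` helper are assumptions made for illustration, and converting the error level to `int` is my own choice so that numeric comparisons work.

```python
from dataclasses import dataclass


@dataclass
class MasterRecord:
    version: str
    datetime: str
    error_level: int
    device_id: str
    user_id: str
    message: str


@dataclass
class SlaveRecord:
    version: str
    datetime: str
    error_level: int
    device_id: str
    action: str
    message: str


# Maps the leading type letter to a dictionary key and a record class.
RECORD_CLASSES = {"M": ("Master", MasterRecord), "S": ("Slave", SlaveRecord)}


def load(lines):
    """Parse log lines into {'Master': [...], 'Slave': [...]}."""
    logs = {"Master": [], "Slave": []}
    for line in lines:
        if not line.strip():
            continue
        parts = [p.strip() for p in line.split(",", 6)]
        key, record_class = RECORD_CLASSES[parts[0]]
        version, when, level, device, who_or_what, message = parts[1:]
        logs[key].append(
            record_class(version, when, int(level), device, who_or_what, message)
        )
    return logs


lines = [
    "M, 1, 14/08/2019 11:40, 100, xxxx, username, Open Connection",
    "S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems",
]
logs = load(lines)
# Example filter: master records with an error level above 50.
high = [r for r in logs["Master"] if r.error_level > 50]
print(high)
```

As the comments below point out, this approach needs a class declared for every layout, so it fits a small, fixed set of versions better than an open-ended one.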

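Would a simple list of tuples (one tuple per line) be enough? Or, if you want something more elaborate, a list of dictionaries?

@Nitin I get you, but that way I have to declare every version. I want it to support unlimited versions, so how would I do that?

What do you mean by "support unlimited versions"? What versions?

@Goyo Versions such as M-2, M-3 and so on, and the same for S.

@jgsedi That's an idea, but I can't use it, because I need a way to print only selected columns, and for that I have to declare the columns beforehand.

Yes, but there has to be an indicator for the columns, or at least for the number of columns, to detect which columns to print. If you want to do this in a script (CLI), like: script --cols 0,3,4 logfile then only the last part of the code has to change, to select the appropriate fields of each row. I changed my code above.

@jgsedi Yes, I follow you, based on your result. You access the columns by index in the array; I want to access them by name rather than by index, i.e. the input for the columns filter would be UserId.

Also, when you split on "," won't you inadvertently split "Close Connection, and reboot"?

No. I prevent the last entry from being split by using split(",", maxsplit=6), so the last split happens just before the log text.

@RolandAaronson I really appreciate your explanation! Thank you!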
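For the command-line part, here is a rough sketch that selects columns by name and applies a simple numeric condition. The `--where` option, the `ErrorLevel` spelling, and the `SCHEMAS`, `parse_condition` and `select` names are all assumptions for illustration. Conditions are limited to comparing one column against an integer.

```python
import argparse
import operator
import re

# Column layouts keyed by (type, version); add an entry per new version.
SCHEMAS = {
    ("M", "1"): ["Type", "N", "Datetime", "ErrorLevel", "DeviceId", "UserId", "Message"],
    ("S", "1"): ["Type", "N", "Datetime", "ErrorLevel", "DeviceId", "Action", "Message"],
}
OPERATORS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
             "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def parse_line(line):
    """Return one log line as a {column name: value} dictionary."""
    kind, version, _ = [p.strip() for p in line.split(",", 2)]
    columns = SCHEMAS[kind, version]
    values = [p.strip() for p in line.split(",", len(columns) - 1)]
    return dict(zip(columns, values))


def parse_condition(text):
    """Turn a string such as 'ErrorLevel>50' into a row predicate."""
    match = re.fullmatch(r"\s*(\w+)\s*(>=|<=|==|!=|>|<)\s*(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unsupported condition: {text!r}")
    column, op, number = match.groups()
    compare, threshold = OPERATORS[op], int(number)
    # Rows that lack the column never match; the column must hold integers.
    return lambda row: column in row and compare(int(row[column]), threshold)


def select(lines, cols, condition=None):
    """Yield the requested columns of every row that passes the condition."""
    for line in lines:
        if not line.strip():
            continue
        row = parse_line(line)
        if condition is None or condition(row):
            # A column missing from this layout is printed as an empty string.
            yield [row.get(c, "") for c in cols]


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("logfile")
    parser.add_argument("--cols", required=True, help="comma-separated column names")
    parser.add_argument("--where", help="e.g. 'ErrorLevel>50'")
    args = parser.parse_args(argv)
    condition = parse_condition(args.where) if args.where else None
    with open(args.logfile, encoding="utf-8") as handle:
        for fields in select(handle, args.cols.split(","), condition):
            print(", ".join(fields))


# Quick demonstration without a file:
demo = [
    "M, 1, 14/08/2019 11:40, 100, xxxx, username, Open Connection",
    "S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in",
]
result = list(select(demo, ["Datetime", "UserId"], parse_condition("ErrorLevel>50")))
print(result)
```

To use it as a script, call `main()` at the bottom of the file under the usual `if __name__ == "__main__":` guard and run something like `script --cols Datetime,UserId --where "ErrorLevel>50" logfile`.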