Python: grouping strings from a file into a dictionary

python python-3.x dictionary

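I am trying to group the strings in a list into a dictionary. I read in a file to get a list of strings, and I want to take that list and group all the items by id.

This is what the file (logtest.txt) contains: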

Id: 1

FATAL ERROR: Network error: Connection timed out
Done

Return Code: 0

Id: 2

FATAL ERROR: Network error: Connection timed out
Done

Return Code: 0

Id: 3

FATAL ERROR: Network error: Connection timed out
Done

Return Code: 0

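So far I have read all the lines of the file into a list. Next, I want to group those strings into a dictionary by id number, where the key is the id number and the value is every string from "id:1" up to the next string that contains "id:".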

def getAllTheLinesInLogFile():
    f = open('logtest.txt', 'r')
    return f.readlines()

def getDictOfItems(allLinesInFile):
    dict = {}
    # ???
    # items = allLinesInFile.groupby()
    for item in items:
        print("{0}".format(item))
    return dict

logFile = open('logtest.txt', 'w+')

allLinesInLogFile = getAllTheLinesInLogFile()
dictOfItems = getDictOfItems(allLinesInLogFile)
for item in dictOfItems:
    print(item.key)

You can use itertools.groupby to group the sections delimited by "Id:" lines:

from itertools import groupby
with open("in.txt") as f:
    d = {}
    groups = groupby(f, lambda x: x.startswith("Id:"))
    for k, v in groups:
        if k: # if we have a line with "Id:.."
            # use the line as the key
            k = next(v).rstrip() 
            # call next on the grouper object extracting 
            # the second item which is our section of lines
            d[k] = list(map(str.rstrip, next(groups)[1]))

Input:

Id: 1
FATAL ERROR: Network error: Connection timed out
Done
Return Code: 0
Id: 2
FATAL ERROR: Network error: Connection timed out
Done
Return Code: 0
Id: 3
FATAL ERROR: Network error: Connection timed out
Done
Return Code: 0

Output:

  from pprint import pprint as pp
  pp(d)
  {'Id: 1': ['FATAL ERROR: Network error: Connection timed out',
       'Done',
       'Return Code: 0'],
 'Id: 2': ['FATAL ERROR: Network error: Connection timed out',
       'Done',
       'Return Code: 0'],
 'Id: 3': ['FATAL ERROR: Network error: Connection timed out',
       'Done',
       'Return Code: 0']}

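If your data really does contain multiple blank lines, the code will still work; the blank lines simply end up in each section's list, and you can filter them out if you don't need them. If you want to keep the newline characters, just remove the str.rstrip call.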
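As a rough sketch of the blank-line filtering mentioned above, the same groupby idea can be wrapped in a function that works on any iterable of lines, including the list returned by readlines(). The name group_by_id and the keep_blank parameter are made up for this illustration and are not part of the original answer.

```python
from itertools import groupby


def group_by_id(lines, keep_blank=False):
    """Hypothetical helper: map each "Id: N" line to the lines that follow it."""
    sections = {}
    key = None
    for is_header, group in groupby(lines, key=lambda s: s.startswith("Id:")):
        if is_header:
            # A run of adjacent "Id:" lines yields one (empty) entry per header.
            for header in group:
                key = header.rstrip()
                sections[key] = []
        elif key is not None:
            # Strip trailing whitespace, then optionally drop empty lines.
            stripped = (s.rstrip() for s in group)
            sections[key].extend(s for s in stripped if keep_blank or s)
    return sections


sample = [
    "Id: 1\n", "\n",
    "FATAL ERROR: Network error: Connection timed out\n", "Done\n", "\n",
    "Return Code: 0\n", "\n",
    "Id: 2\n", "Done\n",
]
print(group_by_id(sample))
```

Lines that appear before the first "Id:" line are ignored in this sketch, which may or may not be what you want for your log format.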

If you plan to overwrite the file after doing some work on it, writing to a tempfile as you go is probably a better approach.
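One possible shape for the tempfile approach is sketched below. Note that opening a file with mode 'w+' truncates it immediately, so the original contents are gone before anything is read; writing to a temporary file and swapping it in at the end avoids that. The function name rewrite_file and its transform argument are assumptions made for this example only.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Hypothetical helper: stream `path` through `transform`, then replace it."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so os.replace stays on one filesystem.
    tmp = tempfile.NamedTemporaryFile("w", dir=directory, delete=False)
    try:
        with tmp, open(path) as src:
            for line in src:
                tmp.write(transform(line))
        # Both files are closed here; swap the new contents into place.
        os.replace(tmp.name, path)
    except BaseException:
        # Clean up the partial temp file and leave the original untouched.
        os.unlink(tmp.name)
        raise
```

For example, rewrite_file('logtest.txt', str.upper) would upper-case every line, and the original file is only replaced once all lines have been written successfully.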

I'm not entirely clear on what you're asking for, but this might help:

with open('logtest.txt', 'r') as logFile:
    id_ = None
    dictOfItems = {}
    lines = []

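    for line in logFile:
        if line.startswith("Id: "):
            # A new section starts: store the previous one, if any
            if id_ is not None:
                dictOfItems[id_] = lines
            # Start a fresh list (this also drops lines seen before the first Id)
            lines = []
            id_ = int(line[4:])
        else:
            lines.append(line)

    # Store the final section, which has no following "Id: " line to trigger it
    if id_ is not None:
        dictOfItems[id_] = lines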

for key, item in dictOfItems.items():
    print(key, item)