Python: grouping strings from a file into a dictionary

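I am trying to group the strings in a list into a dictionary. I read in a file to get a list of strings, and I want to take that list and group all of the items by id.

This is what the file (logtest.txt) contains: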

Id: 1

FATAL ERROR: Network error: Connection timed out
Done

Return Code: 0

Id: 2

FATAL ERROR: Network error: Connection timed out
Done

Return Code: 0

Id: 3

FATAL ERROR: Network error: Connection timed out
Done

Return Code: 0
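So far I have read all of the lines in the file into a list. Next I want to group these strings into a dictionary by id number, where the key is the id number and the value is every string from the Id: 1 line up to the next string that contains Id: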

def getAllTheLinesInLogFile():
    f = open('logtest.txt', 'r')
    return f.readlines()

def getDictOfItems(allLinesInFile):
    dict = {}
    # ???
    # items = allLinesInFile.groupby()
    for item in items:
        print("{0}".format(item))
    return dict

logFile = open('logtest.txt', 'w+')

allLinesInLogFile = getAllTheLinesInLogFile()
dictOfItems = getDictOfItems(allLinesInLogFile)
for item in dictOfItems:
    print(item.key)

You can use itertools.groupby to group the sections delimited by the Id: lines:

from itertools import groupby
with open("in.txt") as f:
    d = {}
    groups = groupby(f, lambda x: x.startswith("Id:"))
    for k, v in groups:
        if k: # if we have a line with "Id:.."
            # use the line as the key
            k = next(v).rstrip() 
            # call next on the grouper object extracting 
            # the second item which is our section of lines
            d[k] = list(map(str.rstrip, next(groups)[1]))
Input:

Id: 1
FATAL ERROR: Network error: Connection timed out
Done
Return Code: 0
Id: 2
FATAL ERROR: Network error: Connection timed out
Done
Return Code: 0
Id: 3
FATAL ERROR: Network error: Connection timed out
Done
Return Code: 0
Output:

from pprint import pprint as pp
pp(d)
{'Id: 1': ['FATAL ERROR: Network error: Connection timed out',
           'Done',
           'Return Code: 0'],
 'Id: 2': ['FATAL ERROR: Network error: Connection timed out',
           'Done',
           'Return Code: 0'],
 'Id: 3': ['FATAL ERROR: Network error: Connection timed out',
           'Done',
           'Return Code: 0']}
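If your data really does contain blank lines between entries, the code still works: the blank lines simply end up in each section as empty strings, and you can filter them out if you don't need them. If you want to keep the trailing newlines, just remove the str.rstrip call.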
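The blank-line filtering mentioned above could look roughly like the sketch below. The function name group_sections is made up for this illustration, and it takes any iterable of lines (an open file works too). Unlike the snippet above, it does not call next(groups), so a trailing Id: line with nothing after it just gets an empty list.

```python
from itertools import groupby


def group_sections(lines):
    """Map each 'Id: N' header line to the non-blank lines that follow it.

    `group_sections` is a hypothetical helper name used for illustration only.
    """
    sections = {}
    key = None
    for is_header, chunk in groupby(lines, lambda line: line.startswith("Id:")):
        if is_header:
            # Consecutive header lines land in one group; each gets an entry,
            # and the last one becomes the key for the lines that follow.
            for header in chunk:
                key = header.rstrip()
                sections[key] = []
        elif key is not None:
            # Strip trailing whitespace, then drop lines that are now empty.
            stripped = (line.rstrip() for line in chunk)
            sections[key] = [line for line in stripped if line]
    return sections


sample = [
    "Id: 1\n", "\n",
    "FATAL ERROR: Network error: Connection timed out\n", "Done\n", "\n",
    "Return Code: 0\n", "\n",
    "Id: 2\n", "\n", "Done\n",
]
result = group_sections(sample)
print(result)
```

Lines that appear before the first Id: header are ignored in this sketch; whether that is the right behaviour depends on your log format.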


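If you plan to overwrite the file once you have done some work on it, writing to a tempfile as you go may be a better approach. Note that opening logtest.txt with 'w+', as in the question, truncates the file before anything has been read from it.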
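A minimal sketch of that tempfile idea, assuming you want to transform the file line by line and then replace the original. The names rewrite_file and transform are invented for this example; they do not come from the question.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Write transformed lines to a temp file, then swap it over `path`.

    `rewrite_file` and `transform` are hypothetical names for illustration.
    """
    # Create the temp file in the same directory so the final swap stays on
    # one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "r") as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as tmp:
        for line in src:
            tmp.write(transform(line))
    # Both files are closed here; the original is only replaced once the
    # new content has been written out completely.
    os.replace(tmp.name, path)
```

For example, rewrite_file('logtest.txt', str.upper) would upper-case every line, and the original file is left untouched if an exception is raised while reading (the partially written temp file is left behind in that case and would need cleaning up).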

I'm not entirely clear on what you are asking for, but this might help:

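with open('logtest.txt', 'r') as logFile:
    id_ = None
    dictOfItems = {}
    lines = []

    for line in logFile:
        if line.startswith("Id: "):
            # A new section starts: store the lines collected for the previous id
            if id_ is not None:
                dictOfItems[id_] = lines
            lines = []
            id_ = int(line[4:])
        else:
            lines.append(line)

    # Store the final section; no later "Id: " line will do it for us
    if id_ is not None:
        dictOfItems[id_] = lines

for key, item in dictOfItems.items():
    print(key, item)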