How do I write a nested dictionary to a neatly formatted matrix in Python?
I created a nested dictionary that represents the confusion matrix of a POS tagger. It looks like this:
from collections import defaultdict

file = open(outFile, 'w+')
matrix = defaultdict(lambda: defaultdict(int))
# matrix[gold tag][predicted tag] counts how often each pair occurs
for s in range(len(self.goldenTags)):
    for w in range(len(self.goldenTags[s])):
        matrix[self.goldenTags[s][w].tag][self.myTags[s][w].tag] += 1
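The snippet above depends on `self.goldenTags` and `self.myTags`, which are not shown. As a rough self-contained sketch, assume both are lists of sentences and each sentence is a list of plain tag strings (the question uses objects with a `.tag` attribute instead). The names `build_confusion_matrix`, `golden_tags`, and `my_tags` are hypothetical.

```python
from collections import defaultdict


def build_confusion_matrix(golden_tags, my_tags):
    """Count (gold tag, predicted tag) pairs across parallel tagged sentences."""
    matrix = defaultdict(lambda: defaultdict(int))
    # zip stops at the shorter sequence, so mismatched lengths are
    # silently truncated here rather than raising an IndexError.
    for gold_sentence, predicted_sentence in zip(golden_tags, my_tags):
        for gold, predicted in zip(gold_sentence, predicted_sentence):
            matrix[gold][predicted] += 1
    return matrix


# Hypothetical toy data: two sentences, tags given as plain strings.
golden_tags = [['DT', 'NN', 'VBZ'], ['NN', 'VBP']]
my_tags = [['DT', 'NN', 'VBZ'], ['NN', 'VB']]

matrix = build_confusion_matrix(golden_tags, my_tags)
print({gold: dict(row) for gold, row in matrix.items()})
```

Rows are keyed by the gold tag and columns by the predicted tag, matching the indexing order in the question.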
Even without using any libraries, you can still build CSV-style output using lists.
     Tag  Tag  Tag  Tag  Tag
Tag    1    0    2  inf    4
Tag    4    2    0    1    5
Tag  inf  inf    1    0    3
Tag    3    4    5    3    0
Use .xml, .json, or .ini instead of reinventing the wheel. Plenty of libraries are available for these formats and more.
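Put this in a function that builds and returns a string instead of printing it; iterate over the function and write each returned value to the file.
What does "neatly formatted" mean?
@PatrickHaugh Sorry, I've added an edit. Something like this would be great.
You may want to look at this question for some interesting options.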
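To illustrate the JSON suggestion from the comments: the sketch below serializes a small nested dictionary with the standard library's `json` module. It does not produce a matrix layout, only an indented, round-trippable dump. The sample counts are made up for illustration.

```python
import json
from collections import defaultdict

# Hypothetical confusion matrix, built the same way as in the question.
matrix = defaultdict(lambda: defaultdict(int))
matrix['NN']['NN'] += 2
matrix['VBP']['VB'] += 1

# defaultdict is a dict subclass, so json can serialize it directly.
text = json.dumps(matrix, indent=2, sort_keys=True)
print(text)

# Reading it back yields plain dicts with the same counts.
restored = json.loads(text)
```

To write straight to a file, `json.dump(matrix, file_object, indent=2)` takes an open file in place of returning a string.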
# create a nested dictionary
d = {'x': {'v1': 4, 'v2': 5, 'v3': 12},
     'y': {'v1': 2, 'v2': 1, 'v3': 11},
     'z': {'v2': 5, 'v3': 1}}
# get all of the row and column ids
row_ids = sorted(d.keys())
col_ids = sorted(set(k for v in d.values() for k in v.keys()))
# create an empty list and fill it with the header and then the rows
out = []
# header
out.append(['']+col_ids)
for r in row_ids:
    out.append([r]+[d[r].get(c, 0) for c in col_ids])
out
# returns
[['', 'v1', 'v2', 'v3'],
['x', 4, 5, 12],
['y', 2, 1, 11],
['z', 0, 5, 1]]
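If the standard library is acceptable, a list of rows like the one above can be passed to `csv.writer`. This is only a sketch: the tab delimiter is an arbitrary choice, and `io.StringIO` stands in for a real output file.

```python
import csv
import io

# Rows in the same shape as `out` above: a header row, then one row per key.
out = [['', 'v1', 'v2', 'v3'],
       ['x', 4, 5, 12],
       ['y', 2, 1, 11],
       ['z', 0, 5, 1]]

# StringIO stands in for a real file; with a file on disk, open it as
# open(path, 'w', newline='') so the csv module controls line endings.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
writer.writerows(out)

csv_text = buffer.getvalue()
print(csv_text)
```

Tab-separated output lines up in most editors only while the cells are short; for fixed-width alignment, string formatting with an explicit column width is the more reliable route.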
d = {'VBP':{'CD': 4,'FW': 1,'JJ': 5,'NN': 61,'NNP': 6,'NNPS': 1,
            'SYM': 2,'VB': 72,'VBD': 5,'VBG': 2,'VBZ': 1},
     'xyz':{'CD': 4,'FW': 1,'JJS': 1,'NN': 61,'NNP': 6,'NNPS': 1,
            'UH': 19,'VB': 72,'VBD': 5,'VBP': 537,'VBZ': 1}}
# find all the columns and all the rows, sort them
columns = sorted(set(key for dictionary in d.values() for key in dictionary))
rows = sorted(d)
# figure out how wide each column is
col_width = max(max(len(thing) for thing in columns),
                max(len(thing) for thing in rows)) + 3
# preliminary format string : one column with specific width, right justified
fmt = '{{:>{}}}'.format(col_width)
# format string for all columns plus a 'label' for the row
fmt = fmt * (len(columns) + 1)
# print the header
print(fmt.format('', *columns))
# print the rows
for row in rows:
    dictionary = d[row]
    s = fmt.format(row, *(dictionary.get(col, 'inf') for col in columns))
    print(s)
>>>
            CD     FW     JJ    JJS     NN    NNP   NNPS    SYM     UH     VB    VBD    VBG    VBP    VBZ
    VBP      4      1      5    inf     61      6      1      2    inf     72      5      2    inf      1
    xyz      4      1    inf      1     61      6      1    inf     19     72      5    inf    537      1
>>>
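Following the suggestion in the comments to return strings rather than print them, the formatting above could be wrapped in a generator whose lines are then written to a file. This is a sketch under a few assumptions: `format_matrix`, its `missing` and `padding` parameters, and the output path `matrix.txt` are hypothetical, and the column width here also accounts for the cell values, which the code above does not do.

```python
def format_matrix(d, missing='inf', padding=3):
    """Yield one right-justified text line per matrix row, header first."""
    columns = sorted({key for row in d.values() for key in row})
    rows = sorted(d)
    # Widest label, cell value, or placeholder decides the column width.
    cells = [value for row in d.values() for value in row.values()]
    widest = max(len(str(item)) for item in columns + rows + cells + [missing])
    fmt = '{{:>{}}}'.format(widest + padding) * (len(columns) + 1)
    yield fmt.format('', *columns)
    for row in rows:
        yield fmt.format(row, *(d[row].get(col, missing) for col in columns))


# Hypothetical small matrix: outer keys are rows, inner keys are columns.
d = {'NN': {'NN': 61, 'VB': 2}, 'VB': {'VB': 72}}

lines = list(format_matrix(d))
with open('matrix.txt', 'w') as out_file:  # hypothetical output path
    out_file.write('\n'.join(lines) + '\n')
```

Because `.get` is used for lookups, this also works on a `defaultdict` without inserting new keys for the missing cells.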