Python: merging two dictionaries into a nested dictionary (text similarity)


I have the following documents:

documents = ["Human machine interface for lab abc computer applications",
              "A survey of user opinion of computer system response time",
              "The EPS user interface management system",
              "System and human system engineering testing of EPS",
              "Relation of user perceived response time to error measurement",
              "The generation of random binary unordered trees",
              "The intersection graph of paths in trees",
              "Graph minors IV Widths of trees and well quasi ordering",
             "Graph minors A survey"]
From these I built a word matrix:

wordmatrix = []
wordmatrix = [sentences.split(" ") for sentences in documents]
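which outputs:

[['Human', 'machine', 'interface', 'for', 'lab', 'abc', 'computer', 'applications'],
 ['A', 'survey', 'of', 'user', 'opinion', 'of', 'computer', 'system', 'response', 'time'],
 ['The', 'EPS', 'user', 'interface', 'management', 'system'],
 ['System', 'and', 'human', 'system', 'engineering', 'testing', 'of', 'EPS'],
 ['Relation', 'of', 'user', 'perceived', 'response', 'time', 'to', 'error', 'measurement'],
 ['The', 'generation', 'of', 'random', 'binary', 'unordered', 'trees'],
 ['The', 'intersection', 'graph', 'of', 'paths', 'in', 'trees'],
 ['Graph', 'minors', 'IV', 'Widths', 'of', 'trees', 'and', 'well', 'quasi', 'ordering'],
 ['Graph', 'minors', 'A', 'survey']]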

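Next, I want to create a dictionary with one key per document, where each value is itself a dictionary that has the words as keys and the frequency with which each word occurs as values.

But I only got this far:

Initializing the dictionaries: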

dic1 = {}
dic2 = {}
d = {}
The first dictionary gives each document a key:

dic1 = dict(enumerate(sentence for sentence in wordmatrix))
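which outputs:

{0: ['Human', 'machine', 'interface', 'for', 'lab', 'abc', 'computer', 'applications'],
 1: ['A', 'survey', 'of', 'user', 'opinion', 'of', 'computer', 'system', 'response', 'time'],
 2: ['The', 'EPS', 'user', 'interface', 'management', 'system'],
 3: ['System', 'and', 'human', 'system', 'engineering', 'testing', 'of', 'EPS'],
 4: ['Relation', 'of', 'user', 'perceived', 'response', 'time', 'to', 'error', 'measurement'],
 5: ['The', 'generation', 'of', 'random', 'binary', 'unordered', 'trees'],
 6: ['The', 'intersection', 'graph', 'of', 'paths', 'in', 'trees'],
 7: ['Graph', 'minors', 'IV', 'Widths', 'of', 'trees', 'and', 'well', 'quasi', 'ordering'],
 8: ['Graph', 'minors', 'A', 'survey']}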

The second dictionary has a key for each word:

for sentence in wordmatrix:
    for word in sentence:
        dic2[word] = dic2.get(word, 0) + 1
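which outputs:

{'Human': 1, 'machine': 1, 'interface': 2, 'for': 1, 'lab': 1, 'abc': 1, 'computer': 2, 'applications': 1, 'A': 2, 'survey': 2, 'of': 7, 'user': 3, 'opinion': 1, 'system': 3, 'response': 2, 'time': 2, 'The': 3, 'EPS': 2, 'management': 1, 'System': 1, 'and': 2, 'human': 1, 'engineering': 1, 'testing': 1, 'Relation': 1, 'perceived': 1, 'to': 1, 'error': 1, 'measurement': 1, 'generation': 1, 'random': 1, 'binary': 1, 'unordered': 1, 'trees': 3, 'intersection': 1, 'graph': 1, 'paths': 1, 'in': 1, 'Graph': 2, 'minors': 2, 'IV': 1, 'Widths': 1, 'well': 1, 'quasi': 1, 'ordering': 1}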

However, I want to merge the two dictionaries into a single dictionary that looks like this: {0: {'Human': 1, 'machine': 1, 'interface': 2, ...}, 1: (and so on)}

Thanks.

You don't need to merge the two dicts. Once you have dic2, you can use it to build a new nested dict directly:

for line_num, sentence in enumerate(wordmatrix):
    dic1[line_num] = {}
    for word in sentence:
        dic1[line_num][word] = dic2[word]
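
Note that the loop above stores the corpus-wide count from dic2 for each word, which matches the expected output in the question ('interface': 2 for document 0). If what you actually want is how often each word occurs within its own document, a per-document count is needed instead. Below is a minimal, self-contained sketch of both variants using collections.Counter from the standard library. The names corpus_counts, nested_corpus and nested_per_doc are made up for illustration and do not appear in the question.

```python
from collections import Counter

documents = ["Human machine interface for lab abc computer applications",
             "A survey of user opinion of computer system response time",
             "The EPS user interface management system",
             "System and human system engineering testing of EPS",
             "Relation of user perceived response time to error measurement",
             "The generation of random binary unordered trees",
             "The intersection graph of paths in trees",
             "Graph minors IV Widths of trees and well quasi ordering",
             "Graph minors A survey"]

wordmatrix = [sentence.split(" ") for sentence in documents]

# Corpus-wide word counts; plays the same role as dic2 in the question.
corpus_counts = Counter(word for sentence in wordmatrix for word in sentence)

# Variant 1: each document maps its words to their corpus-wide counts
# (equivalent to the nested loop in the answer above).
nested_corpus = {
    doc_id: {word: corpus_counts[word] for word in sentence}
    for doc_id, sentence in enumerate(wordmatrix)
}

# Variant 2 (assumption about the intent): each document maps its words
# to the number of times they occur in that document only.
nested_per_doc = {
    doc_id: dict(Counter(sentence))
    for doc_id, sentence in enumerate(wordmatrix)
}

print(nested_corpus[0]["interface"])   # count across all documents
print(nested_per_doc[0]["interface"])  # count within document 0 only
```

Both variants give the same shape, {doc_id: {word: count}}, so whichever one fits your similarity measure can be swapped in without changing the code that consumes it.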