Python: Can someone explain this list comprehension?

map_index_to_word is the word dictionary: an index of a few thousand words. tf_idf is a sparse TF-IDF matrix. wiki is the DataFrame shown in the screenshot.
    def unpack_dict(matrix, map_index_to_word):
        table = sorted(map_index_to_word, key=map_index_to_word.get)
        data = matrix.data
        indices = matrix.indices
        indptr = matrix.indptr
        num_doc = matrix.shape[0]
        return [{k: v for k, v in zip([table[word_id] for word_id in
                                       indices[indptr[i]:indptr[i+1]]],
                                      data[indptr[i]:indptr[i+1]].tolist())}
                for i in range(num_doc)]

    wiki['tf_idf'] = unpack_dict(tf_idf, map_index_to_word)
Answer:

The comprehension in the return statement is the same as this one-line form:

    [{k: v for k, v in zip([table[word_id] for word_id in indices[indptr[i]:indptr[i + 1]]], data[indptr[i]:indptr[i + 1]].tolist())} for i in range(num_doc)]
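If the nesting is hard to read, the same logic can be sketched with explicit loops. The name unpack_dict_loops, the three-word vocabulary, and the tiny stand-in object below are all hypothetical; the stand-in only mimics the data, indices, indptr and shape attributes that the question's code reads from tf_idf (presumably a SciPy CSR matrix), so that the sketch runs with the standard library alone.

```python
from array import array
from types import SimpleNamespace


def unpack_dict_loops(matrix, map_index_to_word):
    # table[j] is the word whose index is j
    # (assumes the indices are exactly 0..n-1, as the original code does).
    table = sorted(map_index_to_word, key=map_index_to_word.get)
    result = []
    for i in range(matrix.shape[0]):  # one pass per document (row)
        start, end = matrix.indptr[i], matrix.indptr[i + 1]
        words = [table[word_id] for word_id in matrix.indices[start:end]]
        weights = matrix.data[start:end].tolist()
        row = {}
        for word, weight in zip(words, weights):
            row[word] = weight
        result.append(row)
    return result


# Hypothetical stand-in for a 2 x 3 sparse matrix in CSR layout:
#   row 0 -> [0.5, 0.0, 1.5]
#   row 1 -> [0.0, 2.0, 0.0]
tiny = SimpleNamespace(
    data=array("d", [0.5, 1.5, 2.0]),
    indices=array("i", [0, 2, 1]),
    indptr=array("i", [0, 2, 3]),
    shape=(2, 3),
)
vocab = {"apple": 0, "banana": 1, "cherry": 2}  # hypothetical vocabulary

rows = unpack_dict_loops(tiny, vocab)
print(rows)
```

The comprehension in the question builds the same list in a single expression; the loop form just names each intermediate piece.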
The outer comprehension is

    [... for i in range(num_doc)]

which is just a simple loop that runs num_doc times. Inside it is a dictionary comprehension:

    {k: v for k, v in zip(...)}

zip takes the keys k from:

    [table[word_id] for word_id in indices[indptr[i]:indptr[i+1]]]

and the values v from:

    data[indptr[i]:indptr[i+1]].tolist()

So i, the outer variable, creates the slice range indptr[i]:indptr[i+1].

So it is making a list of dictionaries. The dictionary keys come from table[word_id], where word_id lies in that range of indices, and the values are the corresponding range of data.
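To see what the slice indptr[i]:indptr[i+1] selects for each row, here is a small trace. It is only an illustration: plain Python lists stand in for the matrix arrays, and the values and the four-word table are made up.

```python
# Plain lists standing in for the CSR arrays of a 3 x 4 matrix:
#   row 0 -> [0.0, 0.3, 0.0, 0.7]
#   row 1 -> [0.0, 0.0, 0.0, 0.0]   (no stored values)
#   row 2 -> [1.2, 0.0, 0.0, 0.0]
data = [0.3, 0.7, 1.2]            # stored values, row by row
indices = [1, 3, 0]               # column (word id) of each stored value
indptr = [0, 2, 2, 3]             # row i occupies positions indptr[i]:indptr[i+1]
table = ["w0", "w1", "w2", "w3"]  # hypothetical vocabulary: word id -> word

slices = []
for i in range(len(indptr) - 1):
    start, end = indptr[i], indptr[i + 1]
    keys = [table[word_id] for word_id in indices[start:end]]
    values = data[start:end]
    slices.append((start, end, dict(zip(keys, values))))
    print(f"row {i}: slice {start}:{end} -> {slices[-1][2]}")
```

A row with no stored values gets an empty slice (start equals end), so its dictionary is empty.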