Strange error when creating a co-occurrence matrix in Python

python numpy matrix

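I am trying to create a co-occurrence matrix in Python that counts how many times each pair of words from L1 (cat/dog, cat/house, cat/tree, etc.) appears together in the items of L2. My code so far is: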

import numpy as np

co = np.zeros((5,5)) #the matrix
L1 = ['cat', 'dog', 'house', 'tree', 'car'] #tags
L2 = ['cat car dog', 'cat house dog', 'cat car', 'cat dog'] #photo text

n=0 # will hold the sum of each occurrence

for i in range(len(L1)):
    for j in range(len(L1)):
        for s in range(len(L2)):
            #find occurrence but not on same words
            if L1[i] in L2[s] and L1[j] in L2[s] and L1[i] != L1[j]: 
                n+=1  # sum the number of occurrences
                #output = L1[i], L1[j] # L2[s]
                #print output
                co[i][j] = s #add to the matrix

print co

The output should be:

[[ 0.  3.  1.  0.  2.]
 [ 3.  0.  1.  0.  1.]
 [ 1.  1.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 2.  1.  0.  0.  0.]]

But instead I get:

[[ 0.  3.  1.  0.  2.]
 [ 3.  0.  1.  0.  0.]
 [ 1.  1.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 2.  0.  0.  0.  0.]]

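The second and fifth rows each contain a wrong entry (the dog/car pair). The if condition works correctly; I checked it by printing the matched pairs: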

output = L1[i], L1[j] # L2[s]
print output
    ('cat', 'dog')
    ('cat', 'dog')
    ('cat', 'dog')
    ('cat', 'house')
    ('cat', 'car')
    ('cat', 'car')
    ('dog', 'cat')
    ('dog', 'cat')
    ('dog', 'cat')
    ('dog', 'house')
    ('dog', 'car')
    ('house', 'cat')
    ('house', 'dog')
    ('car', 'cat')
    ('car', 'cat')
    ('car', 'dog')

So I guess something goes wrong when the value is written to the matrix:

co[i][j] = s

Any suggestions?
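The suspicion that the write co[i][j] = s is at fault can be checked by recording, next to the value the loop stores, how many items of L2 actually matched. The sketch below assumes Python 3 and NumPy; the names last_index and count are illustrative and do not appear in the original code.

```python
import numpy as np

L1 = ['cat', 'dog', 'house', 'tree', 'car']  # tags
L2 = ['cat car dog', 'cat house dog', 'cat car', 'cat dog']  # photo text

last_index = np.zeros((5, 5))  # what `co[i][j] = s` stores
count = np.zeros((5, 5))       # number of matching items (illustrative)

for i in range(len(L1)):
    for j in range(len(L1)):
        for s in range(len(L2)):
            # same condition as in the question
            if L1[i] in L2[s] and L1[j] in L2[s] and L1[i] != L1[j]:
                last_index[i][j] = s  # overwritten on every match
                count[i][j] += 1      # accumulates instead

print(last_index[1][4], count[1][4])  # dog/car
print(last_index[0][4], count[0][4])  # cat/car
```

For the dog/car pair the only match is the item at index 0, so the stored value is 0 even though the pair occurs once. For the other pairs the last matching index happens to equal the number of matches, which is why most of the matrix looks right.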

Comments:

Try using co[i, j] = s. The resulting matrix is symmetric, so you can loop with for i in range(len(L1)): for j in range(i): for s in range(len(L2)): which gives you a triangular matrix. You can then mirror it across the main diagonal.

Thanks. I followed your suggestion. What is the difference between co[i, j] = s and co[i][j]?

What if one of the pairs appears in two items of L2?

Answer:

The code gives a correct result for what it does: co[i][j] = s stores the index s of the matching item, and car and dog occur together in the first item of L2, whose index is 0.

Here is a more Pythonic approach that gets the index of the first item of L2 in which each pair occurs:

In [158]: L2 = ['cat car dog', 'cat house dog', 'cat car', 'cat dog']

In [159]: L2 = [s.split() for s in L2]

In [160]: combinations = np.column_stack((np.repeat(L1, 5), np.tile(L1, 5))).reshape(5, 5, 2)
# with 0 as the start of the indices
In [162]: [[next((i for i, sub in enumerate(L2) if x in sub and y in sub), 0) for x, y in row] for row in combinations]
Out[162]: 
[[0, 0, 1, 0, 0],
 [0, 0, 1, 0, 0],
 [1, 1, 1, 0, 0],
 [0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0]]
# with 1 as the start of the indices
In [163]: [[next((i for i, sub in enumerate(L2, 1) if x in sub and y in sub), 0) for x, y in row] for row in combinations]
Out[163]: 
[[1, 1, 2, 0, 1],
 [1, 1, 2, 0, 1],
 [2, 2, 2, 0, 0],
 [0, 0, 0, 0, 0],
 [1, 1, 0, 0, 1]]
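
If the goal is the count matrix shown in the question rather than an item index, one possible sketch (assuming Python 3 and NumPy) is to add 1 for every item of L2 that contains both words. It follows the suggestion from the comments to visit only the pairs below the main diagonal and then mirror them, and it splits each string into words so that a tag cannot match inside a longer word. The function name cooccurrence_counts is hypothetical.

```python
import numpy as np

def cooccurrence_counts(tags, texts):
    """Count, for each pair of distinct tags, the texts that contain both."""
    token_sets = [set(text.split()) for text in texts]
    co = np.zeros((len(tags), len(tags)))
    for i in range(len(tags)):
        for j in range(i):  # pairs below the main diagonal only
            for tokens in token_sets:
                if tags[i] in tokens and tags[j] in tokens:
                    co[i, j] += 1  # accumulate a count, not an index
    return co + co.T  # mirror across the main diagonal

L1 = ['cat', 'dog', 'house', 'tree', 'car']
L2 = ['cat car dog', 'cat house dog', 'cat car', 'cat dog']
co = cooccurrence_counts(L1, L2)
print(co)
```

A pair that appears in several items of L2 is counted once per item. On a 2-D NumPy array, co[i, j] and co[i][j] address the same element; co[i][j] first takes a view of row i and then indexes into it, so co[i, j] is simply the more direct form.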