Python: computing a kernel matrix with NumPy methods

I have data of shape d x N (each column is a feature vector). I use the following code to compute the kernel matrix:

import numpy as np

def kernel(x1, x2):
  # Linear kernel: inner product of two 1-D vectors (.T is a no-op on 1-D arrays)
  return x1.T @ x2

data = np.array([[1,2,3], [1,2,3], [1,2,3]])
result = []
for i in range(data.shape[1]):
  current_result = []
  for j in range(data.shape[1]):
    x1 = data[:, i]
    x2 = data[:, j]
    current_result.append(kernel(x1, x2))
  result.append(current_result)

np.array(result)
I get this result:

array([[ 3,  6,  9],
       [ 6, 12, 18],
       [ 9, 18, 27]])
The problem is that this code is too slow, so I tried np.vectorize:

vec = np.vectorize(kernel, signature='(n),(n)->()')
vec(data, data)
But I get the wrong result:

array([14, 14, 14])

What am I doing wrong?
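One way to see what happened: with signature='(n),(n)->()', np.vectorize treats the last axis of each argument as the core dimension and loops over the leading axes. Calling vec(data, data) therefore pairs row i with row i, which gives three dot products of [1, 2, 3] with itself, i.e. 14 each. The sketch below reproduces that and shows one possible way to get the full matrix from the same vec by putting the columns on the last axis and adding singleton loop axes. The names rowwise, cols and gram are illustrative only, and this is not meant as a speed-up.

```python
import numpy as np

def kernel(x1, x2):
    # Linear kernel: inner product of two 1-D vectors
    return x1.T @ x2

data = np.array([[1, 2, 3], [1, 2, 3], [1, 2, 3]])
vec = np.vectorize(kernel, signature='(n),(n)->()')

# Loops over axis 0 and pairs row i with row i
rowwise = vec(data, data)

# Put the column vectors on the last (core) axis, then add singleton
# loop axes so that every column i is paired with every column j
cols = data.T                                    # shape (N, d)
gram = vec(cols[:, None, :], cols[None, :, :])   # shape (N, N)

print(rowwise)
print(gram)
```

Here gram matches the matrix produced by the double loop, but vec still calls kernel once per pair in Python.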

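To check robustness, test the problem with larger dimensions and random numbers, for example with shape (100, 200). There are several approaches: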

import numpy as np

def kernel(x1, x2):
    return x1.T @ x2

def kernel_kenny(a):
    # Reference implementation: the original double loop over column pairs
    result = []
    for i in range(a.shape[1]):
        current_result = []
        for j in range(a.shape[1]):
            x1 = a[:, i]
            x2 = a[:, j]
            current_result.append(kernel(x1, x2))

        result.append(current_result)

    return np.array(result)

a = np.random.random((100,200))

res1 = kernel_kenny(a)

# perhaps einsum signature might help you to understand the calculations
res2 = np.einsum('ji,jk->ik', a, a, optimize=True)
# or the following if you want to explicitly specify the transpose
# res2 = np.einsum('ij,jk->ik', a.T, a, optimize=True)
    
# or simply ...
res3 = a.T @ a
Here is a sanity check:

np.allclose(res1,res2)
>>> True

np.allclose(res1,res3)
>>> True
And the timings:

%timeit kernel_kenny(a)
>>> 83.2 ms ± 425 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit np.einsum('ji,jk->ik', a, a, optimize=True)
>>> 325 µs ± 4.15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit a.T @ a
>>> 82 µs ± 9.39 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

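np.vectorize is not designed to make code faster. It is provided mainly for convenience and is essentially a Python for loop, so using it to speed up your code is a waste of time.

Is there another way to compute the kernel on every pair of column vectors using NumPy methods (or broadcasting)?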
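On the broadcasting question, a minimal sketch for the linear kernel, assuming the same d x N layout as above: insert singleton axes so that every column is multiplied elementwise with every other column, then sum over the feature axis. The names gram_broadcast and gram_matmul are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((100, 200))  # d x N, one feature vector per column

# a[:, :, None] has shape (d, N, 1) and a[:, None, :] has shape (d, 1, N);
# their product broadcasts to (d, N, N), and summing over axis 0 gives
# the inner product of column i with column j at position (i, j)
gram_broadcast = (a[:, :, None] * a[:, None, :]).sum(axis=0)

# Same result via matrix multiplication
gram_matmul = a.T @ a

print(np.allclose(gram_broadcast, gram_matmul))
```

The broadcast version builds a d x N x N intermediate array, so for a plain inner-product kernel a.T @ a is likely the better choice. The broadcast form is mainly useful for seeing how the pairs line up.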