NumPy: Log-likelihood of a multivariate normal distribution

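I want to compute the log-likelihood of a multivariate normal distribution.

Data:

Likelihood (the formula I followed):

It returns the following error: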

> likelihood(mean, cov)
---------------------------------------------------------------------------
ValueError                         Traceback (most recent call last)
<ipython-input-54-3b20c2eefea4> in <module>()
----> 1 likelihood(mean, cov)

<ipython-input-53-8a2a7219131c> in likelihood(mean, cov)
      2     # param = mean[y1, y2], cov = [[c1, 0], [0, c2]]
      3 
----> 4     loglikelihood = -0.5*(  np.log(np.linalg.det(cov))  + (data - mean).transpose() * np.linalg.inv(cov) * (data - mean) + 2 * np.log(2 * np.pi)  )
      5 
      6     loglikelihoodsum = loglikelihood.sum()

ValueError: operands could not be broadcast together with shapes (2,1000) (2,2) 

How can I fix it?

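In NumPy, the * operator performs element-wise multiplication, not matrix multiplication, which is why the shapes (2,1000) and (2,2) cannot be broadcast together. Use np.einsum to compute the inner product instead: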

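import numpy as np

mean = np.random.normal(0, 1, 2)
# A covariance matrix must be symmetric positive definite;
# a @ a.T + I guarantees that (a random matrix alone does not).
a = np.random.normal(0, 1, (2, 2))
cov = a @ a.T + np.eye(2)
data = np.random.normal(0, 1, (1000, 2))  # one observation per row

residuals = data - mean
# Per-observation log-density of the multivariate normal.
loglikelihood = -0.5 * (
    np.log(np.linalg.det(cov))
    + np.einsum('...j,jk,...k', residuals, np.linalg.inv(cov), residuals)
    + len(mean) * np.log(2 * np.pi)
)
np.sum(loglikelihood)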
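
To make the subscripts '...j,jk,...k' less opaque, here is a small sketch that compares the einsum call with an explicit loop over rows. The array values and the helper names quadratic_forms_loop and quadratic_forms_einsum are illustrative assumptions, not part of the question.

```python
import numpy as np

def quadratic_forms_loop(residuals, precision):
    """Compute r^T P r for each row r, one row at a time (slow but explicit)."""
    return np.array([r @ precision @ r for r in residuals])

def quadratic_forms_einsum(residuals, precision):
    """Same quantity, vectorised: j and k are summed, the leading row axis is kept."""
    return np.einsum('...j,jk,...k', residuals, precision, residuals)

# Illustrative values only: three residual vectors and a diagonal inverse covariance.
residuals = np.array([[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]])
precision = np.array([[2.0, 0.0], [0.0, 0.5]])

print(quadratic_forms_loop(residuals, precision))
print(quadratic_forms_einsum(residuals, precision))
```

Both calls return one value per row, which is why the result can be summed directly to get the total log-likelihood.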

I found an answer:

def likelihood(mean, cov): # Wikipedia
    def calc_loglikelihood(residuals):
        return -0.5 * (np.log(np.linalg.det(cov)) + residuals.T.dot(np.linalg.inv(cov)).dot(residuals) + 2 * np.log(2 * np.pi))

    # mean = np.array([y1, y2]), cov = np.array([[c1, 0], [0, c2]])
    residuals = (data - mean)

    loglikelihood = np.apply_along_axis(calc_loglikelihood, 1, residuals)
    loglikelihoodsum = loglikelihood.sum()

    return loglikelihoodsum

You can use scipy.stats.multivariate_normal.logpdf
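
As a rough sketch of that suggestion, the snippet below sums the per-row values returned by scipy.stats.multivariate_normal.logpdf and compares the result with the manual formula. The mean, covariance, seed, and data here are made-up assumptions for illustration; the original data is not shown in the question.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Illustrative parameters (assumptions, not the asker's values).
mean = np.array([0.5, -1.0])
cov = np.array([[2.0, 0.3], [0.3, 1.0]])  # symmetric positive definite
data = rng.normal(size=(1000, 2))          # one observation per row

# logpdf returns one log-density per row; summing gives the log-likelihood.
scipy_total = multivariate_normal.logpdf(data, mean=mean, cov=cov).sum()

# Manual formula for comparison.
residuals = data - mean
manual_total = np.sum(-0.5 * (
    np.log(np.linalg.det(cov))
    + np.einsum('...j,jk,...k', residuals, np.linalg.inv(cov), residuals)
    + len(mean) * np.log(2 * np.pi)
))

print(scipy_total, manual_total)
```

The two totals should agree up to floating-point error, so the SciPy call can serve as a check on a hand-written version.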

Thanks, but I still get the following error:
shapes (2,1000) and (2,2) not aligned: 1000 (dim 1) != 2 (dim 0)
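
The shapes in this message suggest that data is stored as (2, 1000), with one observation per column. That is an assumption, since the code that builds the data is not shown. Under that assumption, one possible fix is to transpose the data so that each row is an observation before subtracting the mean. The helper name total_loglikelihood and the sample values are hypothetical.

```python
import numpy as np

def total_loglikelihood(data, mean, cov):
    """Sum of multivariate normal log-densities; accepts (n, d) or (d, n) data."""
    data = np.asarray(data, dtype=float)
    d = len(mean)
    # Assumption: if the last axis does not match d but the first one does,
    # the observations are stored as columns, so transpose them into rows.
    # (A square (d, d) input is ambiguous and is treated as rows.)
    if data.shape[-1] != d and data.shape[0] == d:
        data = data.T
    residuals = data - mean
    quad = np.einsum('...j,jk,...k', residuals, np.linalg.inv(cov), residuals)
    per_row = -0.5 * (np.log(np.linalg.det(cov)) + quad + d * np.log(2 * np.pi))
    return per_row.sum()

# Illustrative values only.
rng = np.random.default_rng(1)
mean = np.array([0.0, 1.0])
cov = np.array([[1.5, 0.0], [0.0, 0.5]])
data_cols = rng.normal(size=(2, 1000))  # one observation per column

print(total_loglikelihood(data_cols, mean, cov))
```

Passing data_cols.T gives the same result, since the function only transposes when the layout looks column-wise.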