Python: computing the mean vector of a multivariate normal distribution
I am having trouble fitting a multivariate Gaussian distribution to my dataset, more specifically finding a mean vector (or multiple mean vectors). My dataset is an N x 8 matrix, and currently I am using the following code:
muVector=np.mean(Xtrain,axis=0)
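where Xtrain is my training dataset.
For the covariance, I build it using an arbitrary variance value (.5) and do the following:
covariance = np.dot(.5, np.eye(N,N))
where N is the number of observations.
But when I construct the Phi matrix, I get all zeros. Here is my code: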
muVector = np.mean(Xtrain, axis=0)
# get covariance matrix from Xtrain
cov = np.dot(var, np.eye(N,N))
cov = np.linalg.inv(cov)
# build Xtrain Phi
Phi = np.ones((N,M))
for row in range(N):
    temp = Xtrain[row,:] - muVector
    temp.shape = (1,M)
    temp = np.dot((-.5), temp)
    temp = np.dot(temp, cov)
    temp = np.dot(temp, (Xtrain[row,:] - muVector))
    Phi[row,:] = np.exp(temp)
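Any help is greatly appreciated. I think I might have to use np.random.multivariate_normal()? But I do not know how to use it in this case.

By "Phi", I believe you mean the probability density function (pdf) that you want to estimate. In that case, the covariance matrix should be M x M, and the output Phi will be N x 1: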
# -*- coding: utf-8 -*-
import numpy as np
N = 1024
M = 8
var = 0.5
# Creating a Xtrain NxM observation matrix.
# Its muVector is [0, 1, 2, 3, 4, 5, 6, 7] and the variance for all
# independent random variables is 0.5.
Xtrain = np.random.multivariate_normal(np.arange(8), np.eye(8,8)*var, N)
# Estimating the mean vector.
muVector = np.mean(Xtrain, axis=0)
# Creating the estimated covariance matrix and its inverse.
cov = np.eye(M,M)*var
inv_cov = np.linalg.inv(cov)
# Normalization factor from the pdf.
norm_factor = 1/np.sqrt((2*np.pi)**M * np.linalg.det(cov))
# Estimating the pdf.
Phi = np.ones((N,1))
for row in range(N):
    temp = Xtrain[row,:] - muVector
    temp.shape = (1,M)
    temp = np.dot(-0.5*temp, inv_cov)
    temp = np.dot(temp, (Xtrain[row,:] - muVector))
    Phi[row] = norm_factor*np.exp(temp)
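The row-by-row loop can also be written without explicit iteration. The following is a minimal sketch and not part of the original answer: it assumes the same N x M layout, estimates the covariance from the data with np.cov instead of fixing the variance at 0.5, and the helper name gaussian_pdf is hypothetical.

```python
import numpy as np

def gaussian_pdf(X, mu, cov):
    """Evaluate the multivariate normal pdf at each row of X (hypothetical helper)."""
    M = X.shape[1]
    diff = X - mu                          # (N, M) deviations from the mean
    inv_cov = np.linalg.inv(cov)           # (M, M) inverse covariance
    # Quadratic form diff_i^T inv_cov diff_i for every row i.
    quad = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    norm_factor = 1.0 / np.sqrt((2 * np.pi) ** M * np.linalg.det(cov))
    return norm_factor * np.exp(-0.5 * quad)   # shape (N,)

# Synthetic data with the same parameters as in the answer (assumed here).
rng = np.random.default_rng(0)
N, M, var = 1024, 8, 0.5
Xtrain = rng.multivariate_normal(np.arange(M), np.eye(M) * var, N)

muVector = np.mean(Xtrain, axis=0)         # shape (M,)
cov_hat = np.cov(Xtrain, rowvar=False)     # shape (M, M), sample covariance
Phi_vec = gaussian_pdf(Xtrain, muVector, cov_hat)
```

Passing rowvar=False tells np.cov that each column is a variable and each row an observation, which is what yields an M x M matrix rather than N x N.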
Alternatively, you can use the pdf method of scipy.stats.multivariate_normal:
# -*- coding: utf-8 -*-
import numpy as np
from scipy.stats import multivariate_normal
N = 1024
M = 8
var = 0.5
# Creating a Xtrain NxM observation matrix.
# Its muVector is [0, 1, 2, 3, 4, 5, 6, 7] and the variance for all
# independent random variables is 0.5.
Xtrain = np.random.multivariate_normal(np.arange(8), np.eye(8,8)*var, N)
# Estimating the mean vector.
muVector = np.mean(Xtrain, axis=0)
# Creating the estimated covariance matrix.
cov = np.eye(M,M)*var
Phi2 = multivariate_normal.pdf(Xtrain, mean=muVector, cov=cov)
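The output arrays Phi and Phi2 will be equal (Phi has shape N x 1 and Phi2 has shape N, but the values match).
Thank you very much! This is exactly what I was looking for.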
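The agreement between the manual computation and SciPy can be checked numerically. The sketch below is an illustration under the same assumed parameters (N = 1024, M = 8, variance 0.5); the variable names same and log_Phi2 are not from the original posts. It also shows logpdf, which may be preferable when density values become small enough to underflow to zero.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
N, M, var = 1024, 8, 0.5
Xtrain = rng.multivariate_normal(np.arange(M), np.eye(M) * var, N)

muVector = np.mean(Xtrain, axis=0)
cov = np.eye(M) * var
inv_cov = np.linalg.inv(cov)
norm_factor = 1 / np.sqrt((2 * np.pi) ** M * np.linalg.det(cov))

# Manual pdf, one row at a time, stored as an (N, 1) column.
Phi = np.ones((N, 1))
for row in range(N):
    diff = Xtrain[row, :] - muVector
    Phi[row] = norm_factor * np.exp(-0.5 * diff @ inv_cov @ diff)

# SciPy evaluates the same density for all rows at once, shape (N,).
Phi2 = multivariate_normal.pdf(Xtrain, mean=muVector, cov=cov)

# Phi is (N, 1) while Phi2 is (N,), so flatten before comparing.
same = np.allclose(Phi.ravel(), Phi2)

# Log-densities avoid exponentiating large negative numbers.
log_Phi2 = multivariate_normal.logpdf(Xtrain, mean=muVector, cov=cov)
```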