Python 生成相关随机样本

Python 生成相关随机样本,python,matrix,random,correlation,eigenvalue,Python,Matrix,Random,Correlation,Eigenvalue,我读了这篇教程,是关于如何得到一个矩阵C,使C*C^T=R,其中R是一个给定的协方差矩阵。代码示例实现了两种不同的方法,Cholesky分解或使用特征值。 令我惊讶的是,打印两种不同方法的结果C会得到两种不同的矩阵: 特征值法结果: [[ 0.11928653 -0.86036701 1.6265114 ] [ 0.00835653 -0.89810227 -2.16641235] [ 0.18832863 0.58480336 -0.93409708]] [[ 1.84390889

我读了这篇教程,是关于如何得到一个矩阵C,使C*C^T=R,其中R是一个给定的协方差矩阵。代码示例实现了两种不同的方法,Cholesky分解或使用特征值。 令我惊讶的是,打印两种不同方法的结果C会得到两种不同的矩阵:

特征值法结果:

[[ 0.11928653 -0.86036701  1.6265114 ]
 [ 0.00835653 -0.89810227 -2.16641235]
 [ 0.18832863  0.58480336 -0.93409708]]
[[ 1.84390889  0.          0.        ]
 [-1.4913969   1.80989925  0.        ]
 [-1.08465229 -0.06500199  0.26325682]]
Cholesky方法结果:

[[ 0.11928653 -0.86036701  1.6265114 ]
 [ 0.00835653 -0.89810227 -2.16641235]
 [ 0.18832863  0.58480336 -0.93409708]]
[[ 1.84390889  0.          0.        ]
 [-1.4913969   1.80989925  0.        ]
 [-1.08465229 -0.06500199  0.26325682]]
如果有人能向我解释为什么这两个矩阵是不同的,我将非常感激

生成结果的代码:

"""Example of generating correlated normally distributed random samples."""

import numpy as np
from scipy.linalg import eigh, cholesky
from scipy.stats import norm

from pylab import plot, show, axis, subplot, xlabel, ylabel, grid


# Choice of cholesky or eigenvector method.
method = 'cholesky'
#method = 'eigenvectors'

num_samples = 400

# The desired covariance matrix.
r = np.array([
        [  3.40, -2.75, -2.00],
        [ -2.75,  5.50,  1.50],
        [ -2.00,  1.50,  1.25]
    ])

# Generate samples from three independent normally distributed random
# variables (with mean 0 and std. dev. 1).
x = norm.rvs(size=(3, num_samples))

# We need a matrix `c` for which `c*c^T = r`.  We can use, for example,
# the Cholesky decomposition, or the we can construct `c` from the
# eigenvectors and eigenvalues.

if method == 'cholesky':
    # Compute the Cholesky decomposition.
    c = cholesky(r, lower=True)
else:
    # Compute the eigenvalues and eigenvectors.
    evals, evecs = eigh(r)
    # Construct c, so c*c^T = r.
    c = np.dot(evecs, np.diag(np.sqrt(evals)))

# Convert the data to correlated random variables. 
y = np.dot(c, x)

#
# Plot various projections of the samples.
#
subplot(2,2,1)
plot(y[0], y[1], 'b.')
ylabel('y[1]')
axis('equal')
grid(True)

subplot(2,2,3)
plot(y[0], y[2], 'b.')
xlabel('y[0]')
ylabel('y[2]')
axis('equal')
grid(True)

subplot(2,2,4)
plot(y[1], y[2], 'b.')
xlabel('y[1]')
axis('equal')
grid(True)

show()

我想我同时找到了答案。
协方差矩阵R到R=CC^T的矩阵分解不是明确的。计算Cholesky分解或使用特征值的不同方法会产生符合公式R=CC^T的不同矩阵C。

正如您自己所说,将R分解为CC^T并不是唯一的。满足此表达式的可能C多于1个

要看到这一点,让我们

U @ S @ U.T = R = L @ L.T
其中“@”是对numpy数组执行矩阵乘法的运算符,“U.T”转置矩阵U

给定任意可逆矩阵Q,其中Q@np.inv(Q)=I

事实上,我们可以求解一个特定的矩阵Q,这样

U = evecs @ np.diag(np.sqrt(evals))) 
L = cholesky(r, lower=True)
U @ Q = L
为此,

import numpy as np

r = np.array([
        [  3.40, -2.75, -2.00],
        [ -2.75,  5.50,  1.50],
        [ -2.00,  1.50,  1.25]
    ])

evals, evecs = eigh(r)
U = evecs @ np.diag(np.sqrt(evals))
L = cholesky(r, lower=True)
Q = np.linalg.pinv(U) @ L
U_new = U @ Q
print("U_new:")
print(U_new)
print("L:")
print(L)
print("U_new == L up to numerical differences: {}".format(np.allclose(U_new, L))
输出是

U_new:
[[ 1.84390889e+00  1.87763803e-17 -1.53304396e-17]
 [-1.49139690e+00  1.80989925e+00  4.13531074e-19]
 [-1.08465229e+00 -6.50019933e-02  2.63256819e-01]]
L:
[[ 1.84390889  0.          0.        ]
 [-1.4913969   1.80989925  0.        ]
 [-1.08465229 -0.06500199  0.26325682]]
U_new == L up to numerical differences: True

您能同时显示生成这两个结果的代码或命令吗?我在中添加了代码(也可以在源代码链接中找到)。不同的结果是通过注释/取消注释方法=‘特征值’产生的