Python: Getting the covariance of returns from a DataFrame


I have a DataFrame like this:

                   YAU        OTBL         HLE
2009-03-08         nan         nan         nan
2009-03-09  1.59904743  1.66397210  1.67345829
2009-03-10 -0.37065629 -0.36541822 -0.36015840
2009-03-11 -0.41055669  0.60004777  0.00536958

This is my function:

def get_covariance_returns(returns):
   return np.cov(returns.values)
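The returns parameter is a DataFrame holding the returns for each ticker and date. The output is a 2D array representing the covariance of the returns.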

When I run the code, I get:

AssertionError: Wrong shape for output returns_covariance. Got (4, 4), expected (3, 3)
I then modified my function as follows:

def get_covariance_returns(returns):
    return np.cov(returns.values, rowvar=False)
My result is:

OUTPUT returns_covariance:
[[ nan  nan  nan]
 [ nan  nan  nan]
 [ nan  nan  nan]]
Note that the expected output is:

EXPECTED OUTPUT FOR returns_covariance:
[[ 0.89856076  0.7205586   0.8458721 ]
 [ 0.7205586   0.78707297  0.76450378]
 [ 0.8458721   0.76450378  0.83182775]]
I need some guidance on what is wrong with my implementation. I am programming in Python.

If you drop the NaNs:

>>> np.cov(df.dropna().values, rowvar=False)
array([[ 1.31997225,  1.01614032,  1.2238726 ],
       [ 1.01614032,  1.0304141 ,  1.04243784],
       [ 1.2238726 ,  1.04243784,  1.17528792]])
Or, more simply, use the pandas DataFrame.cov() method, which excludes NaN values automatically:

>>> df.cov()
           YAU      OTBL       HLE
YAU   1.319972  1.016140  1.223873
OTBL  1.016140  1.030414  1.042438
HLE   1.223873  1.042438  1.175288
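
To reproduce the results so far, here is a minimal sketch that rebuilds the sample frame by hand and compares the calls discussed above. The variable names (by_row, by_col, dropped, pandas_cov) are illustrative only and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Sample returns copied from the question (variable names are illustrative).
dates = pd.to_datetime(["2009-03-08", "2009-03-09", "2009-03-10", "2009-03-11"])
df = pd.DataFrame(
    {
        "YAU": [np.nan, 1.59904743, -0.37065629, -0.41055669],
        "OTBL": [np.nan, 1.66397210, -0.36541822, 0.60004777],
        "HLE": [np.nan, 1.67345829, -0.36015840, 0.00536958],
    },
    index=dates,
)

# Default rowvar=True: each ROW (date) is treated as a variable,
# so 4 dates produce a 4x4 matrix (the shape in the AssertionError).
by_row = np.cov(df.values)

# rowvar=False: each COLUMN (ticker) is a variable, so the shape is 3x3,
# but the all-NaN first row turns every entry into NaN.
by_col = np.cov(df.values, rowvar=False)

# Dropping the NaN row first gives finite values.
dropped = np.cov(df.dropna().values, rowvar=False)

# DataFrame.cov() skips missing values without an explicit dropna().
pandas_cov = df.cov()

print(by_row.shape, by_col.shape)
print(np.isnan(by_col).all())
print(np.allclose(dropped, pandas_cov.values))
```

For this particular frame the last two results agree, because the only missing values sit in a single row that is missing for every ticker.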
[EDIT]: Judging by your expected output, you are actually meant to replace the NaNs with zeros:

>>> np.cov(df.replace(np.nan, 0).values, rowvar=False)
array([[ 0.89856076,  0.7205586 ,  0.8458721 ],
       [ 0.7205586 ,  0.78707297,  0.76450378],
       [ 0.8458721 ,  0.76450378,  0.83182775]])

>>> df.replace(np.nan, 0).cov()
           YAU      OTBL       HLE
YAU   0.898561  0.720559  0.845872
OTBL  0.720559  0.787073  0.764504
HLE   0.845872  0.764504  0.831828
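
The zero-filled numbers differ from the dropna() numbers because the first date now counts as a fourth observation with a return of 0, which changes both the column means and the divisor (n - 1 goes from 2 to 3). The sketch below checks this against the expected output from the question. It assumes the same hand-built sample frame, and the names zero_filled, cov_np, cov_pd and expected are illustrative.

```python
import numpy as np
import pandas as pd

# Sample returns copied from the question (variable names are illustrative).
dates = pd.to_datetime(["2009-03-08", "2009-03-09", "2009-03-10", "2009-03-11"])
df = pd.DataFrame(
    {
        "YAU": [np.nan, 1.59904743, -0.37065629, -0.41055669],
        "OTBL": [np.nan, 1.66397210, -0.36541822, 0.60004777],
        "HLE": [np.nan, 1.67345829, -0.36015840, 0.00536958],
    },
    index=dates,
)

# Treat the missing first day as a zero return instead of dropping it.
zero_filled = df.fillna(0)

# Both libraries use ddof=1 by default, so they should agree here.
cov_np = np.cov(zero_filled.values, rowvar=False)
cov_pd = zero_filled.cov()

# Expected matrix as quoted in the question.
expected = np.array(
    [
        [0.89856076, 0.7205586, 0.8458721],
        [0.7205586, 0.78707297, 0.76450378],
        [0.8458721, 0.76450378, 0.83182775],
    ]
)

print(np.allclose(cov_np, expected, atol=1e-6))
print(np.allclose(cov_np, cov_pd.values))
```

Whether zero-filling is appropriate depends on what the missing row means. Here it appears to be what the grader expects, but it does pull the estimated variances toward the zero observation.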
In any case, I will leave my original post as it is, because it shows the difference between the two cov functions.

df.fillna(0).cov()
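
One difference between the two cov functions that the sample data does not expose: DataFrame.cov() excludes missing values pair by pair, whereas dropna() followed by np.cov discards every row that has a NaN in any column. The sketch below uses a small made-up frame (df_gaps, with hypothetical columns a, b and c, not data from the question) in which only one column has a gap.

```python
import numpy as np
import pandas as pd

# Made-up data: only column "c" has a missing value, in the first row.
df_gaps = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [5.0, 1.0, 4.0, 6.0],
        "c": [np.nan, 1.0, 2.0, 5.0],
    }
)

# pandas: each pair of columns uses every row where BOTH values are present,
# so the (a, b) entry is computed from all four rows.
pairwise = df_gaps.cov()

# NumPy after dropna(): the first row is removed for every column,
# so the (a, b) entry is computed from the last three rows only.
listwise = np.cov(df_gaps.dropna().values, rowvar=False)

print(pairwise.loc["a", "b"])  # covariance of a and b over 4 rows
print(listwise[0, 1])          # covariance of a and b over 3 rows
```

The two (a, b) entries differ here, while the (a, c) entries match because both approaches end up using the same three rows for that pair.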