
Python: multiplying a multi-indexed DataFrame by a single-indexed DataFrame over time

Tags: python, pandas, dataframe, numpy

I'm new to Python and am looking for help multiplying two DataFrames over time. Any help understanding the error would be appreciated.

First DataFrame (cov):

Second DataFrame (w):

The code:

std = np.dot(np.transpose(w), np.matmul(cov, w))

The error:

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 12361 is different from 10)
I'm only showing small snippets of the DataFrames. The original cov DataFrame is 123610 rows × 10 columns, and the w DataFrame is 12361 rows × 10 columns.
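The mismatch can be reproduced with a shrunken sketch of those shapes (the sizes below are illustrative stand-ins, not the real data): np.matmul treats cov as one big 2-D matrix, so its 10 columns would have to match w's 12361 rows.

```python
import numpy as np

# Shrunken stand-ins for the real frames: cov stacks one covariance
# matrix per date into a single tall 2-D block, w has one row per date.
cov = np.ones((6, 2))   # 3 dates x 2 industries, stacked -> (6, 2)
w = np.ones((3, 2))     # one weight row per date         -> (3, 2)

try:
    # np.matmul sees two plain 2-D matrices: inner dimensions 2 and 3 differ
    np.matmul(cov, w)
except ValueError as err:
    print(err)
```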

Expected output:

Date           
2018-12-27     44.45574103083
2018-12-28     46.593367859
2018-12-31     45.282932300

Many thanks in advance.

I think you can use groupby on the Date level and then multiply by the weights from w that correspond to each group's date:

cov.groupby(level='Date').apply(lambda g: w.loc[g.name].dot(g.values@(w.loc[g.name])))
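As a toy check of what that line computes (hypothetical dates d1/d2 and industries A/B, not the question's data): each group g is one date's covariance matrix, and the lambda evaluates the quadratic form w·(C·w) for that date.

```python
import pandas as pd

# Two dates x two industries, laid out like cov and w in the question
idx = pd.MultiIndex.from_product([['d1', 'd2'], ['A', 'B']],
                                 names=['Date', 'Industry'])
cov = pd.DataFrame([[2.0, 0.0], [0.0, 2.0],
                    [1.0, 0.0], [0.0, 1.0]], index=idx, columns=['A', 'B'])
w = pd.DataFrame([[1.0, 1.0], [2.0, 2.0]], index=['d1', 'd2'],
                 columns=['A', 'B'])

# d1: [1,1]·(2I·[1,1]) = 4.0, d2: [2,2]·(I·[2,2]) = 8.0
res = cov.groupby(level='Date').apply(
    lambda g: w.loc[g.name].dot(g.values @ w.loc[g.name]))
print(res)
```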
Since a 3-D array represents this data better, you can also avoid the implicit loop over the groups in apply and use:

reshaped = cov.values.reshape(cov.index.levels[0].nunique(), cov.index.levels[1].nunique(), cov.shape[-1])
np.einsum('ik,ik->i', w.values, np.einsum('ijk,ik->ij', reshaped, w.values))
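A quick self-contained check (random numbers standing in for the real data) that the nested einsum equals the per-date quadratic form w_i·C_i·w_i:

```python
import numpy as np

rng = np.random.default_rng(0)
n_dates, n_ind = 4, 3
cov3d = rng.standard_normal((n_dates, n_ind, n_ind))  # one matrix per date
w = rng.standard_normal((n_dates, n_ind))             # one weight row per date

# inner einsum: C_i @ w_i for every date; outer einsum: w_i . (C_i @ w_i)
res = np.einsum('ik,ik->i', w, np.einsum('ijk,ik->ij', cov3d, w))

# reference: explicit per-date quadratic forms
ref = np.array([w[i] @ cov3d[i] @ w[i] for i in range(n_dates)])
assert np.allclose(res, ref)
```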

Performance-wise, the second solution seems better:

%timeit cov.groupby(level='Date').apply(lambda g: w.loc[g.name].dot(g.values@(w.loc[g.name])))
4.74 ms ± 614 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit np.einsum('ik,ik->i', w.values, np.einsum('ijk,ik->ij', reshaped, w.values))
35.6 µs ± 5.19 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Data:


That worked great, thank you very much!
Glad it helped @BjarneTimm. Feel free to accept the answer. Do you think you could solve this one too?
cov = pd.DataFrame.from_dict({'NoDur': {('2018-12-27', 'NoDur'): 0.000109,
  ('2018-12-27', 'Durbl'): 0.000112,
  ('2018-12-27', 'Manuf'): 0.000118,
  ('2018-12-28', 'NoDur'): 0.000109,
  ('2018-12-28', 'Durbl'): 0.000113,
  ('2018-12-28', 'Manuf'): 0.000117,
  ('2018-12-31', 'NoDur'): 0.000109,
  ('2018-12-31', 'Durbl'): 0.000113,
  ('2018-12-31', 'Manuf'): 0.000118},
 'Durbl': {('2018-12-27', 'NoDur'): 0.000112,
  ('2018-12-27', 'Durbl'): 0.000339,
  ('2018-12-27', 'Manuf'): 0.000238,
  ('2018-12-28', 'NoDur'): 0.000113,
  ('2018-12-28', 'Durbl'): 0.000339,
  ('2018-12-28', 'Manuf'): 0.000239,
  ('2018-12-31', 'NoDur'): 0.000113,
  ('2018-12-31', 'Durbl'): 0.000339,
  ('2018-12-31', 'Manuf'): 0.000239},
 'Manuf': {('2018-12-27', 'NoDur'): 0.000118,
  ('2018-12-27', 'Durbl'): 0.000238,
  ('2018-12-27', 'Manuf'): 0.000246,
  ('2018-12-28', 'NoDur'): 0.000117,
  ('2018-12-28', 'Durbl'): 0.000239,
  ('2018-12-28', 'Manuf'): 0.000242,
  ('2018-12-31', 'NoDur'): 0.000118,
  ('2018-12-31', 'Durbl'): 0.000239,
  ('2018-12-31', 'Manuf'): 0.000245}})

w = pd.DataFrame.from_dict({'NoDur': {'2018-12-27': -69.190732,
  '2018-12-28': -113.83175,
  '2018-12-31': -101.365016},
 'Durbl': {'2018-12-27': -96.316224,
  '2018-12-28': 30.426696,
  '2018-12-31': -16.613136},
 'Manuf': {'2018-12-27': -324.058486,
  '2018-12-28': -410.055587,
  '2018-12-31': -362.232014}})
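One caveat with this sample data: pd.DataFrame.from_dict builds the MultiIndex from the tuple keys without level names, so groupby(level='Date') raises until the levels are named. A minimal sketch (the level names here are an assumption):

```python
import pandas as pd

df = pd.DataFrame.from_dict({'x': {('2018-12-27', 'NoDur'): 1.0,
                                   ('2018-12-27', 'Durbl'): 2.0}})
# the tuple keys produce an unnamed MultiIndex; name the levels first
df.index = df.index.set_names(['Date', 'Industry'])
print(df.groupby(level='Date')['x'].sum())  # 2018-12-27 -> 3.0
```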