Python: how to set up data structures so that pandas and numpy work together?
I am having trouble getting code based on pandas and numpy to run. I will try to explain the problem with a scaled-down working example. What I basically want to do is a Markowitz portfolio optimization.
First, I have a pandas.DataFrame that holds the closing prices of the given stocks, built like this:
import numpy as np
import pandas as pd
df = pd.DataFrame()
df['AAPL'] = [1.2,1.4,1.5]
df['GOOGL'] = [2.1,2.4,2.6]
df['DATE'] = ['2017-01-01', '2017-01-02','2017-01-03']
df = df.set_index('DATE')
Next, I want to compute some basic statistics so that I can pass the data into a few functions, as follows:
returns = df.pct_change()
mean_returns = returns.mean()
cov_matrix = returns.cov()
num_portfolios = 10
risk_free_rate = 0.0178
These statistics have the following types:
mean_returns: pandas.core.series.Series
cov_matrix: pandas.core.frame.DataFrame
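As a quick check of those types, here is a minimal sketch that rebuilds the example data from the question and prints the type and shape of each statistic. It only uses the variable names from the question; building the frame from a dict is just a shorter way to get the same data.

```python
import pandas as pd

# Same example data as in the question, built in one step
df = pd.DataFrame(
    {"AAPL": [1.2, 1.4, 1.5], "GOOGL": [2.1, 2.4, 2.6]},
    index=pd.Index(["2017-01-01", "2017-01-02", "2017-01-03"], name="DATE"),
)

returns = df.pct_change()
mean_returns = returns.mean()  # one mean return per asset
cov_matrix = returns.cov()     # assets x assets covariance

print(type(mean_returns).__name__, mean_returns.shape)  # Series (2,)
print(type(cov_matrix).__name__, cov_matrix.shape)      # DataFrame (2, 2)
```

The shapes matter more than the container types here: a 1-D object and a 2-D object behave very differently inside np.dot.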
The following functions are where the problems start:
def portfolio_annualised_performance(weights, mean_returns, cov_matrix):
    returns = np.sum(mean_returns * weights) * 252
    std = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights))) * np.sqrt(252)
    return std, returns

def random_portfolios(num_portfolios, mean_returns, cov_matrix, risk_free_rate):
    results = np.zeros((3, num_portfolios))
    print('results:', type(results))
    weights_record = []
    for i in range(num_portfolios):
        # 12 assets in my full example; the reduced example above has only 2
        weights = np.random.random(12)
        weights /= np.sum(weights)
        weights_record.append(weights)
        portfolio_std_dev, portfolio_return = portfolio_annualised_performance(weights, mean_returns, cov_matrix)
        results[0, i] = portfolio_std_dev
        results[1, i] = portfolio_return
        results[2, i] = (portfolio_return - risk_free_rate) / portfolio_std_dev
        #print('results[2,0]:',type(results[2,0]))
        #print('std', type(portfolio_std_dev))
        #print(portfolio_return)
    return results, weights_record

def display_simulated_ef_with_random(mean_returns, cov_matrix, num_portfolios, risk_free_rate):
    results, weights = random_portfolios(num_portfolios, mean_returns, cov_matrix, risk_free_rate)
    max_sharpe_idx = np.argmax(np.array(results[2]))
    sdp, rp = results[0, max_sharpe_idx], results[1, max_sharpe_idx]
    max_sharpe_allocation = pd.DataFrame(weights[max_sharpe_idx], index=df.columns, columns=['allocation'])
    max_sharpe_allocation.allocation = [round(i * 100, 2) for i in max_sharpe_allocation.allocation]
    max_sharpe_allocation = max_sharpe_allocation.T
    min_vol_idx = np.argmin(results[0])
    sdp_min, rp_min = results[0, min_vol_idx], results[1, min_vol_idx]
    min_vol_allocation = pd.DataFrame(weights[min_vol_idx], index=df.columns, columns=['allocation'])
    min_vol_allocation.allocation = [round(i * 100, 2) for i in min_vol_allocation.allocation]
    min_vol_allocation = min_vol_allocation.T
When I try to run:
display_simulated_ef_with_random(cov_matrix, mean_returns, num_portfolios, risk_free_rate)
I get the following error:
----> 2 results, weights = random_portfolios(num_portfolios, mean_returns, cov_matrix, risk_free_rate)
---> 15 results[0,i] = portfolio_std_dev
ValueError: setting an array element with a sequence.
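What am I doing wrong, and how can I fix this?

You are calling the function with its arguments in the wrong order. Swap the first two and it works fine: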
display_simulated_ef_with_random(mean_returns, cov_matrix, num_portfolios, risk_free_rate)
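To see why the swapped order produces exactly this ValueError, the following sketch isolates the standard-deviation expression from portfolio_annualised_performance. The helper name annualised_std and the 50/50 weights are assumptions made for illustration; they are not part of the original code. When the Series of mean returns lands in the cov_matrix slot, the inner np.dot collapses to a scalar, the outer np.dot then returns an array with one entry per weight, and that array cannot be stored in a single element of results.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"AAPL": [1.2, 1.4, 1.5], "GOOGL": [2.1, 2.4, 2.6]})
returns = df.pct_change()
mean_returns = returns.mean()   # Series, one entry per asset
cov_matrix = returns.cov()      # DataFrame, assets x assets
weights = np.array([0.5, 0.5])  # illustrative weights, one per asset


def annualised_std(weights, cov_matrix):
    # Hypothetical helper: same expression as in portfolio_annualised_performance
    return np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights))) * np.sqrt(252)


std_ok = annualised_std(weights, cov_matrix)     # real covariance matrix
std_bad = annualised_std(weights, mean_returns)  # what the swapped call passes

print(np.ndim(std_ok))   # 0: a scalar, fits into results[0, i]
print(np.ndim(std_bad))  # 1: an array, does not fit into one element

results = np.zeros((3, 1))
try:
    results[0, 0] = std_bad
except ValueError as exc:
    print("ValueError:", exc)
```

Passing the arguments by keyword, for example display_simulated_ef_with_random(mean_returns=mean_returns, cov_matrix=cov_matrix, ...), is one way to make this kind of mix-up harder to introduce.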
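What versions of python, pandas and numpy are you using? After changing the size of the weights to weights = np.random.random(2), this runs without errors for me, since you only provided 2 assets. I am using numpy 1.14.3 and pandas 0.23.0.
Oh, sorry, in my full example I have 12 assets. I will post a link to the full code if there is interest.
@KenSyme I see that it works for me too. I realized the problem appears at a later stage; I will post it.
Instead of telling us where the problem starts, give us the specific line of code that does not work.
I have posted the whole error, because I believe it sits deep in the data structures and cannot be fixed by changing just one line of code. But I may be completely wrong, as usual.
I did not even think of that, shame on me, but you just solved a problem I have been stuck on for two days. Thank you for your attention to detail and your time!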
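Regarding the weights length discussed in the comments, one possible way to avoid hard-coding 2 or 12 is to derive the number of weights from the data. This is only a sketch: the helper name random_weights is hypothetical, and the mean-return values are made-up numbers for illustration.

```python
import numpy as np
import pandas as pd


def random_weights(mean_returns):
    # Hypothetical helper: one random weight per asset, normalised to sum to 1
    weights = np.random.random(len(mean_returns))
    return weights / np.sum(weights)


# Made-up mean returns, only used to show the shapes
mean_returns = pd.Series({"AAPL": 0.12, "GOOGL": 0.11})
weights = random_weights(mean_returns)

print(len(weights))  # 2, same as the number of assets
```

With this, the same code runs unchanged on the two-asset example and on a twelve-asset data set.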