Python: How do I assign every element of a DataFrame at once?
I have OG_df, which is:
Symbol Order Shares
Date
2011-01-10 AAPL BUY 1500
2011-01-13 AAPL SELL 1500
2011-01-13 IBM BUY 4000
2011-01-26 GOOG BUY 1000
2011-02-02 XOM SELL 4000
2011-02-10 XOM BUY 4000
2011-03-03 GOOG SELL 1000
2011-03-03 GOOG SELL 2200
2011-05-03 IBM BUY 1500
2011-06-03 IBM SELL 3300
2011-06-10 AAPL BUY 1200
2011-08-01 GOOG BUY 55
2011-08-01 GOOG SELL 55
2011-12-20 AAPL SELL 1200
2011-12-21 AAPL BUY 20
2011-12-27 GOOG BUY 2200
2011-12-28 IBM SELL 2200
I also have df_prices, which is:
AAPL IBM GOOG XOM SPY CASH
2011-01-10 340.99 143.41 614.21 72.02 123.19 1.0
2011-01-11 340.18 143.06 616.01 72.56 123.63 1.0
... ... ... ... ... ... ...
2011-11-15 387.17 186.44 616.56 77.62 124.10 1.0
2011-11-16 383.13 184.33 611.47 76.79 122.13 1.0
2011-11-17 375.80 183.45 600.87 76.41 120.19 1.0
2011-11-18 373.34 182.97 594.88 76.45 120.06 1.0
2011-11-21 367.43 179.26 580.94 75.48 117.78 1.0
2011-11-22 374.90 179.09 580.00 74.61 117.31 1.0
[245 rows x 6 columns]
I set
date_range = pd.date_range(OG_df.index.min(), OG_df.index.max())
and then
df1 = pd.DataFrame(0, df_prices.index, columns=list(df_prices))
Suppose you have vals = df1.values, which is:
[[0 0 0 0 0 0]
[0 0 0 0 0 0]
[0 0 0 0 0 0]
...,
[0 0 0 0 0 0]
[0 0 0 0 0 0]
[0 0 0 0 0 0]]
with shape (245, 6).
I can also compute
cols = np.array([df1.columns.get_loc(c) for c in OG_df.Symbol])
cols returns [0 0 1 2 3 3 2 2 1 1 0 2 2 0 0 2 1]
The unique values of OG_df.Symbol are ['AAPL' 'IBM' 'GOOG' 'XOM'], so you can see that the 17 rows of OG_df map to 4 different columns.
I also have
rows = np.arange(len(df1))
I want to do something like vals[rows, cols] = some_variable, but that raises:
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (245,) (17,)
because rows has length 245 while cols has length 17.
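The error can be reproduced in isolation. The following is only a minimal sketch using placeholder arrays of the same shapes as in the question (the zero-filled cols is a stand-in, not the real column positions). It shows that NumPy pairs integer index arrays element by element, so they must have the same length or be broadcastable to a common shape.

```python
import numpy as np

vals = np.zeros((245, 6))
rows = np.arange(245)            # one entry per row of vals
cols = np.zeros(17, dtype=int)   # stand-in: one entry per order

# Lengths 245 and 17 cannot be broadcast against each other.
try:
    vals[rows, cols] = 1
except IndexError as exc:
    message = str(exc)
    print(message)

# Index arrays of equal length are paired element-wise and work fine.
vals[rows[:17], cols] = 1
```

So the fix is to build a rows array with one entry per order rather than one entry per row of df1.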
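I want to fill each cell of df1 based on some variable (which is different every time).
What should I do?
Also, I don't want to assign some_variable to the CASH column of df1.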
Example output:
            AAPL   IBM  GOOG  XOM  SPY  CASH
2011-01-10  1500     0     0    0    0   N/A
2011-01-11     0     0     0    0    0   N/A
2011-01-12     0     0     0    0    0   N/A
2011-01-13 -1500  4000     0    0    0   N/A
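One way to make the vals[rows, cols] idea work is to look up a row position for every order, so that rows and cols both have one entry per order. The sketch below is an illustration under stated assumptions, not a definitive implementation: it uses a three-order subset of OG_df and a four-day index as stand-ins, takes the sign convention from the example output above (BUY positive, SELL negative), and uses np.add.at so that several orders landing on the same cell are summed.

```python
import numpy as np
import pandas as pd

# Stand-in for OG_df: the first three orders from the question.
OG_df = pd.DataFrame(
    {
        "Symbol": ["AAPL", "AAPL", "IBM"],
        "Order": ["BUY", "SELL", "BUY"],
        "Shares": [1500, 1500, 4000],
    },
    index=pd.to_datetime(["2011-01-10", "2011-01-13", "2011-01-13"]).rename("Date"),
)

# Stand-in for df1: zeros with the same columns as df_prices.
columns = ["AAPL", "IBM", "GOOG", "XOM", "SPY", "CASH"]
df1 = pd.DataFrame(0, index=pd.date_range("2011-01-10", "2011-01-13"), columns=columns)

# One row position and one column position per order.
rows = df1.index.get_indexer(OG_df.index)
cols = df1.columns.get_indexer(OG_df["Symbol"])

# Assumed sign convention: BUY adds shares, SELL removes them.
signed = np.where(OG_df["Order"] == "BUY", OG_df["Shares"], -OG_df["Shares"])

# get_indexer returns -1 for labels it cannot find; skip those orders.
found = (rows >= 0) & (cols >= 0)

vals = df1.to_numpy(dtype=float, copy=True)  # float so CASH can hold NaN
np.add.at(vals, (rows[found], cols[found]), signed[found])  # sums repeated cells
vals[:, df1.columns.get_loc("CASH")] = np.nan

result = pd.DataFrame(vals, index=df1.index, columns=df1.columns)
print(result)
```

A plain vals[rows, cols] = signed would also run once the lengths match, but it keeps only the last value written to a repeated (row, column) pair, which is why np.add.at is used here.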
I think you are trying to recreate what pivot_table followed by reindex gives you, i.e.
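import numpy as np

df = OG_df.copy()
# BUY orders become negative share counts, SELL orders stay positive
df['Shares'] = np.where(df['Order'] == 'BUY', df['Shares'] * -1, df['Shares'])
# pivot_table averages duplicate (Date, Symbol) entries by default; pass aggfunc='sum' to total them
ndf = df.pivot_table(columns='Symbol', values='Shares', index='Date')\
        .reindex(date_range).fillna(0).assign(CASH=np.nan)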
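Sample output based on the given data (BUY is negative here, the opposite of the sign in the question's example; swap the np.where branches if BUY should be positive):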
Symbol AAPL GOOG IBM XOM CASH
2011-01-10 -1500.0 0.0 0.0 0.0 NaN
2011-01-11 0.0 0.0 0.0 0.0 NaN
2011-01-12 0.0 0.0 0.0 0.0 NaN
2011-01-13 1500.0 0.0 -4000.0 0.0 NaN
2011-01-14 0.0 0.0 0.0 0.0 NaN
2011-01-15 0.0 0.0 0.0 0.0 NaN
2011-01-16 0.0 0.0 0.0 0.0 NaN
2011-01-17 0.0 0.0 0.0 0.0 NaN
2011-01-18 0.0 0.0 0.0 0.0 NaN
2011-01-19 0.0 0.0 0.0 0.0 NaN
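If the SPY symbol appears in OG_df, the missing SPY column will be added automatically.

df[:] = vals?
I get ValueError: Must have equal len keys and value when setting with an iterable.
Access to the actual data would be very helpful. Or maybe a .
Ohk, so you want to put the OG_df.Shares data under the corresponding columns and index in the new DataFrame.
Sorry, it's still not clear... Can you show the expected output for about 5 rows?
You are missing the case where the value can be negative depending on the order: order = np.where(orders_df.order.values == 'BUY', -1, 1)
What is the value of impact?
I've gotten rid of it now.
Never mind now, I've updated the solution.
Of course, it's best to include sample data and expected output together with the bit of logic you are trying to implement.
What if I want df_prices and ndf to have the same number of rows?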
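For the last question, one option is to reindex the pivoted frame against df_prices itself instead of a calendar date_range. The sketch below assumes that is what is wanted. The df_prices here is a placeholder (constant prices on three assumed trading days), since only its index and columns matter, and aggfunc="sum" is my addition so that two same-day orders for one symbol are totalled rather than averaged.

```python
import numpy as np
import pandas as pd

# Stand-in for OG_df: the two GOOG orders placed on 2011-03-03 in the question.
OG_df = pd.DataFrame(
    {"Symbol": ["GOOG", "GOOG"], "Order": ["SELL", "SELL"], "Shares": [1000, 2200]},
    index=pd.to_datetime(["2011-03-03", "2011-03-03"]).rename("Date"),
)

# Placeholder df_prices: only the index and columns are used below.
df_prices = pd.DataFrame(
    1.0,
    index=pd.to_datetime(["2011-03-02", "2011-03-03", "2011-03-04"]),
    columns=["AAPL", "IBM", "GOOG", "XOM", "SPY", "CASH"],
)

df = OG_df.copy()
# Same sign convention as the answer: BUY negative, SELL positive.
df["Shares"] = np.where(df["Order"] == "BUY", df["Shares"] * -1, df["Shares"])

ndf = (
    df.pivot_table(index="Date", columns="Symbol", values="Shares", aggfunc="sum")
    .reindex(index=df_prices.index, columns=df_prices.columns)  # same rows and columns as df_prices
    .fillna(0)
    .assign(CASH=np.nan)
)
print(ndf)
```

Note that reindexing this way drops any order whose date is not in df_prices.index, so it is worth checking for such orders before relying on the result.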