Python: how to assign each element of a DataFrame at once?


I have OG_df, which is:

           Symbol Order  Shares
Date                           
2011-01-10   AAPL   BUY    1500
2011-01-13   AAPL  SELL    1500
2011-01-13    IBM   BUY    4000
2011-01-26   GOOG   BUY    1000
2011-02-02    XOM  SELL    4000
2011-02-10    XOM   BUY    4000
2011-03-03   GOOG  SELL    1000
2011-03-03   GOOG  SELL    2200
2011-05-03    IBM   BUY    1500
2011-06-03    IBM  SELL    3300
2011-06-10   AAPL   BUY    1200
2011-08-01   GOOG   BUY      55
2011-08-01   GOOG  SELL      55
2011-12-20   AAPL  SELL    1200
2011-12-21   AAPL   BUY      20
2011-12-27   GOOG   BUY    2200
2011-12-28    IBM  SELL    2200
I also have df_prices, which is:

          AAPL     IBM    GOOG    XOM     SPY  CASH
2011-01-10  340.99  143.41  614.21  72.02  123.19   1.0
2011-01-11  340.18  143.06  616.01  72.56  123.63   1.0
...            ...     ...     ...    ...     ...   ...
2011-11-15  387.17  186.44  616.56  77.62  124.10   1.0
2011-11-16  383.13  184.33  611.47  76.79  122.13   1.0
2011-11-17  375.80  183.45  600.87  76.41  120.19   1.0
2011-11-18  373.34  182.97  594.88  76.45  120.06   1.0
2011-11-21  367.43  179.26  580.94  75.48  117.78   1.0
2011-11-22  374.90  179.09  580.00  74.61  117.31   1.0
[245 rows x 6 columns]
I set date_range = pd.date_range(OG_df.index.min(), OG_df.index.max()), and then

df1 = pd.DataFrame(0, df_prices.index, columns=list(df_prices))
Say I have vals = df1.values, which is:

[[0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 ..., 
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]]
with shape (245, 6).

I can also compute

cols = np.array([df1.columns.get_loc(c) for c in OG_df.Symbol])
cols
which returns
[0 0 1 2 3 2 2 1 0 2 0 2 1]

The values in OG_df.Symbol are ['AAPL' 'IBM' 'GOOG' 'XOM'], so the 17 rows of OG_df map onto 4 different columns.

I also have

rows = np.arange(len(df1))
I want to do something like vals[rows, cols] = some_variable, but that raises:

IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (245,) (17,) 
because rows has length 245 and cols has length 17.
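
The mismatch follows from NumPy's rule that integer index arrays used together are broadcast against each other, so rows and cols have to describe the same number of cells. A minimal sketch of one way to pair them up is below. It assumes a made-up three-order subset and a hypothetical four-day index rather than the real 245-row data, and it assumes that BUY counts as positive and SELL as negative. Each order's row position is looked up with Index.get_indexer, and np.add.at accumulates the values so that several orders landing in the same cell are summed.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's data (not the real 245-row frame).
orders = pd.DataFrame(
    {"Symbol": ["AAPL", "AAPL", "IBM"],
     "Order": ["BUY", "SELL", "BUY"],
     "Shares": [1500, 1500, 4000]},
    index=pd.to_datetime(["2011-01-10", "2011-01-13", "2011-01-13"]),
)
dates = pd.date_range("2011-01-10", "2011-01-13")
df1 = pd.DataFrame(0, index=dates,
                   columns=["AAPL", "IBM", "GOOG", "XOM", "SPY", "CASH"])

vals = df1.to_numpy().copy()

# One (row, col) pair per order, so both arrays have length len(orders).
rows = df1.index.get_indexer(orders.index)
cols = df1.columns.get_indexer(orders["Symbol"].to_numpy())

# Assumed sign convention: BUY adds shares, SELL removes them.
signed = np.where(orders["Order"] == "BUY", 1, -1) * orders["Shares"].to_numpy()

# np.add.at accumulates correctly even when a (row, col) pair repeats.
np.add.at(vals, (rows, cols), signed)

result = pd.DataFrame(vals, index=df1.index, columns=df1.columns)
```

The CASH column is never indexed here, so it keeps its initial zeros; whether it should instead hold a missing-value marker is a separate choice.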

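I want to fill each cell of df1 from some_variable, whose value is different each time.

How can I do that?

Also, I don't want to assign some_variable to the CASH column of df1.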

Example output:

              AAPL    IBM   GOOG    XOM    SPY  CASH
2011-01-10    1500      0      0      0      0   N/A
2011-01-11       0      0      0      0      0   N/A
2011-01-12       0      0      0      0      0   N/A
2011-01-13   -1500   4000      0      0      0   N/A

I think you are trying to recreate pivot_table followed by reindex, i.e.

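import numpy as np

df = OG_df.copy()

# Sign the share counts: BUY becomes negative, SELL stays positive
df['Shares'] = np.where(df['Order'] == 'BUY', df['Shares'] * -1, df['Shares'])

# One column per symbol, one row per calendar day in date_range; CASH is left as NaN
ndf = df.pivot_table(columns='Symbol', values='Shares', index='Date')\
       .reindex(date_range).fillna(0).assign(CASH=np.nan)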
Sample output based on the given data:

Symbol        AAPL  GOOG     IBM  XOM  CASH
2011-01-10 -1500.0   0.0     0.0  0.0   NaN
2011-01-11     0.0   0.0     0.0  0.0   NaN
2011-01-12     0.0   0.0     0.0  0.0   NaN
2011-01-13  1500.0   0.0 -4000.0  0.0   NaN
2011-01-14     0.0   0.0     0.0  0.0   NaN
2011-01-15     0.0   0.0     0.0  0.0   NaN
2011-01-16     0.0   0.0     0.0  0.0   NaN
2011-01-17     0.0   0.0     0.0  0.0   NaN
2011-01-18     0.0   0.0     0.0  0.0   NaN
2011-01-19     0.0   0.0     0.0  0.0   NaN
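
One detail worth checking: pivot_table aggregates with the mean by default, so two orders for the same symbol on the same date (for instance the two GOOG sells on 2011-03-03) would be averaged rather than added. The sketch below is a hedged variant that passes aggfunc='sum', assuming that netting same-day orders is what is wanted. It uses three rows retyped from the question's OG_df and keeps the answer's sign convention (BUY negative).

```python
import numpy as np
import pandas as pd

# A few rows retyped from OG_df in the question, including a same-day duplicate.
OG_df = pd.DataFrame(
    {"Symbol": ["AAPL", "GOOG", "GOOG"],
     "Order": ["BUY", "SELL", "SELL"],
     "Shares": [1500, 1000, 2200]},
    index=pd.DatetimeIndex(["2011-01-10", "2011-03-03", "2011-03-03"], name="Date"),
)

df = OG_df.copy()
# Same sign convention as the answer: BUY is negative, SELL is positive.
df["Shares"] = np.where(df["Order"] == "BUY", -df["Shares"], df["Shares"])

date_range = pd.date_range(df.index.min(), df.index.max())

ndf = (
    df.pivot_table(index="Date", columns="Symbol", values="Shares", aggfunc="sum")
      .reindex(date_range)   # one row per calendar day
      .fillna(0)             # days without orders become 0
)
```

With the default aggregation the 2011-03-03 GOOG cell would hold the average of the two sells; with aggfunc='sum' it holds their total.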
The missing SPY column will be added automatically if the SPY symbol appears in OG_df.
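
If a SPY column is wanted even when no order mentions SPY, one option is to reindex the columns against the price columns. This is only a sketch: the small frame and the price_columns list below are hypothetical stand-ins for ndf and list(df_prices).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: a pivoted result without SPY, and the price columns.
ndf = pd.DataFrame(
    {"AAPL": [-1500.0, 1500.0], "IBM": [0.0, -4000.0]},
    index=pd.to_datetime(["2011-01-10", "2011-01-13"]),
)
price_columns = ["AAPL", "IBM", "GOOG", "XOM", "SPY", "CASH"]

# Columns missing from ndf are created and filled with 0;
# the column order follows price_columns.
full = ndf.reindex(columns=price_columns, fill_value=0)

# CASH is left unassigned (NaN), as in the answer.
full["CASH"] = np.nan
```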

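df[:] = vals?

I get ValueError: Must have equal len keys and value when setting with an iterable.

Access to the actual data would be very helpful. Or maybe a .

Ohk, so you want to put the OG_df.Shares data under the corresponding column and index in the new dataframe.

Sorry, it's still not clear... Can you show about 5 rows of the expected output?

You are missing the case where the value can be negative depending on the order: order = np.where(orders_df.order.values == 'BUY', -1, 1)

What is the value of impact?

I have gotten rid of it now.

Never mind, I have updated the solution. Of course, it is best to include the sample data and expected output together with the logic you are trying to implement.

What if I want ndf to have the same number of rows as df_prices?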
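
For that last question, one possibility (a sketch, not the answerer's own reply) is to reindex against the index of df_prices instead of the calendar-day date_range. The frames below are hypothetical stand-ins for the pivoted orders and for df_prices.index.

```python
import pandas as pd

# Hypothetical stand-ins for the pivoted orders and the index of df_prices.
pivoted = pd.DataFrame(
    {"AAPL": [-1500.0, 1500.0], "IBM": [0.0, -4000.0]},
    index=pd.to_datetime(["2011-01-10", "2011-01-13"]),
)
price_index = pd.to_datetime(
    ["2011-01-10", "2011-01-11", "2011-01-12", "2011-01-13", "2011-01-14"]
)

# Align on the price dates instead of a calendar-day range.
ndf = pivoted.reindex(price_index).fillna(0)
```

Note that reindex silently drops any order whose date is not in the price index, so it may be worth checking for such rows first.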