Python: how do I create a new DataFrame when running an analysis on a pivot table?

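I'm trying to clean up my DataFrame so that I can create a chart showing profit over time. I grouped the DataFrame by Symbol, thinking it would make the analysis easier to apply correctly, but I'm not sure about the logic here. I'd like to create a separate DataFrame like the example shown below.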

Code:

import pandas as pd

data = {'Action': ['BUY_TO_OPEN', 'SELL_TO_CLOSE', 'BUY_TO_OPEN', 'SELL_TO_CLOSE', 'BUY_TO_OPEN', 'SELL_TO_CLOSE', 'BUY_TO_OPEN', 'SELL_TO_CLOSE', 'BUY_TO_OPEN', 'SELL_TO_CLOSE'],
        'Date': ['2020-07-23', '2020-07-29', '2020-06-04', '2020-06-24', '2020-07-17', '2020-07-21', '2020-05-28', '2020-05-28', '2020-06-29', '2020-07-20'],
        'Quantity': [200.0, 200.0, 130.0, 130.0, 100.0, 100.0, 100.0, 100.0, 120.0, 120.0],
        'Symbol': ['ACHV', 'ACHV', 'ACST', 'ACST', 'AGE', 'AGE', 'AIKI', 'AIKI', 'AIKI', 'AIKI'],
        'tot_value': [-96.16, 163.81, -100.2, 83.07, -100.08, 149.9, -74.08, 71.91, -100.9, 153.48]}

df_trade = pd.DataFrame(data)

df_trade.Date = pd.to_datetime(df_trade.Date)

df_trades = pd.pivot_table(df_trade, index=['Symbol', 'Date', 'Action', 'Quantity'], values=['tot_value'])

print(df_trades)

Pivot table:

                                         tot_value
Symbol Date       Action        Quantity
ACHV   2020-07-23 BUY_TO_OPEN   200.0        -96.16
       2020-07-29 SELL_TO_CLOSE 200.0        163.81
ACST   2020-06-04 BUY_TO_OPEN   130.0       -100.20
       2020-06-24 SELL_TO_CLOSE 130.0         83.07
AGE    2020-07-17 BUY_TO_OPEN   100.0       -100.08
       2020-07-21 SELL_TO_CLOSE 100.0        149.90
AIKI   2020-05-28 BUY_TO_OPEN   100.0        -74.08
                  SELL_TO_CLOSE 100.0         71.91
       2020-06-29 BUY_TO_OPEN   120.0       -100.90
       2020-07-20 SELL_TO_CLOSE 120.0        153.48
Example of the DataFrame I'm trying to achieve:

Here Date is the SELL_TO_CLOSE date, and the Profit column is the sum of tot_value computed for each BUY_TO_OPEN / SELL_TO_CLOSE pair:

Symbol Date          Profit
ACHV   2020-07-29    67.65
ACST   2020-06-24   -17.13
AGE    2020-07-21    49.82
AIKI   2020-05-28    -2.17
AIKI   2020-07-20    52.58  

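Starting from the pivot table, you can exploit the fact that the rows are ordered and come in pairs: test where Action equals BUY_TO_OPEN, take the cumsum of that boolean mask to get one id per trade, and group by that id: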

print (df_trades.reset_index()
                .groupby((df_trades.index.get_level_values('Action')
                          =='BUY_TO_OPEN').cumsum())
                .agg({'Symbol':'first', 'Date':'last','tot_value':'sum'})
                .rename(columns={'tot_value':'Profit'})
      )
  Symbol        Date  Profit
1   ACHV  2020-07-29   67.65
2   ACST  2020-06-24  -17.13
3    AGE  2020-07-21   49.82
4   AIKI  2020-05-28   -2.17
5   AIKI  2020-07-20   52.58
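
The same idea can be sketched directly on the flat DataFrame, without the pivot step. This is only a minimal sketch under some assumptions: the names df_sorted, trade_id and df_profit are made up for illustration, every BUY_TO_OPEN is assumed to be followed by exactly one matching SELL_TO_CLOSE, and same-day trades are assumed to sort correctly because 'BUY_TO_OPEN' comes before 'SELL_TO_CLOSE' alphabetically.

```python
import pandas as pd

data = {'Action': ['BUY_TO_OPEN', 'SELL_TO_CLOSE'] * 5,
        'Date': ['2020-07-23', '2020-07-29', '2020-06-04', '2020-06-24', '2020-07-17',
                 '2020-07-21', '2020-05-28', '2020-05-28', '2020-06-29', '2020-07-20'],
        'Quantity': [200.0, 200.0, 130.0, 130.0, 100.0, 100.0, 100.0, 100.0, 120.0, 120.0],
        'Symbol': ['ACHV', 'ACHV', 'ACST', 'ACST', 'AGE', 'AGE', 'AIKI', 'AIKI', 'AIKI', 'AIKI'],
        'tot_value': [-96.16, 163.81, -100.2, 83.07, -100.08, 149.9, -74.08, 71.91, -100.9, 153.48]}

df_trade = pd.DataFrame(data)
df_trade['Date'] = pd.to_datetime(df_trade['Date'])

# Assumption: after sorting, each BUY_TO_OPEN row is directly followed
# by its SELL_TO_CLOSE row (the same order the pivot table produces).
df_sorted = df_trade.sort_values(['Symbol', 'Date', 'Action']).reset_index(drop=True)

# Each BUY_TO_OPEN starts a new trade, so a running count of them
# gives both rows of a pair the same id.
trade_id = (df_sorted['Action'] == 'BUY_TO_OPEN').cumsum().rename('trade_id')

df_profit = (df_sorted.groupby(trade_id)
                      .agg(Symbol=('Symbol', 'first'),    # symbol of the pair
                           Date=('Date', 'last'),         # the SELL_TO_CLOSE date
                           Profit=('tot_value', 'sum'))   # buy cost + sell proceeds
                      .reset_index(drop=True))
df_profit['Profit'] = df_profit['Profit'].round(2)

print(df_profit)
```

For a profit-over-time chart, one option would be to sort df_profit by Date and plot Profit (or its cumsum) against Date.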

Have you tried using groupby on the df_trades DataFrame?
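
For what it's worth, a plain groupby on Symbol alone is probably not enough for the output the question asks for, because it merges both AIKI round trips into a single row. A small sketch, using only the Symbol and tot_value columns from the question (per_symbol is a name chosen here for illustration):

```python
import pandas as pd

df_trade = pd.DataFrame({
    'Symbol': ['ACHV', 'ACHV', 'ACST', 'ACST', 'AGE', 'AGE', 'AIKI', 'AIKI', 'AIKI', 'AIKI'],
    'tot_value': [-96.16, 163.81, -100.2, 83.07, -100.08, 149.9, -74.08, 71.91, -100.9, 153.48],
})

# One row per symbol: the two AIKI trades are summed together,
# so the per-trade dates and profits are lost.
per_symbol = df_trade.groupby('Symbol')['tot_value'].sum().round(2)

print(per_symbol)
```

Here the two AIKI trades (-2.17 and 52.58) collapse into one total of 50.41, which is why a per-trade key such as the cumsum of the BUY_TO_OPEN mask is needed.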