Python: creating a summary of price movements by date in a dataframe


python pandas


I have a dataframe showing: 1) dates, 2) prices and 3) the row-by-row difference between two prices.

dates | data | result     | change
24-09    24      0           none
25-09    26      2           pos
26-09    27      1           pos
27-09    28      1           pos
28-09    26     -2           neg

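I want to create a summary of the above data in a new dataframe. The summary would have 4 columns: 1) start date, 2) end date, 3) number of days in the run, 4) run.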

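In the new dataframe there would be a new row for every change in the value of result from positive to negative. If run = 0, this indicates no change from the previous day's price, and it would also need its own row in the dataframe.

For example, using the data above, there was a positive run of +4 from 25-09 to 27-09, so I would want this in a row of the dataframe like so: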

start date | end date | num days | run 
 25-09        27-09        3        4         
 27-09        28-09        1        -2
 23-09        24-09        1        0

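I think the first step would be to create a new column "change" based on the value of run, which would then show "positive", "negative" or "no change". Then maybe I could group by this column.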
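A minimal sketch of that labelling step, assuming the labels `pos`, `neg` and `none` from the table above and using made-up `result` values. It also shows why grouping by the label alone is not enough: days from separate runs that share a label end up in the same group.

```python
import numpy as np
import pandas as pd

# Made-up data: two separate positive runs split by a negative day
df = pd.DataFrame({"result": [0, 2, 1, 1, -2, 3]})

# Map the sign of each move to a label (label names taken from the question's table)
labels = {1: "pos", -1: "neg", 0: "none"}
df["change"] = np.sign(df["result"]).map(labels)

# Grouping by label merges both positive runs into one group
by_label = df.groupby("change")["result"].sum()
print(by_label)
```

Here the two positive runs are added together under a single `pos` key, so consecutive runs need their own identifier.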

Two functions that are useful for this type of problem are diff() and cumsum().
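As a rough illustration of how the two combine, here is a small sketch on a made-up series of signs (1 = up, -1 = down, 0 = flat). The variable names are only for illustration.

```python
import pandas as pd

# Sign of each daily move (made-up values)
sign = pd.Series([0, 1, 1, 1, -1, 0])

# diff() is non-zero (or NaN on the first row) wherever the sign changes
starts = sign.diff().ne(0)

# cumsum() over the boolean flags gives each consecutive run its own number
run_id = starts.cumsum()
print(run_id.tolist())  # [1, 2, 2, 2, 3, 4]
```

Rows that share a `run_id` belong to the same run, which makes it a usable grouping key.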

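I have added some extra data points to your sample data to exercise the logic more fully.

Being able to pick and choose different (and multiple) aggregation functions for different columns is a great feature of pandas.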

import pandas as pd

df = pd.DataFrame({'dates': ['24-09', '25-09', '26-09', '27-09', '28-09', '29-09', '30-09','01-10','02-10','03-10','04-10'],
                    'data': [24, 26, 27, 28, 26,25,30,30,30,28,25],
                    'result': [0,2,1,1,-2,0,5,0,0,-2,-3]})

def cat(x):
    # 1 for a rise, -1 for a fall, 0 for no change
    return 1 if x > 0 else -1 if x < 0 else 0

df['cat'] = df['result'].map(cat) # probably there is a better way to do this

# non-zero (NaN on the first row) wherever the category differs from the previous row
df['change'] = df['cat'].diff()
df['change_flag'] = (df['change'] != 0).astype(int)
df['change_cum_sum'] = df['change_flag'].cumsum() # which gives us our groupings

# 'first'/'last' rely on row order, since the 'dd-mm' strings do not sort chronologically
foo = df.groupby('change_cum_sum').agg(
    start=('dates', 'first'),
    end=('dates', 'last'),
    num_days=('dates', 'count'),
    run=('result', 'sum'),
).reset_index()
foo.columns = ['id', 'start date', 'end date', 'num days', 'run']
print(foo)

   id start date end date  num days  run
0   1      24-09    24-09         1    0
1   2      25-09    27-09         3    4
2   3      28-09    28-09         1   -2
3   4      29-09    29-09         1    0
4   5      30-09    30-09         1    5
5   6      01-10    02-10         2    0
6   7      03-10    04-10         2   -5
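The same idea can be wrapped in a helper. The sketch below is one possible variant, not the code above: `summarize_runs` is a hypothetical name, it uses `np.sign` in place of the `cat` function, and it compares each sign with `shift()` rather than calling `diff()`. It assumes the rows are already in date order.

```python
import numpy as np
import pandas as pd


def summarize_runs(df, date_col="dates", value_col="result"):
    """Collapse consecutive rows with the same sign into one row per run."""
    sign = np.sign(df[value_col])
    # A new run starts wherever the sign differs from the previous row
    run_id = sign.ne(sign.shift()).cumsum().rename("run_id")
    out = df.groupby(run_id).agg(
        start=(date_col, "first"),
        end=(date_col, "last"),
        num_days=(date_col, "count"),
        run=(value_col, "sum"),
    )
    return out.reset_index(drop=True)


# The five rows from the question
df = pd.DataFrame({
    "dates": ["24-09", "25-09", "26-09", "27-09", "28-09"],
    "result": [0, 2, 1, 1, -2],
})
summary = summarize_runs(df)
print(summary)
```

On the question's five rows this gives three runs: a flat day on 24-09, a 3-day run of +4 from 25-09 to 27-09, and a 1-day run of -2 on 28-09.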