Pandas groupby with bin counts on a time series

Tags: python, pandas, pandas-groupby, binning

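Given the sample DataFrame

import numpy as np
import pandas as pd

data = pd.DataFrame(np.random.rand(6, 2), columns=list('ab'))
# end must be later than start; an end equal to start would not give the hourly index shown below
dti = pd.date_range(start='2019-02-12 00:00', end='2019-02-12 05:00', periods=6)
data.set_index(dti, inplace=True)

which yields: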

                            a         b
2019-02-12 00:00:00  0.909822  0.548713
2019-02-12 01:00:00  0.295730  0.452881
2019-02-12 02:00:00  0.889976  0.042893
2019-02-12 03:00:00  0.466465  0.971178
2019-02-12 04:00:00  0.532618  0.769210
2019-02-12 05:00:00  0.947362  0.021689
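Now, how can I combine grouping and binning across the two columns? Say I have bins = [0, 0.2, 0.4, 0.6, 0.8, 1]. How can I bin the data on column a and get the mean (or max, min, sum, etc.) of column b for each bin, per day, per week, or per month?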

Use pd.cut together with data.index.day (or the week or month of the index) and aggregate with min, max, mean, or sum:

bins = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
# label each bin by its own edges, e.g. '0.4-0.6'
labels = ['{}-{}'.format(i, j) for i, j in zip(bins[:-1], bins[1:])]

# assign every value of column a to its bin
s = pd.cut(data['a'], bins=bins, labels=labels)

# per day of month
df = data.groupby([data.index.day.rename('day'), s])['b'].min().reset_index()

# per week (DatetimeIndex.week was removed in pandas 2.0; isocalendar() needs pandas >= 1.1)
#df = data.groupby([data.index.isocalendar().week, s])['b'].min().reset_index()
# per month
#df = data.groupby([data.index.month.rename('month'), s])['b'].min().reset_index()
print(df)
   day        a         b
0   12  0.4-0.6  0.267070
1   12  0.6-0.8  0.637877
2   12  0.8-1.0  0.299172
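
One detail of pd.cut is worth checking before relying on the bins: by default the intervals are closed on the right, so a value exactly equal to the lowest edge (0.0 here) lands in no bin. The following is a minimal sketch with hand-picked values (the values are an assumption made for illustration, not part of the question) that shows the default behaviour and the include_lowest option:

```python
import pandas as pd

bins = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
labels = ['{}-{}'.format(i, j) for i, j in zip(bins[:-1], bins[1:])]

# Hand-picked values that sit on or near the bin edges (illustrative only).
values = pd.Series([0.0, 0.2, 0.25, 1.0])

# Default: right-closed bins, so 0.0 is outside (0.0, 0.2] and becomes NaN.
default_cut = pd.cut(values, bins=bins, labels=labels)

# include_lowest=True also closes the first bin on the left.
closed_cut = pd.cut(values, bins=bins, labels=labels, include_lowest=True)

print(default_cut.tolist())
print(closed_cut.tolist())
```

Since np.random.rand draws from [0, 1), an exact 0.0 is possible but rare; include_lowest=True is a cheap safeguard if such rows must not be dropped from the groupby.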
Several statistics can also be computed at once by passing multiple functions to agg.
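
The code for this step did not survive on the page, so here is a minimal sketch of what passing several functions to agg could look like. It reuses the values printed in the question's sample output so the result is reproducible; the name df1 and the observed=True argument are choices made for this sketch, not something the original answer is known to use.

```python
import pandas as pd

# Values copied from the sample output in the question, so no randomness is involved.
dti = pd.date_range(start='2019-02-12 00:00', end='2019-02-12 05:00', periods=6)
data = pd.DataFrame(
    {'a': [0.909822, 0.295730, 0.889976, 0.466465, 0.532618, 0.947362],
     'b': [0.548713, 0.452881, 0.042893, 0.971178, 0.769210, 0.021689]},
    index=dti,
)

bins = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
labels = ['{}-{}'.format(i, j) for i, j in zip(bins[:-1], bins[1:])]
s = pd.cut(data['a'], bins=bins, labels=labels)

# observed=True keeps only the bins that actually contain rows
# (assumption: a pandas version whose groupby accepts `observed`).
df1 = (data.groupby([data.index.day.rename('day'), s], observed=True)['b']
           .agg(['min', 'max', 'mean', 'sum'])
           .reset_index())
print(df1)
```

Each function name becomes its own column, so one groupby call yields the min, max, mean, and sum of b for every (day, bin) pair.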

Or use describe:

df3 = (data.groupby([data.index.day.rename('day'), s])['b']
           .describe()
           .reset_index())
print(df3)
   day        a  count      mean       std       min       25%       50%  \
0   12  0.4-0.6    1.0  0.267070       NaN  0.267070  0.267070  0.267070   
1   12  0.6-0.8    2.0  0.770542  0.187616  0.637877  0.704210  0.770542   
2   12  0.8-1.0    3.0  0.366001  0.058221  0.299172  0.346126  0.393081   

        75%       max  
0  0.267070  0.267070  
1  0.836874  0.903206  
2  0.399415  0.405750
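
A caveat for data that spans more than one month: data.index.day is the day of the month, so 12 February and 12 March would end up in the same group. The sketch below shows one possible alternative, pd.Grouper(freq='D'), which groups by calendar day instead. The two-month sample data is hypothetical and was made up for this illustration; it is not from the question.

```python
import pandas as pd

# Hypothetical data spanning two months, so day-of-month 12 occurs twice.
dti = pd.to_datetime(['2019-02-12 00:00', '2019-02-12 01:00',
                      '2019-03-12 00:00', '2019-03-12 01:00'])
data = pd.DataFrame({'a': [0.1, 0.5, 0.1, 0.5],
                     'b': [1.0, 2.0, 3.0, 4.0]}, index=dti)

bins = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
labels = ['{}-{}'.format(i, j) for i, j in zip(bins[:-1], bins[1:])]
s = pd.cut(data['a'], bins=bins, labels=labels)

# Grouping by day-of-month merges 12 Feb and 12 Mar into one group per bin.
by_day_number = (data.groupby([data.index.day.rename('day'), s], observed=True)['b']
                     .mean())

# pd.Grouper(freq='D') groups by calendar day, keeping the two dates apart.
by_calendar_day = (data.groupby([pd.Grouper(freq='D'), s], observed=True)['b']
                       .mean())

print(by_day_number)
print(by_calendar_day)
```

The same pattern should work with freq='W' for weekly groups; the monthly alias depends on the pandas version ('ME' in recent releases, 'M' in older ones).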