Python: rolling statistics over a fixed period or over the available data

I want to compute a rolling statistic over a fixed period, say 5 days:

                 DATE       Price
ID          
AAPL US Equity  2015-01-02  109.33
AAPL US Equity  2015-01-05  106.25
AAPL US Equity  2015-01-06  106.26
AAPL US Equity  2015-01-07  107.75
AAPL US Equity  2015-01-08  111.89
AAPL US Equity  2015-01-09  112.01
AAPL US Equity  2015-01-12  109.25
AAPL US Equity  2015-01-13  110.22
AAPL US Equity  2015-01-14  109.80
AAPL US Equity  2015-01-15  106.82

to give:

                DATE        Price   Average
ID                  
AAPL US Equity  2015-01-02  109.33  NaN
AAPL US Equity  2015-01-05  106.25  NaN
AAPL US Equity  2015-01-06  106.26  NaN
AAPL US Equity  2015-01-07  107.75  NaN
AAPL US Equity  2015-01-08  111.89  108.296
AAPL US Equity  2015-01-09  112.01  108.832
AAPL US Equity  2015-01-12  109.25  109.432
AAPL US Equity  2015-01-13  110.22  110.224
AAPL US Equity  2015-01-14  109.80  110.634
AAPL US Equity  2015-01-15  106.82  109.620

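How can I modify this so that I can apply any function to get any rolling statistic over a fixed period, but use whatever data is available for the first few rows? That is, the average in the first row would be the first day's price, the average in the second row would be the mean of the first two days' prices, and so on.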

I know I can use iterrows to do this for the "average" case, but ideally I would like it to work for any statistic, such as quantile, std, etc.

Something like:

df['Average'] = my_rolling_stat(df['Price'], period=5, function='mean')
df['Stdev'] = my_rolling_stat(df['Price'], period=10, function='std')
df['95_Perc'] = my_rolling_stat(df['Price'], period=10, function='quantile', quantile_value=0.95)

Thanks in advance.

IIUC, use the min_periods parameter of rolling:

df['Average'] = df['Price'].rolling(5, min_periods=1).mean()

Output:

0    109.3300
1    107.7900
2    107.2800
3    107.3975
4    108.2960
5    108.8320
6    109.4320
7    110.2240
8    110.6340
9    109.6200
Name: Price, dtype: float64
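
To make the effect of min_periods easier to see, here is a minimal, self-contained sketch. It rebuilds the question's prices by hand (the frame layout with a plain DATE column is an assumption, not the asker's actual frame) and compares the default behaviour with min_periods=1. The variable names strict and partial are only illustrative.

```python
import pandas as pd

# Prices and dates copied from the question; the DataFrame layout is assumed.
dates = pd.to_datetime([
    "2015-01-02", "2015-01-05", "2015-01-06", "2015-01-07", "2015-01-08",
    "2015-01-09", "2015-01-12", "2015-01-13", "2015-01-14", "2015-01-15",
])
prices = [109.33, 106.25, 106.26, 107.75, 111.89,
          112.01, 109.25, 110.22, 109.80, 106.82]
df = pd.DataFrame({"DATE": dates, "Price": prices})

# Default: min_periods equals the window size for an integer window,
# so the first 4 rows are NaN.
strict = df["Price"].rolling(5).mean()

# min_periods=1: a value is produced as soon as one observation exists,
# so the first rows are computed over whatever data is available.
partial = df["Price"].rolling(5, min_periods=1).mean()

df["Average"] = partial
# The same window settings work for other statistics, e.g. std.
df["Stdev"] = df["Price"].rolling(5, min_periods=1).std()

print(df)
```

One thing to keep in mind: with min_periods=1 the first Stdev value is still NaN, because the sample standard deviation (ddof=1, the default) is undefined for a single observation.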

To add to Scott Boston's answer, you can define the rolling-statistic function as:

def my_rolling_stat(series, period, function, **kwargs):
    window = series.rolling(period, min_periods=1)
    func = getattr(window, function)

    return func(**kwargs)
Usage:

my_rolling_stat(df['Price'], period=5, function='mean')
my_rolling_stat(df['Price'], period=10, function='std')
my_rolling_stat(df['Price'], period=10, function='quantile', quantile=0.95)

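The list of available functions and their arguments can be found in the pandas documentation on rolling window functions.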
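
As a slightly more defensive variant of the same idea, here is a sketch under a few assumptions: the name rolling_stat, the min_periods keyword on the wrapper, and the ValueError for unknown names are my own additions, not part of the answer above. It also passes the quantile positionally, because, as far as I know, newer pandas releases renamed that keyword from quantile to q, and a positional argument sidesteps the difference.

```python
import pandas as pd


def rolling_stat(series, period, function, *args, min_periods=1, **kwargs):
    """Call the Rolling method named `function` over a window of `period` rows.

    Hypothetical helper: extra positional and keyword arguments are forwarded
    to the Rolling method unchanged.
    """
    window = series.rolling(period, min_periods=min_periods)
    method = getattr(window, function, None)
    if not callable(method):
        raise ValueError(f"unknown rolling statistic: {function!r}")
    return method(*args, **kwargs)


# Prices copied from the question.
prices = pd.Series(
    [109.33, 106.25, 106.26, 107.75, 111.89,
     112.01, 109.25, 110.22, 109.80, 106.82],
    name="Price",
)

average = rolling_stat(prices, 5, "mean")
stdev = rolling_stat(prices, 10, "std")
# Quantile passed positionally to avoid depending on the keyword name.
p95 = rolling_stat(prices, 10, "quantile", 0.95)
# Statistics without a built-in method can go through Rolling.apply;
# raw=True hands each window to the function as a NumPy array.
price_range = rolling_stat(prices, 5, "apply",
                           lambda w: w.max() - w.min(), raw=True)

print(pd.DataFrame({"Average": average, "Stdev": stdev,
                    "95_Perc": p95, "Range": price_range}))
```

Looking the method up by name keeps the call sites close to what the question asked for, while "apply" covers anything pandas does not provide directly.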
