
Python: forward-fill columns for one year after the last observation


I forward-fill the values in my df with the following:

df = (df.resample('d') # ensure data is daily time series
 .ffill()
 .sort_index(ascending=True)) 
df before forward filling:

id                 a          b          c          d
datadate                                              
1980-01-31        NaN        NaN        NaN        NaN
1980-02-29        NaN         2         NaN        NaN
1980-03-31        NaN        NaN        NaN        NaN
1980-04-30         1         NaN         3          4
1980-05-31        NaN        NaN        NaN        NaN
              ...        ...        ...        ...
2019-08-31        NaN        NaN        NaN        NaN
2019-09-30        NaN        NaN        NaN        NaN
2019-10-31        NaN        NaN        NaN        NaN
2019-11-30        NaN        NaN        NaN        NaN
2019-12-31        NaN        NaN        20         33
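However, I only want the forward fill to run for one year after the last observation (the dates are datetime); the remaining rows should simply be left as NaN. I'm not sure what the best way is to introduce this criterion into the task. Any help would be great.

Thanks!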

If I understand correctly, you want to forward-fill the values from 2019-12-31 into the following year. Try this:

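import pandas as pd

end_date = df.index.max()
new_end_date = end_date + pd.offsets.DateOffset(years=1)
# Daily dates after end_date up to new_end_date; [1:] replaces closed='right', which pandas 2.0 removed
new_index = df.index.append(pd.date_range(end_date, new_end_date, freq='D')[1:])

df = df.reindex(new_index)
# Forward-fill only from the last observation onward
df.loc[end_date:, :] = df.loc[end_date:, :].ffill()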
Result:

              a    b     c     d
1980-01-31  NaN  NaN   NaN   NaN
1980-02-29  NaN  2.0   NaN   NaN
1980-03-31  NaN  NaN   NaN   NaN
1980-04-30  1.0  NaN   3.0   4.0
1980-05-31  NaN  NaN   NaN   NaN
2019-08-31  NaN  NaN   NaN   NaN
2019-09-30  NaN  NaN   NaN   NaN
2019-10-31  NaN  NaN   NaN   NaN
2019-11-30  NaN  NaN   NaN   NaN
2019-12-31  NaN  NaN  20.0  33.0
2020-01-01  NaN  NaN  20.0  33.0
2020-01-02  NaN  NaN  20.0  33.0
...
2020-12-31  NaN  NaN  20.0  33.0

One solution is to forward-fill with the limit parameter (on daily data, 365 rows is roughly one year), but this does not handle leap years:

df.ffill(limit=365)
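
To illustrate why a fixed limit of 365 rows cannot stand in for a calendar year, here is a small sketch that counts the days between a date and the same date one year later. The helper name days_until_next_year is made up for this illustration and is not part of pandas.

```python
import pandas as pd

def days_until_next_year(start):
    """Number of days between `start` and the same date one year later."""
    start = pd.Timestamp(start)
    return (start + pd.DateOffset(years=1) - start).days

print(days_until_next_year("2019-01-31"))  # 365
print(days_until_next_year("2019-12-31"))  # 366, the span contains 2020-02-29
```

The length of "one year" depends on where the observation falls, so no single row count matches it for every observation date.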
A second solution is to define a more robust function that forward-fills within a 1-year window:

from pandas.tseries.offsets import DateOffset

def fun(serie_df):
    serie = serie_df.copy()
    # Dates that hold a real (non-NaN) observation
    indexes = serie[serie.notna()].index

    for idx in indexes:
        # Rows in the half-open window [idx, idx + 1 year)
        mask = (serie.index >= idx) & (serie.index < idx + DateOffset(years=1))
        serie.loc[mask] = serie[mask].ffill()
    return serie

# Apply the function column by column
df_filled = df.apply(fun, axis=0)
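If a column has several non-NaN values within the same 1-year window, the first fill stops once the more recent value is encountered. The second solution treats consecutive values as independent: each observation starts its own 1-year window.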
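
As an alternative to looping over every observation, the same half-open window rule can be sketched without the inner loop: forward-fill everything, then blank out any value whose source observation is a year or more old. This is not the method from the answers above; the name ffill_one_year and the demo frame are assumptions made for this illustration, and the index is assumed to be a sorted, unique DatetimeIndex.

```python
import numpy as np
import pandas as pd

def ffill_one_year(df):
    """Forward-fill each column, keeping only fills younger than one calendar year."""
    dates = pd.Series(df.index, index=df.index)
    out = df.ffill()
    for col in df.columns:
        # Date of the most recent real observation at or before each row
        last_obs = dates.where(df[col].notna()).ffill()
        # Keep a value only inside [last_obs, last_obs + 1 year)
        fresh = dates < last_obs + pd.DateOffset(years=1)
        out[col] = out[col].where(fresh)
    return out

# Tiny daily frame with a single observation on 2019-12-31 (made-up data)
idx = pd.date_range("2019-12-29", "2021-01-02", freq="D")
demo = pd.DataFrame({"c": np.nan}, index=idx)
demo.loc[pd.Timestamp("2019-12-31"), "c"] = 20.0

result = ffill_one_year(demo)
print(result.loc["2020-12-29":"2021-01-01"])
```

In this demo the value 20.0 is carried through 2020-12-30, and the rows from 2020-12-31 onward stay NaN, which matches the strict `<` comparison used in the loop version.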
