Python: fill in missing dates by group in pandas

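I need to fill missing dates downward within each group. Below is the code that creates the DataFrame. I only want to carry a date in the `fill` column forward until the next date appears in `fill`, or until the group (`name`) changes.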

    import numpy as np
    import pandas as pd

    data = {'tdate': [20080815, 20080915, 20081226, 20090110, 20090131, 20080807, 20080831,
                      20080918, 20081023, 20081114, 20081207, 20090117, 20090203, 20090219,
                      20090305, 20090318, 20090501],
            'name': ['A','A','A','A','A','B','B','B','B','B','B','B','B','B','B','B','B'],
            # np.nan marks the rows that still need a value
            'fill': [np.nan, np.nan, 20080915, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 20081023,
                     np.nan, np.nan, np.nan, np.nan, 20090219, np.nan, np.nan]}

    df = pd.DataFrame(data, columns=['tdate', 'name', 'fill'])
    df
Current DataFrame:

     tdate       name fill
0    20080815    A   NaN
1    20080915    A   NaN
2    20081226    A   20080915
3    20090110    A   NaN
4    20090131    A   NaN
5    20080807    B   NaN
6    20080831    B   NaN
7    20080918    B   NaN
8    20081023    B   NaN
9    20081114    B   20081023
10   20081207    B   NaN
11   20090117    B   NaN
12   20090203    B   NaN
13   20090219    B   NaN
14   20090305    B   20090219
15   20090318    B   NaN
16   20090501    B   NaN
Expected output:

     tdate       name fill
0    20080815    A   NaN
1    20080915    A   NaN
2    20081226    A   20080915
3    20090110    A   20080915
4    20090131    A   20080915
5    20080807    B   NaN
6    20080831    B   NaN
7    20080918    B   NaN
8    20081023    B   NaN
9    20081114    B   NaN
10   20081207    B   20081023
11   20090117    B   20081023
12   20090203    B   20081023
13   20090219    B   20081023
14   20090305    B   20081023
15   20090318    B   20090219
16   20090501    B   20090219
Here is my code:

df.groupby(df["name"])["fill"].fill()

You are very close. A groupby object has no `fill` method; what you need is a forward fill, `ffill`:

df.groupby('name')["fill"].ffill()
Out[42]: 
0          NaN
1          NaN
2     20080915
3     20080915
4     20080915
5          NaN
6          NaN
7          NaN
8          NaN
9     20081023
10    20081023
11    20081023
12    20081023
13    20081023
14    20090219
15    20090219
16    20090219
dtype: float64
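Or, equivalently (note that `fillna(method=...)` is deprecated in recent pandas releases, so `ffill()` is the safer spelling):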

df.groupby('name')["fill"].fillna(method='ffill')
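
To keep the filled values alongside the original data, the grouped forward fill can be assigned back as a column, because the result keeps the original row index. The following is a minimal sketch built from the question's data; the column name `fill_ffilled` is a hypothetical choice for illustration, not something from the original post.

```python
import numpy as np
import pandas as pd

data = {
    "tdate": [20080815, 20080915, 20081226, 20090110, 20090131,
              20080807, 20080831, 20080918, 20081023, 20081114, 20081207,
              20090117, 20090203, 20090219, 20090305, 20090318, 20090501],
    "name": ["A"] * 5 + ["B"] * 12,
    "fill": [np.nan, np.nan, 20080915, np.nan, np.nan,
             np.nan, np.nan, np.nan, np.nan, 20081023,
             np.nan, np.nan, np.nan, np.nan, 20090219, np.nan, np.nan],
}
df = pd.DataFrame(data, columns=["tdate", "name", "fill"])

# Forward-fill within each 'name' group. The result is indexed like df,
# so it can be stored directly as a new column.
# 'fill_ffilled' is a hypothetical column name used only for this sketch.
df["fill_ffilled"] = df.groupby("name")["fill"].ffill()

print(df)
```

Because the fill is done per group, the last value of group A (row 4) is not carried into the first rows of group B, which stay NaN until B's first non-missing `fill` value.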