Python: fill in missing dates by group in pandas
I need to fill missing dates downward within each group. Below is the code that creates the dataframe. I only want to carry a date in the fill column forward until the date in the fill column changes, or until the group "name" changes.
import pandas as pd
from numpy import nan as NaN  # the bare NaN used below is numpy's nan

data = {'tdate': [20080815,20080915,20081226,20090110,20090131,20080807,20080831,
                  20080918,20081023,20081114,20081207,20090117,20090203,20090219,20090305,20090318,20090501],
        'name': ['A','A','A','A','A','B','B','B','B','B','B','B','B','B','B','B','B'],
        'fill': [NaN,NaN,20080915,NaN,NaN,NaN,NaN,NaN,NaN,20081023,
                 NaN,NaN,NaN,NaN,20090219,NaN,NaN]}
df = pd.DataFrame(data, columns=['tdate', 'name', 'fill'])
df
Current dataframe
       tdate  name      fill
0   20080815     A       NaN
1   20080915     A       NaN
2   20081226     A  20080915
3   20090110     A       NaN
4   20090131     A       NaN
5   20080807     B       NaN
6   20080831     B       NaN
7   20080918     B       NaN
8   20081023     B       NaN
9   20081114     B  20081023
10  20081207     B       NaN
11  20090117     B       NaN
12  20090203     B       NaN
13  20090219     B       NaN
14  20090305     B  20090219
15  20090318     B       NaN
16  20090501     B       NaN
Desired output
       tdate  name      fill
0   20080815     A       NaN
1   20080915     A       NaN
2   20081226     A  20080915
3   20090110     A  20080915
4   20090131     A  20080915
5   20080807     B       NaN
6   20080831     B       NaN
7   20080918     B       NaN
8   20081023     B       NaN
9   20081114     B       NaN
10  20081207     B  20081023
11  20090117     B  20081023
12  20090203     B  20081023
13  20090219     B  20081023
14  20090305     B  20081023
15  20090318     B  20090219
16  20090501     B  20090219
Here is my code:
df.groupby(df["name"])["fill"].fill()
You were very close; you just need to forward-fill (ffill) rather than plain fill:
df.groupby('name')["fill"].ffill()
Out[42]:
0        NaN
1        NaN
2   20080915
3   20080915
4   20080915
5        NaN
6        NaN
7        NaN
8        NaN
9   20081023
10  20081023
11  20081023
12  20081023
13  20081023
14  20090219
15  20090219
16  20090219
dtype: float64
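Or, equivalently (on older pandas versions; the method argument of fillna is deprecated in recent releases, so ffill() is the safer choice):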
df.groupby('name')["fill"].fillna(method='ffill')
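The calls above return a new Series and leave df unchanged, so the result still has to be written back to the column. A minimal, self-contained sketch of that step, using the question's data. The final cast to the nullable Int64 dtype is an optional extra that is not part of the original answer; it is there only to keep the dates displayed as integers alongside missing values.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Same data as in the question.
df = pd.DataFrame({
    "tdate": [20080815, 20080915, 20081226, 20090110, 20090131, 20080807,
              20080831, 20080918, 20081023, 20081114, 20081207, 20090117,
              20090203, 20090219, 20090305, 20090318, 20090501],
    "name": ["A"] * 5 + ["B"] * 12,
    "fill": [nan, nan, 20080915, nan, nan, nan, nan, nan, nan, 20081023,
             nan, nan, nan, nan, 20090219, nan, nan],
})

# Forward-fill within each group; values never cross from group A into B.
# The result is aligned on the original index, so it can be assigned back.
df["fill"] = df.groupby("name")["fill"].ffill()

# Optional (an addition, not from the answer): nullable integers instead of floats.
df["fill"] = df["fill"].astype("Int64")

print(df)
```

Rows that come before the first known date in each group stay missing, because a forward fill has nothing to propagate into them.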