Pandas: how to fill in missing rows
I have a dataset like this:
Dept, Date, Number
dept1, 2020-01-01, 12
dept1, 2020-01-03, 34
dept2, 2020-01-03, 56
dept3, 2020-01-03, 78
dept2, 2020-01-04, 11
dept3, 2020-01-04, 12
...
For example, I want to add rows with zeros for dept2 and dept3, which are missing for the date 2020-01-01:
Dept, Date, Number
dept1, 2020-01-01, 12
dept2, 2020-01-01, 0 <--need to be added
dept3, 2020-01-01, 0 <--need to be added
dept1, 2020-01-03, 34
dept2, 2020-01-03, 56
dept3, 2020-01-03, 78
dept1, 2020-01-04, 0 <--need to be added
dept2, 2020-01-04, 11
dept3, 2020-01-04, 12
Let's do pivot, then stack:
# pivot accepts keyword arguments only in pandas 2.0 and later
out = df.pivot(index='Dept', columns='Date', values='Number').fillna(0).stack().reset_index(name='Number')
Dept Date Number
0 dept1 2020-01-01 12.0
1 dept1 2020-01-03 34.0
2 dept1 2020-01-04 0.0
3 dept2 2020-01-01 0.0
4 dept2 2020-01-03 56.0
5 dept2 2020-01-04 11.0
6 dept3 2020-01-01 0.0
7 dept3 2020-01-03 78.0
8 dept3 2020-01-04 12.0
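To see why pivot followed by stack fills the gaps, here is a minimal sketch that rebuilds the sample data from the question and keeps the intermediate wide table in its own variable. The names wide and out are only illustrative, and the final cast back to integers is an optional extra that the answer above does not perform.

```python
import pandas as pd

# Sample data copied from the question (dates kept as plain strings).
df = pd.DataFrame({
    "Dept": ["dept1", "dept1", "dept2", "dept3", "dept2", "dept3"],
    "Date": ["2020-01-01", "2020-01-03", "2020-01-03",
             "2020-01-03", "2020-01-04", "2020-01-04"],
    "Number": [12, 34, 56, 78, 11, 12],
})

# Step 1: pivot to a wide table, one row per Dept and one column per Date.
# Combinations that are absent from the data show up as NaN.
wide = df.pivot(index="Dept", columns="Date", values="Number")

# Step 2: replace the NaN cells with 0, then stack back to long form.
out = (
    wide.fillna(0)
        .stack()
        .reset_index(name="Number")
        .astype({"Number": int})  # NaN forced floats earlier; cast back to int
)
print(out)
```

The values come back as floats without the cast because the wide table has to hold NaN before fillna runs.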
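You can use a ready-made helper function to abstract the process; you only pass the columns to be expanded.
You can also stick to pandas alone and use reindex; the helper function additionally covers cases where the index is not unique or contains nulls, since it is an abstraction/convenience wrapper: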
(df
 .set_index(['Dept', 'Date'])
 .pipe(lambda df: df.reindex(pd.MultiIndex.from_product(df.index.levels),
                             fill_value=0))
 .reset_index()
)
Dept Date Number
0 dept1 2020-01-01 12
1 dept1 2020-01-03 34
2 dept1 2020-01-04 0
3 dept2 2020-01-01 0
4 dept2 2020-01-03 56
5 dept2 2020-01-04 11
6 dept3 2020-01-01 0
7 dept3 2020-01-03 78
8 dept3 2020-01-04 12
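If this pattern is needed in several places, the reindex approach can be wrapped in a small function. The sketch below is one possible way to do that, not an existing pandas or third-party API: the name fill_missing_combinations and its parameters are hypothetical. It assumes the key columns identify each row uniquely and raises an error otherwise, because reindex needs unique labels.

```python
import pandas as pd

def fill_missing_combinations(df, keys, fill_value=0):
    """Hypothetical helper: add a row for every missing combination of `keys`."""
    if df.duplicated(subset=keys).any():
        # reindex needs unique labels, so duplicates must be aggregated first
        raise ValueError("key columns must identify each row uniquely")
    # Cartesian product of the distinct values found in each key column.
    full_index = pd.MultiIndex.from_product(
        [df[k].unique() for k in keys], names=keys
    )
    return (
        df.set_index(keys)
          .reindex(full_index, fill_value=fill_value)
          .sort_index()
          .reset_index()
    )

# Sample data copied from the question.
df = pd.DataFrame({
    "Dept": ["dept1", "dept1", "dept2", "dept3", "dept2", "dept3"],
    "Date": ["2020-01-01", "2020-01-03", "2020-01-03",
             "2020-01-03", "2020-01-04", "2020-01-04"],
    "Number": [12, 34, 56, 78, 11, 12],
})

out = fill_missing_combinations(df, ["Dept", "Date"])
print(out)
```

Because fill_value is applied during reindex, no NaN is ever created and the Number column keeps its integer dtype.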
Thanks. What if I have more columns, such as Product1_number, Product2_number, ...? It looks like pivot takes 1 to 4 positional arguments.
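For the case with several value columns, the reindex approach carries over unchanged, since it only touches the key columns. A minimal sketch, assuming two value columns named after the ones in the comment; the numbers are made up for illustration and do not come from the original data.

```python
import pandas as pd

# Made-up values; only the column names come from the follow-up comment.
df = pd.DataFrame({
    "Dept": ["dept1", "dept1", "dept2"],
    "Date": ["2020-01-01", "2020-01-03", "2020-01-03"],
    "Product1_number": [12, 34, 56],
    "Product2_number": [1, 2, 3],
})

keys = ["Dept", "Date"]
# Every (Dept, Date) pair built from the distinct values in the data.
full_index = pd.MultiIndex.from_product(
    [df[k].unique() for k in keys], names=keys
)

# reindex adds the missing (Dept, Date) rows and fills every value column with 0.
out = df.set_index(keys).reindex(full_index, fill_value=0).reset_index()
print(out)
```

Here the only missing pair is dept2 on 2020-01-01, so a single row is added with 0 in both product columns.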