Python: add rows to a df to fill in the missing values of a series
I have a df that lists the costs for each month, with months numbered 1-12. Some months have no costs, and I want to complete the series so that every month is present, with a cost of 0 for the months that are missing. What is the best way to do this?
Tags: python, pandas, dataframe
Input:
Expected output:
Section | Maintenance | Month | Group | Costs
---------|-------------|-------|-------|-------
A2 | Painting | 1 | 0 | 0
A2 | Painting | 2 | 0 | 0
A2 | Painting | 3 | 0 | 2000
A2 | Painting | 4 | 0 | 3500
A2 | Painting | 5 | 0 | 1000
A2 | Painting | 6 | 0 | 0
A2 | Painting | 7 | 0 | 2500
A2 | Painting | 8 | 0 | 1500
A2 | Painting | 9 | 0 | 3000
A2 | Painting | 10 | 0 | 2000
A2 | Painting | 11 | 0 | 2000
A2 | Painting | 12 | 0 | 1000
A2 | Painting | 1 | 1 | 0
A2 | Painting | 2 | 1 | 0
A2 | Painting | 3 | 1 | 4000
A2 | Painting | 4 | 1 | 5000
A2 | Painting | 5 | 1 | 0
A2 | Painting | 6 | 1 | 0
A2 | Painting | 7 | 1 | 0
A2 | Painting | 8 | 1 | 0
A2 | Painting | 9 | 1 | 0
A2 | Painting | 10 | 1 | 0
A2 | Painting | 11 | 1 | 0
A2 | Painting | 12 | 1 | 0
A3 | Painting | 1 | 0 | 0
A3 | Painting | 2 | 0 | 3000
A3 | Painting | 3 | 0 | 0
A3 | Painting | 4 | 0 | 0
A3 | Painting | 5 | 0 | 0
A3 | Painting | 6 | 0 | 0
A3 | Painting | 7 | 0 | 0
A3 | Painting | 8 | 0 | 0
A3 | Painting | 9 | 0 | 0
A3 | Painting | 10 | 0 | 0
A3 | Painting | 11 | 0 | 0
A3 | Painting | 12 | 0 | 0
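Edit: a wrong maintenance type had slipped in; I have extended the input/output example.
Use MultiIndex.from_product with the unique values of the columns and range(1, 13) for the months, but do it per group: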
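import pandas as pd

def f(x):
    # x holds a single (Section, Maintenance, Group) combination, so the
    # product below contains exactly the 12 months for that combination.
    mux = pd.MultiIndex.from_product([x['Section'].unique(),
                                      x['Maintenance'].unique(),
                                      range(1, 13),
                                      x['Group'].unique()],
                                     names=['Section', 'Maintenance', 'Month', 'Group'])
    # Months missing from the data are added with Costs = 0.
    return x.set_index(['Section', 'Maintenance', 'Month', 'Group']).reindex(mux, fill_value=0)

df3 = df.groupby(['Section', 'Maintenance', 'Group'], group_keys=False).apply(f).reset_index()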
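For anyone who wants to try the per-group reindex on its own, here is a minimal self-contained sketch. The sample df is hypothetical (the original input table is not shown in the question), and the explicit loop over the groups is my own variation rather than the answer's exact code: as far as I know, newer pandas releases have deprecated passing the grouping columns into groupby(...).apply, so iterating over the groups avoids depending on that behavior.

```python
import pandas as pd

# Hypothetical sample input; the original input table is not shown above.
df = pd.DataFrame({
    "Section":     ["A2", "A2", "A2", "A3"],
    "Maintenance": ["Painting"] * 4,
    "Month":       [3, 4, 3, 2],
    "Group":       [0, 0, 1, 0],
    "Costs":       [2000, 3500, 4000, 3000],
})

keys = ["Section", "Maintenance", "Group"]
index_cols = ["Section", "Maintenance", "Month", "Group"]

parts = []
# Iterating over the groupby keeps the key columns in each sub-frame.
for (section, maintenance, group), part in df.groupby(keys):
    # 12 index entries (months 1-12) for this one combination only.
    mux = pd.MultiIndex.from_product(
        [[section], [maintenance], range(1, 13), [group]],
        names=index_cols,
    )
    # Months that are missing in the data are added with Costs = 0.
    parts.append(part.set_index(index_cols).reindex(mux, fill_value=0))

df3 = pd.concat(parts).reset_index()
print(df3)
```

Note that reindex raises an error if a combination contains the same month twice, so this sketch assumes at most one row per Section/Maintenance/Group/Month.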
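Comments:
Your result contains a lot of duplicate rows. Doesn't this approach combine all possible values of all 4 columns with each other, rather than only the combinations that actually appear in df? My 20,000-row df produced a 1.5-million-row df. Is there a way to fill in the months only for the combinations that exist in the original df? Group numbers can repeat across different sections or maintenance types. For every Section/Maintenance/Group combination that occurs, I need 12 rows, one per month.
I edited my question, maybe it is clearer this way :)
I think your solution is almost right. I just need it without the redundant last rows 36-47, because the input contains no combination {Section: A3, Maintenance: Painting, Group: 1}; A3 only occurs with Group 0. I suppose deleting them afterwards would be simple, but probably inefficient.
Yes! You did it! Thank you very much. I only took a quick look, but it looks like exactly what I need!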
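The row blow-up described in the comments can be illustrated with a small sketch. It compares the size of the full product over all four columns with a grid built only from the combinations that occur, using a cross merge instead of the answer's reindex. The sample df and the names combos, grid, and out are hypothetical, chosen for illustration only.

```python
import pandas as pd

# Hypothetical sample input; the original input table is not shown.
df = pd.DataFrame({
    "Section":     ["A2", "A2", "A2", "A3"],
    "Maintenance": ["Painting"] * 4,
    "Month":       [3, 4, 3, 2],
    "Group":       [0, 0, 1, 0],
    "Costs":       [2000, 3500, 4000, 3000],
})

keys = ["Section", "Maintenance", "Group"]

# Size of the full Cartesian product over all four columns.
full_size = (df["Section"].nunique() * df["Maintenance"].nunique()
             * 12 * df["Group"].nunique())

# Only the combinations that actually occur, each paired with months 1-12.
combos = df[keys].drop_duplicates()
months = pd.DataFrame({"Month": range(1, 13)})
grid = combos.merge(months, how="cross")  # how="cross" needs pandas >= 1.2

# Bring in the known costs; months without a row get 0.
out = grid.merge(df, on=keys + ["Month"], how="left")
out["Costs"] = out["Costs"].fillna(0).astype(int)
out = out[["Section", "Maintenance", "Month", "Group", "Costs"]]

print(full_size, len(out))  # 48 36
```

With this sample, the full product has 48 rows, including 12 rows for the nonexistent pair A3/Group 1, while the grid restricted to existing combinations has 36.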
Printing the result of the per-group function f, print(df3), gives:
Section Maintenance Month Group Costs
0 A2 Painting 1 0 0
1 A2 Painting 2 0 0
2 A2 Painting 3 0 2000
3 A2 Painting 4 0 3500
4 A2 Painting 5 0 1000
5 A2 Painting 6 0 0
6 A2 Painting 7 0 2500
7 A2 Painting 8 0 1500
8 A2 Painting 9 0 3000
9 A2 Painting 10 0 2000
10 A2 Painting 11 0 2000
11 A2 Painting 12 0 1000
12 A2 Painting 1 1 0
13 A2 Painting 2 1 0
14 A2 Painting 3 1 4000
15 A2 Painting 4 1 5000
16 A2 Painting 5 1 0
17 A2 Painting 6 1 2000
18 A2 Painting 7 1 1500
19 A2 Painting 8 1 4000
20 A2 Painting 9 1 0
21 A2 Painting 10 1 3500
22 A2 Painting 11 1 0
23 A2 Painting 12 1 6000
24 A3 Painting 1 0 0
25 A3 Painting 2 0 3000
26 A3 Painting 3 0 0
27 A3 Painting 4 0 0
28 A3 Painting 5 0 0
29 A3 Painting 6 0 0
30 A3 Painting 7 0 0
31 A3 Painting 8 0 0
32 A3 Painting 9 0 0
33 A3 Painting 10 0 0
34 A3 Painting 11 0 0
35 A3 Painting 12 0 0