Python: summing two variables over two specific columns and computing their quotient
I have a DataFrame df1:
Plant name      Brand       Region  Units produced  capacity  Cost incurred
Gujarat Plant   Hyundai     Asia    8500            9250      18500000
Haryana Plant   Honda       Asia    10000           10750     21500000
Chennai Plant   Hyundai     Asia    12000           12750     25500000
Zurich Plant    Volkswagen  Europe  25000           25750     77250000
Chennai Plant   Suzuki      Asia    6000            6750      13500000
Rengensburg     BMW         Europe  12500           13250     92750000
Dingolfing      Mercedes    Europe  14000           14750     103250000
I need an output DataFrame in the following format:
df2= Region BMW Mercedes Volkswagen Toyota Suzuki Honda Hyundai
Europe
North America
Asia
Oceania
where, for a given Region and Brand, the content of each cell equals sum(Cost incurred) / sum(Units produced).
The code I tried, which results in a ValueError:
for i,j in itertools.zip_longest(range(len(df2),range(len(df2.columns)):
    if (df2.index[i] in list(df1["Region"]) & df2.columns[j] in list(df1["Brand"])==True:
        temp1 = df1["Region"]==df2.index[i]
        temp2 = df1["Brand"]==df2.columns[j]]
        df2.loc[df2.index[i],df2.columns[j]] = df1(temp1&temp2)["Cost incurred"].sum()/
        df1(temp1&temp2)["Units Produced"].sum()
    elif (df2.index[i] in list(df1["Region"]) & df2.columns[j] in list(df1["Brand"])==False:
        df2.loc[df2.index[i],df2.columns[j]] = 0
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
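This error is raised whenever an array or Series with more than one element ends up in a boolean context such as `if`. In the attempt above, `&` also binds more tightly than `in` and `==`, so the conditions are not grouped the way they read, and `zip_longest` pairs the i-th row with the i-th column instead of visiting every (Region, Brand) combination. A minimal sketch of a loop that avoids these problems is shown below. It assumes the column is spelled "Units produced" as in the table, rebuilds df1 from the sample rows, and takes the row and column labels of df2 from the desired layout in the question; the variable names `regions`, `brands`, and `mask` are only illustrative.

```python
import itertools

import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Plant name": ["Gujarat Plant", "Haryana Plant", "Chennai Plant", "Zurich Plant",
                   "Chennai Plant", "Rengensburg", "Dingolfing"],
    "Brand": ["Hyundai", "Honda", "Hyundai", "Volkswagen", "Suzuki", "BMW", "Mercedes"],
    "Region": ["Asia", "Asia", "Asia", "Europe", "Asia", "Europe", "Europe"],
    "Units produced": [8500, 10000, 12000, 25000, 6000, 12500, 14000],
    "capacity": [9250, 10750, 12750, 25750, 6750, 13250, 14750],
    "Cost incurred": [18500000, 21500000, 25500000, 77250000,
                      13500000, 92750000, 103250000],
})

# Row and column labels taken from the desired df2 layout (assumed order).
regions = ["Europe", "North America", "Asia", "Oceania"]
brands = ["BMW", "Mercedes", "Volkswagen", "Toyota", "Suzuki", "Honda", "Hyundai"]
df2 = pd.DataFrame(0.0, index=regions, columns=brands)

# product() visits every (region, brand) pair; zip_longest would only pair them up.
for region, brand in itertools.product(regions, brands):
    # Parenthesise each comparison: & binds more tightly than ==.
    mask = (df1["Region"] == region) & (df1["Brand"] == brand)
    units = df1.loc[mask, "Units produced"].sum()
    if units > 0:  # a plain scalar, so the truth value is unambiguous
        df2.loc[region, brand] = df1.loc[mask, "Cost incurred"].sum() / units
```

Cells with no matching rows keep their initial value of 0, which mirrors the `elif` branch of the original attempt.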
This is what the pivot and aggregation functions are designed for. A quick (?) and dirty solution:
# aggfunc="sum" avoids the numpy import that aggfunc=np.sum would need
df1.pivot_table(index="Region", columns="Brand", values="Cost incurred", aggfunc="sum") \
    / df1.pivot_table(index="Region", columns="Brand", values="Units produced", aggfunc="sum")
Output:
Brand      BMW   Honda      Hyundai  Mercedes  Suzuki  Volkswagen
Region
Asia       NaN  2150.0  2146.341463       NaN  2250.0         NaN
Europe  7420.0     NaN          NaN    7375.0     NaN      3090.0
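The result above only contains the regions and brands that occur in df1, with NaN where a combination is missing. One possible way to get closer to the layout requested in the question is to aggregate once with `groupby`, divide, and then `reindex` to the full list of labels. This is a sketch rather than the only way to do it: the `regions` and `brands` lists are taken from the desired df2 in the question, and filling missing cells with 0 is an assumption based on the `elif` branch of the original attempt.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Plant name": ["Gujarat Plant", "Haryana Plant", "Chennai Plant", "Zurich Plant",
                   "Chennai Plant", "Rengensburg", "Dingolfing"],
    "Brand": ["Hyundai", "Honda", "Hyundai", "Volkswagen", "Suzuki", "BMW", "Mercedes"],
    "Region": ["Asia", "Asia", "Asia", "Europe", "Asia", "Europe", "Europe"],
    "Units produced": [8500, 10000, 12000, 25000, 6000, 12500, 14000],
    "capacity": [9250, 10750, 12750, 25750, 6750, 13250, 14750],
    "Cost incurred": [18500000, 21500000, 25500000, 77250000,
                      13500000, 92750000, 103250000],
})

# Labels from the desired df2 layout (assumed order).
regions = ["Europe", "North America", "Asia", "Oceania"]
brands = ["BMW", "Mercedes", "Volkswagen", "Toyota", "Suzuki", "Honda", "Hyundai"]

# Sum both columns per (Region, Brand) in a single pass.
sums = df1.groupby(["Region", "Brand"])[["Cost incurred", "Units produced"]].sum()

# Cost per unit, reshaped so that brands become columns.
ratio = (sums["Cost incurred"] / sums["Units produced"]).unstack("Brand")

# Add the regions and brands that have no data, then fill the gaps with 0.
df2 = ratio.reindex(index=regions, columns=brands).fillna(0)
print(df2)
```

Dropping the `fillna(0)` call keeps NaN for combinations without data, which may be preferable if a cost of 0 per unit could be mistaken for a real value.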