Python: reorder a DataFrame based on multiple columns and the sum of one column
I have the following DataFrame:
Country_FAO type mean_area
0 Afghanistan car 2029000.0
1 Afghanistan car 112000.0
2 Algeria bus 827000.0
3 Algeria bus 2351.0
4 Australia car 6475695.0
5 Australia car 12141000.0
6 Australia bus 293806.0
I want to reorder this DataFrame based on the sum of mean_area for each value in the Country_FAO column. The final result should look like this:
Country_FAO type mean_area
0 Australia car 12141000.0
1 Australia car 6475695.0
2 Australia bus 293806.0
3 Afghanistan car 2029000.0
4 Afghanistan car 112000.0
5 Algeria bus 827000.0
6 Algeria bus 2351.0
Australia comes first because the sum of its three mean_area values is the highest.
I tried this:
df_stacked.sort(['Country_FAO', 'mean_area'], ascending=[False, False])
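But this does not work: it does not sum the mean_area values for each country before sorting.
I think you need to create a new column sort using groupby with transform and sum, and then sort with sort_values. Finally, you can remove the sort column with drop.
Thanks @jezrael, is there a way to make the "sort" column part of a pipeline?
Very hard question, I have never seen one like it. Thanks!
Please see a related query: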
df['sort'] = df.groupby('Country_FAO')['mean_area'].transform('sum')
df1 = df.sort_values(['sort', 'Country_FAO', 'mean_area'], ascending=False)
print(df1)
Country_FAO type mean_area sort
5 Australia car 12141000.0 18910501.0
4 Australia car 6475695.0 18910501.0
6 Australia bus 293806.0 18910501.0
0 Afghanistan car 2029000.0 2141000.0
1 Afghanistan car 112000.0 2141000.0
2 Algeria bus 827000.0 829351.0
3 Algeria bus 2351.0 829351.0
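The sort values in the output above are simply the per-country totals of mean_area. A minimal sketch for checking them, assuming the frame is rebuilt by hand from the rows shown in the question (the name totals is only illustrative and does not appear in the original answer):

```python
import pandas as pd

# Rebuild the example frame from the rows listed in the question
df = pd.DataFrame({
    'Country_FAO': ['Afghanistan', 'Afghanistan', 'Algeria', 'Algeria',
                    'Australia', 'Australia', 'Australia'],
    'type': ['car', 'car', 'bus', 'bus', 'car', 'car', 'bus'],
    'mean_area': [2029000.0, 112000.0, 827000.0, 2351.0,
                  6475695.0, 12141000.0, 293806.0],
})

# One total per country, largest first; this is the order the final frame should follow
totals = df.groupby('Country_FAO')['mean_area'].sum().sort_values(ascending=False)
print(totals)
```

Unlike sum(), transform('sum') returns one value per original row rather than one per group, which is why it can be assigned directly as a column.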
df1 = df1.drop('sort', axis=1).reset_index(drop=True)
print(df1)
Country_FAO type mean_area
0 Australia car 12141000.0
1 Australia car 6475695.0
2 Australia bus 293806.0
3 Afghanistan car 2029000.0
4 Afghanistan car 112000.0
5 Algeria bus 827000.0
6 Algeria bus 2351.0
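On the comment asking whether the sort column can be part of a pipeline: one possible approach, not taken from the original answer, is to chain the steps with DataFrame.assign so the helper column only exists inside the chain. The sketch below assumes a reasonably recent pandas (drop(columns=...) needs 0.21 or later) and rebuilds the example frame by hand.

```python
import pandas as pd

# Rebuild the example frame from the rows listed in the question
df = pd.DataFrame({
    'Country_FAO': ['Afghanistan', 'Afghanistan', 'Algeria', 'Algeria',
                    'Australia', 'Australia', 'Australia'],
    'type': ['car', 'car', 'bus', 'bus', 'car', 'car', 'bus'],
    'mean_area': [2029000.0, 112000.0, 827000.0, 2351.0,
                  6475695.0, 12141000.0, 293806.0],
})

df1 = (
    # Temporary helper column holding each country's total mean_area
    df.assign(sort=lambda d: d.groupby('Country_FAO')['mean_area'].transform('sum'))
      # Largest country total first, then largest mean_area within each country
      .sort_values(['sort', 'Country_FAO', 'mean_area'], ascending=False)
      # Remove the helper column and renumber the rows
      .drop(columns='sort')
      .reset_index(drop=True)
)
print(df1)
```

Because assign returns a new frame, the original df is left without a sort column.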