Python Pandas aggregation: putting the sums for unique objects into a new DataFrame
I would like to extract the unique non-numeric elements of a column (e.g. Position) and combine them in a DataFrame with the sum of their respective numeric elements (e.g. Salary). Given some file, for example this one:
Position,Experience in Years,Salary
Middle Management,5,5000
Lower Management,2,3000
Upper Management,1,7000
Middle Management,5,5000
Middle Management,7,7000
Upper Management,10,12000
Lower Management,2,2000
Middle Management,5,5001
Upper Management,7,10000
I defined the following function, which takes a file path as input. It runs, but it pairs the sums with the wrong positions. Does anyone know how to fix this?
def Aggregation(filePath):
    dataframe = pd.read_csv(filePath)
    uniques = dataframe['Position'].unique() #returns
    sum_list = []
    for i in range(0,len(uniques)):
        iteration_sum = dataframe.groupby('Position')['Salary'].sum()[i]
        sum_list.append(iteration_sum)
    new_df = pd.DataFrame(uniques)
    column_values = pd.Series(sum_list)
    new_df.insert(loc=1, column='Sum', value=column_values)
    return(sum_list)
Aggregation(filePath)
The output should basically be a DataFrame like this:
Position, Sum of Salary
Lower Management, 5000
Middle Management, 22001
Upper Management, 29000
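I think your function can be simplified a lot. The only addition is reset_index, which converts the Series into a new DataFrame with a column named Sum of Salary: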
def Aggregation(filePath):
    dataframe = pd.read_csv(filePath)
    return dataframe.groupby('Position')['Salary'].sum().reset_index(name='Sum of Salary')
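As a rough illustration of why the original loop mispairs the values, the sketch below compares the two orderings involved: unique() returns labels in order of first appearance, while groupby() sorts the group keys by default, so the i-th unique label and the i-th grouped sum need not belong together. The sample rows are copied from the question and inlined through io.StringIO so the example runs without a file; the helper name aggregate_salaries and the constant CSV_TEXT are made up for this sketch and are not part of the original post.

```python
import io

import pandas as pd

# Sample data copied from the question, inlined so the sketch runs without a file.
CSV_TEXT = """Position,Experience in Years,Salary
Middle Management,5,5000
Lower Management,2,3000
Upper Management,1,7000
Middle Management,5,5000
Middle Management,7,7000
Upper Management,10,12000
Lower Management,2,2000
Middle Management,5,5001
Upper Management,7,10000
"""


def aggregate_salaries(source):
    """Return one row per Position with the summed Salary.

    `source` is anything pd.read_csv accepts (a path or a file-like object).
    The function name is hypothetical, chosen for this sketch.
    """
    dataframe = pd.read_csv(source)
    return (
        dataframe.groupby("Position")["Salary"]
        .sum()
        .reset_index(name="Sum of Salary")
    )


df = pd.read_csv(io.StringIO(CSV_TEXT))

# unique() keeps the order of first appearance in the column ...
appearance_order = df["Position"].unique().tolist()
# ... while groupby() sorts the group keys by default.
sorted_order = df.groupby("Position")["Salary"].sum().index.tolist()

result = aggregate_salaries(io.StringIO(CSV_TEXT))

print(appearance_order)
print(sorted_order)
print(result)
```

Because the grouped result already carries its own labels in the index, reset_index keeps each sum attached to the right Position, and no separate list of unique values is needed.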