Python 如何在Pandas中分组并保留所有列
我有这样一个数据框:Python 如何在Pandas中分组并保留所有列,python,pandas,pandas-groupby,Python,Pandas,Pandas Groupby,我有这样一个数据框: year drug_name avg_number_of_ingredients 0 2019 NEXIUM I.V. 8 1 2016 ZOLADEX 10 2 2017 PRILOSEC 59 3 2017 BYDUREON BCise
year drug_name avg_number_of_ingredients
0 2019 NEXIUM I.V. 8
1 2016 ZOLADEX 10
2 2017 PRILOSEC 59
3 2017 BYDUREON BCise 24
4 2019 Lynparza 28
year drug_name avg_number_of_ingredients
0 2019 drug a,b,c.. mean value for column
1 2018 drug a,b,c.. mean value for column
2 2017 drug a,b,c.. mean value for column
我需要按年份对药物名称和成分的平均数量进行分组,如下所示:
year drug_name avg_number_of_ingredients
0 2019 NEXIUM I.V. 8
1 2016 ZOLADEX 10
2 2017 PRILOSEC 59
3 2017 BYDUREON BCise 24
4 2019 Lynparza 28
year drug_name avg_number_of_ingredients
0 2019 drug a,b,c.. mean value for column
1 2018 drug a,b,c.. mean value for column
2 2017 drug a,b,c.. mean value for column
如果我做了
df.groupby('year')
,我会丢失药物名称。我该怎么做呢?让我向您展示这个简单示例的解决方案。首先,我制作了与您相同的数据帧:
>>> df = pd.DataFrame(
[
{'year': 2019, 'drug_name': 'NEXIUM I.V.', 'avg_number_of_ingredients': 8},
{'year': 2016, 'drug_name': 'ZOLADEX', 'avg_number_of_ingredients': 10},
{'year': 2017, 'drug_name': 'PRILOSEC', 'avg_number_of_ingredients': 59},
{'year': 2017, 'drug_name': 'BYDUREON BCise', 'avg_number_of_ingredients': 24},
{'year': 2019, 'drug_name': 'Lynparza', 'avg_number_of_ingredients': 28},
]
)
>>> print(df)
year drug_name avg_number_of_ingredients
0 2019 NEXIUM I.V. 8
1 2016 ZOLADEX 10
2 2017 PRILOSEC 59
3 2017 BYDUREON BCise 24
4 2019 Lynparza 28
现在,我制作了一个df_grouped
,它仍然包含关于药物名称的信息
>>> df_grouped = df.groupby('year', as_index=False).agg({'drug_name': ', '.join, 'avg_number_of_ingredients': 'mean'})
>>> print(df_grouped)
year drug_name avg_number_of_ingredients
0 2016 ZOLADEX 10.0
1 2017 PRILOSEC, BYDUREON BCise 41.5
2 2019 NEXIUM I.V., Lynparza 18.0
df.groupby('year',as_index=False).agg({'drugh_name':','join,'avg_number_of_components':'mean'})
?