Python: How do I set aggregations after groupby?
Given that I have the following dataset:
import pandas as pd

# Sample visit records; each row is one visit by one patient.
dt = {
    "facility": ["Ann Arbor", "Ann Arbor", "Detriot", "Detriot", "Detriot"],
    "patient_ID": [4388, 4388, 9086, 9086, 9086],
    "year": [2004, 2007, 2007, 2008, 2011],
    "month": [8, 9, 9, 6, 2],
    "Nr_Small": [0, 0, 5, 12, 10],
    "Nr_Medium": [3, 1, 1, 4, 3],
    "Nr_Large": [2, 0, 0, 0, 0],
    "PeriodBetween2Visits": [10, 0, 12, 3, 1],
    "NumberOfVisits": [2, 2, 3, 3, 3],
}
dt = pd.DataFrame(dt)
I need to group by patient_ID and keep facility, patient_ID, and NumberOfVisits, but with the minimum and maximum of PeriodBetween2Visits.
Here is what I tried:
dt = dt.groupby(['patient_ID'],as_index=False)["facility","patient_ID","PeriodBetween2Visits","NumberOfVisits"].agg({'PeriodBetween2Visits': ['min', 'max']})
dt.head()
However, this is not what I need.
My expected output has one row per patient_ID, keeping facility and NumberOfVisits together with the minimum and maximum of PeriodBetween2Visits.
Here I am using named aggregation, which is built into groupby and agg.
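A minimal sketch of what such a named aggregation could look like. It assumes that the first facility and the first NumberOfVisits value per patient are the ones to keep (as discussed in the comments), and it needs pandas 0.25 or later. The output column names Min_PeriodBetween2Visits and Max_PeriodBetween2Visits are taken from the result table in this answer; the variable name result is illustrative only.

```python
import pandas as pd

# Only the columns needed for the aggregation are reproduced here.
dt = pd.DataFrame({
    "facility": ["Ann Arbor", "Ann Arbor", "Detriot", "Detriot", "Detriot"],
    "patient_ID": [4388, 4388, 9086, 9086, 9086],
    "PeriodBetween2Visits": [10, 0, 12, 3, 1],
    "NumberOfVisits": [2, 2, 3, 3, 3],
})

# Named aggregation: new_column=(source_column, aggregation_function).
# Assumption: "first" is the rule for the columns that are not min/max.
result = dt.groupby("patient_ID", as_index=False).agg(
    facility=("facility", "first"),
    Min_PeriodBetween2Visits=("PeriodBetween2Visits", "min"),
    Max_PeriodBetween2Visits=("PeriodBetween2Visits", "max"),
    NumberOfVisits=("NumberOfVisits", "first"),
)

# Reorder the columns so that facility comes before patient_ID.
result = result[[
    "facility",
    "patient_ID",
    "Min_PeriodBetween2Visits",
    "Max_PeriodBetween2Visits",
    "NumberOfVisits",
]]
print(result)
```

If a different rule is wanted for the non-aggregated columns (for example "last" or "max"), only the second element of the corresponding tuple needs to change.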
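Comments:
Can you also post your expected dataframe? Thanks
@anky_91 Please find it in my question.
@Jeff It may be a version issue. Can you update your pandas version? Which one are you currently using? Check pd.__version__
@Jeff I am using version 0.25+; if possible, please provide your version and try.
@Jeff What exactly do you want to aggregate for that column? How do you get 2 and 3 in the example?
@Jeff I understand, but since we are reducing the number of rows, we must have some logic for the remaining columns, e.g. first row, max, average, etc. You cannot keep all the rows, correct?
Yes, we can consider the first row value for NumberOfVisits.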
Output:
    facility  patient_ID  Min_PeriodBetween2Visits  Max_PeriodBetween2Visits  NumberOfVisits
0  Ann Arbor        4388                         0                        10               2
1    Detriot        9086                         1                        12               3