Pandas: collapsing duplicate rows of a DataFrame


I have a DataFrame df that looks like this:

Col1    Col2    Col3    StartDate   EndDate     Qty
24HR    A1      B1      1/1/2020    1/31/2020   4.2
24HR    A1      B1      2/1/2020    2/29/2020   11
asd     A2      B2      2/1/2020    2/29/2020   35
asd     A2      B2      3/1/2020    3/31/2020   23
asd     A2      B2      4/1/2020    4/30/2020   35
asd     A2      B2      5/1/2020    5/31/2020   46
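
To make the example reproducible, the frame can be rebuilt roughly as follows. This is a minimal sketch: the values are copied from the table above, and parsing the two date columns with pd.to_datetime is an assumption (the original post does not show how df was created, although the later use of .dt implies real datetimes).

```python
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame({
    "Col1": ["24HR", "24HR", "asd", "asd", "asd", "asd"],
    "Col2": ["A1", "A1", "A2", "A2", "A2", "A2"],
    "Col3": ["B1", "B1", "B2", "B2", "B2", "B2"],
    "StartDate": ["1/1/2020", "2/1/2020", "2/1/2020", "3/1/2020", "4/1/2020", "5/1/2020"],
    "EndDate": ["1/31/2020", "2/29/2020", "2/29/2020", "3/31/2020", "4/30/2020", "5/31/2020"],
    "Qty": [4.2, 11, 35, 23, 35, 46],
})

# Assumption: dates are month/day/year; parse them so .dt and min/max act on real dates.
df["StartDate"] = pd.to_datetime(df["StartDate"], format="%m/%d/%Y")
df["EndDate"] = pd.to_datetime(df["EndDate"], format="%m/%d/%Y")
```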
I need to collapse the rows that share the same values in Col1, Col2 and Col3, to get the following:

Col1    Col2    Col3    StartDate   EndDate     Jan  Feb    Mar  Apr    May
24HR    A1      B1      1/1/2020    2/29/2020   4.2  11         
asd     A2      B2      2/1/2020    5/31/2020        35     23    35    46
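In the output above, StartDate and EndDate are the minimum and maximum over all rows in each group, i.e. for the rows with values 24HR, A1, B1, the minimum StartDate is 1/1/2020 and the maximum EndDate is 2/29/2020.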

I tried the following:

df['MnthName'] = df['StartDate'].dt.strftime('%b')
df = df.pivot_table(index=['Col1', 'Col2', 'Col3'], values='Qty', columns='MnthName')
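
For context, here is a small sketch of what this attempt yields on the sample data, assuming StartDate holds real datetimes. The name wide is only illustrative. The pivot keeps the three key columns and the monthly quantities, but StartDate and EndDate are gone, and the month columns are sorted as strings rather than in calendar order.

```python
import pandas as pd

# Sample data from the question, with StartDate already parsed (assumption).
df = pd.DataFrame({
    "Col1": ["24HR", "24HR", "asd", "asd", "asd", "asd"],
    "Col2": ["A1", "A1", "A2", "A2", "A2", "A2"],
    "Col3": ["B1", "B1", "B2", "B2", "B2", "B2"],
    "StartDate": pd.to_datetime(
        ["1/1/2020", "2/1/2020", "2/1/2020", "3/1/2020", "4/1/2020", "5/1/2020"],
        format="%m/%d/%Y",
    ),
    "Qty": [4.2, 11, 35, 23, 35, 46],
})

# The attempt from the question: month abbreviation, then pivot on it.
df["MnthName"] = df["StartDate"].dt.strftime("%b")
wide = df.pivot_table(index=["Col1", "Col2", "Col3"], values="Qty", columns="MnthName")

# Month columns come out in alphabetical order, and no date columns survive.
print(list(wide.columns))
print(wide.reset_index().columns.tolist())
```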

But I don't know how to group it so that the minimum StartDate and the maximum EndDate are selected for each unique combination of Col1, Col2, Col3.

We can pivot and agg separately, then concat to combine the two results:

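import pandas as pd

# One Qty column per StartDate (pivot_table averages duplicate entries by default)
s1 = df.pivot_table(index=['Col1', 'Col2', 'Col3'], columns='StartDate', values='Qty')
# First StartDate and last EndDate per group; equals min/max only if rows are sorted by date
s2 = df.groupby(['Col1', 'Col2', 'Col3']).agg({'StartDate': 'first', 'EndDate': 'last'})
# Rename the date columns to month abbreviations, then join both pieces side by side
s1.columns = pd.to_datetime(s1.columns, dayfirst=False).strftime('%b')
s = pd.concat([s2, s1], axis=1).reset_index()
s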
   Col1 Col2 Col3 StartDate    EndDate  Jan   Feb   Mar   Apr   May
0  24HR   A1   B1  1/1/2020  2/29/2020  4.2  11.0   NaN   NaN   NaN
1   asd   A2   B2  2/1/2020  5/31/2020  NaN  35.0  23.0  35.0  46.0
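
The answer above takes the first StartDate and the last EndDate, which matches the requested minimum and maximum only when the rows are already in date order within each group. As a hedged variant, the sketch below parses the dates and aggregates with min and max directly, so row order no longer matters. The names keys, span, qty and result are illustrative, and the month/day/year format is an assumption based on the sample values.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["24HR", "24HR", "asd", "asd", "asd", "asd"],
    "Col2": ["A1", "A1", "A2", "A2", "A2", "A2"],
    "Col3": ["B1", "B1", "B2", "B2", "B2", "B2"],
    "StartDate": ["1/1/2020", "2/1/2020", "2/1/2020", "3/1/2020", "4/1/2020", "5/1/2020"],
    "EndDate": ["1/31/2020", "2/29/2020", "2/29/2020", "3/31/2020", "4/30/2020", "5/31/2020"],
    "Qty": [4.2, 11, 35, 23, 35, 46],
})

# Assumption: dates are month/day/year strings; parse them into real datetimes.
for col in ("StartDate", "EndDate"):
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y")

keys = ["Col1", "Col2", "Col3"]

# Earliest start and latest end per key combination, independent of row order.
span = df.groupby(keys).agg(StartDate=("StartDate", "min"), EndDate=("EndDate", "max"))

# One column per start date; datetime columns sort chronologically.
qty = df.pivot_table(index=keys, columns="StartDate", values="Qty")
qty.columns = qty.columns.strftime("%b")  # rename to month abbreviations

# Join the date span and the monthly quantities, then restore the keys as columns.
result = pd.concat([span, qty], axis=1).reset_index()
print(result)
```

Note that month abbreviations alone are ambiguous once the data covers more than one year; in that case a format such as '%b %Y' would keep the columns distinct.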