Python dataframe: grouping columns that share the same prefix
I have a dataframe that looks like this:
                           LIT__0001  LIT__002  AAA__0001  AAA__0002  XYZ
2019-10-31 13:40:00-04:00        NaN  0.014786         10         55    1
2019-10-31 13:45:00-04:00        NaN  0.012143         33         11    2
2019-10-31 13:50:00-04:00        NaN       NaN        NaN        NaN    3
2019-10-31 13:55:00-04:00        NaN  0.020000         14         13    4
2019-10-31 14:00:00-04:00   0.010000       NaN         14        NaN    5
I need to convert it into a dataframe that looks like this:
                                LIT  AAA  XYZ
2019-10-31 13:40:00-04:00  0.014786   10    1
2019-10-31 13:45:00-04:00  0.012143   11    2
2019-10-31 13:50:00-04:00       NaN  NaN    3
2019-10-31 13:55:00-04:00  0.020000   13    4
2019-10-31 14:00:00-04:00  0.010000   14    5
That is, for every set of columns that share the same prefix before "__", take the minimum value in each row.
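My dataframe is very large, so I would appreciate a fast solution.
Use groupby with axis=1 to group the columns, and a lambda function that splits each column name on '__':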
df = df.groupby(lambda x: x.split('__')[0], axis=1, sort=False).min()
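Note that recent pandas releases deprecate the axis=1 argument of groupby, so the line above may emit a warning or fail depending on the installed version. The following is a minimal sketch of the same idea that avoids axis=1 by transposing, grouping the rows, and transposing back. The helper name group_min_by_prefix and its sep parameter are illustrative assumptions, not part of the original answer, and the sample frame is rebuilt from the question's data.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question.
index = pd.to_datetime(
    [
        "2019-10-31 13:40:00-04:00",
        "2019-10-31 13:45:00-04:00",
        "2019-10-31 13:50:00-04:00",
        "2019-10-31 13:55:00-04:00",
        "2019-10-31 14:00:00-04:00",
    ]
)
df = pd.DataFrame(
    {
        "LIT__0001": [np.nan, np.nan, np.nan, np.nan, 0.01],
        "LIT__002": [0.014786, 0.012143, np.nan, 0.02, np.nan],
        "AAA__0001": [10, 33, np.nan, 14, 14],
        "AAA__0002": [55, 11, np.nan, 13, np.nan],
        "XYZ": [1, 2, 3, 4, 5],
    },
    index=index,
)


def group_min_by_prefix(frame, sep="__"):
    """Row-wise minimum over columns sharing the text before `sep`.

    Hypothetical helper: transposes so the column labels become the row
    index, groups those labels by prefix, then transposes back.
    """
    grouped = frame.T.groupby(lambda name: name.split(sep)[0], sort=False).min()
    return grouped.T


result = group_min_by_prefix(df)
print(result)
```

Because the transpose copies the data and upcasts mixed columns to a common dtype, this variant is not necessarily as fast as the axis=1 call on a very large frame; it is worth timing both on real data where the installed pandas version allows it.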
Or use: