How to get the count of unique values in each column of a groupby result in Python

python pandas


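I am working with a Pandas DataFrame and want to group it by two columns and, for each group, get the count of unique values in each of the remaining columns.

My input DataFrame is: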

id  number  name    time    method  level
121 567     XYZ     24      run     150
234 679     ABC     56      floor   120
121 567     XYZ     26      walk    150
578 865     EFG     89      fly     430
965 685     MNO     40      cry     278
578 865     MNO     67      fly     430

Desired output:

id  number  name    time    method  level
121 567     1       2       2       1
234 679     1       1       1       1
578 865     2       2       1       1
965 685     1       1       1       1

So for each group produced by groupby(['id', 'number']), I want the number of unique elements in every other column.

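You can use groupby with apply, and then apply again over each Series within the group to count only its unique values:

# Select the columns with a list (double brackets); recent pandas versions
# reject tuple-style selection such as ['name', 'time', 'method', 'level'].
df.groupby(['id', 'number'])[['name', 'time', 'method', 'level']]\
    .apply(lambda x: x.apply(lambda y: y.drop_duplicates().count()))\
    .reset_index()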

# Output:

    id  number  name  time  method  level
0  121     567     1     2       2      1
1  234     679     1     1       1      1
2  578     865     2     2       1      1
3  965     685     1     1       1      1

I hope this helps.

You can use groupby.agg with nunique:
df.groupby(['id', 'number']).agg(pd.Series.nunique)
Out: 
            name  time  method  level
id  number                           
121 567        1     2       2      1
234 679        1     1       1      1
578 865        2     2       1      1
965 685        1     1       1      1
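
For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the DataFrame from the question and uses the groupby object's own nunique method, which should give the same counts as agg(pd.Series.nunique). The variable names counts and result are only illustrative, and the sketch assumes a reasonably recent pandas version.

```python
import pandas as pd

# Input data copied from the question.
df = pd.DataFrame({
    "id":     [121, 234, 121, 578, 965, 578],
    "number": [567, 679, 567, 865, 685, 865],
    "name":   ["XYZ", "ABC", "XYZ", "EFG", "MNO", "MNO"],
    "time":   [24, 56, 26, 89, 40, 67],
    "method": ["run", "floor", "walk", "fly", "cry", "fly"],
    "level":  [150, 120, 150, 430, 278, 430],
})

# Count distinct values per column within each (id, number) group.
value_columns = ["name", "time", "method", "level"]
counts = df.groupby(["id", "number"])[value_columns].nunique()

# Move the group keys back into ordinary columns to match the desired layout.
result = counts.reset_index()
print(result)
```

Like the drop_duplicates().count() approach, nunique ignores NaN by default; pass dropna=False if missing values should count as a distinct value.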
