Python pandas: getting the frequency of a column in a grouped df


I have a dataframe that looks like this:

import pandas as pd


df = pd.DataFrame({'Name' : ['A', 'A', 'B','C','C','C','D','D'],
                   'ID' : ['1', '1', '2','3','3','4','4','4'],
                   'duration' : ['600', '3000', '3000', '600', '3000', '3000', '600','3000']})

I would like to get something like this:

Name ID 600 3000
 A    1  1   1
 B    2  0   1
 C    3  1   1
 C    4  0   1
 D    4  1   1

I tried using groupby, but I seem to be missing a step.

You can use pd.crosstab to do this:

counts = pd.crosstab(index=[df["Name"], df["ID"]], columns=df["duration"])

# Remove the name of the column axis. It throws some people off to look at.
# Chain .reset_index() as well if you want Name and ID as regular columns.
counts = counts.rename_axis(columns=None)

print(counts)
         3000  600
Name ID           
A    1      1    1
B    2      1    0
C    3      1    1
     4      1    0
D    4      1    1
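
The output above lists 3000 before 600 because the duration values are strings and sort lexicographically. A minimal sketch of one way to get the layout asked for in the question (Name and ID as ordinary columns, 600 before 3000), assuming every duration label can be parsed as an integer; the names `ordered` and `result` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "A", "B", "C", "C", "C", "D", "D"],
    "ID": ["1", "1", "2", "3", "3", "4", "4", "4"],
    "duration": ["600", "3000", "3000", "600", "3000", "3000", "600", "3000"],
})

counts = pd.crosstab(index=[df["Name"], df["ID"]], columns=df["duration"])

# Assumption: every duration label is an integer-like string,
# so sorting by int() puts '600' before '3000'.
ordered = sorted(counts.columns, key=int)

# Reorder the columns, drop the column-axis name, and turn the
# Name/ID index levels back into regular columns.
result = counts[ordered].rename_axis(columns=None).reset_index()

print(result)
```

If duration is really numeric data, converting it once with `df["duration"].astype(int)` before the crosstab would probably be the cleaner choice, since the columns would then sort numerically on their own.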

You can also use pivot_table as an alternative approach:

counts = df.pivot_table(
    index=["Name", "ID"], columns=["duration"], aggfunc="size", fill_value=0
)

counts = counts.rename_axis(columns=None)

print(counts)
         3000  600
Name ID           
A    1      1    1
B    2      1    0
C    3      1    1
     4      1    0
D    4      1    1
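
Since the question started from groupby, here is a hedged sketch of what the missing step might look like: count the rows per (Name, ID, duration) combination with `size()`, then move the duration level into the columns with `unstack`. This is not part of the original answer, just an equivalent route under the same sample data.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "A", "B", "C", "C", "C", "D", "D"],
    "ID": ["1", "1", "2", "3", "3", "4", "4", "4"],
    "duration": ["600", "3000", "3000", "600", "3000", "3000", "600", "3000"],
})

counts = (
    df.groupby(["Name", "ID", "duration"])
      .size()                               # rows per (Name, ID, duration)
      .unstack("duration", fill_value=0)    # durations become columns, gaps become 0
)

print(counts)
```

The `fill_value=0` argument matters here: without it, combinations that never occur (such as B/2 with 600) would show up as NaN rather than 0.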


I am trying to group the data by Name and ID, and then I want to simplify the dataframe by counting whether 600 and/or 3000 occur, and summarize that in a grouped df.

@G.Anderson thanks for pointing out this answer! Good to know.

This works perfectly, thank you very much!
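
The comment above asks whether each duration occurs, which is slightly different from how often it occurs. In the sample data every count is already 0 or 1, so the two coincide. As a sketch for data where a (Name, ID) pair could repeat a duration, the counts can be collapsed into 0/1 flags; the name `flags` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "A", "B", "C", "C", "C", "D", "D"],
    "ID": ["1", "1", "2", "3", "3", "4", "4", "4"],
    "duration": ["600", "3000", "3000", "600", "3000", "3000", "600", "3000"],
})

counts = pd.crosstab(index=[df["Name"], df["ID"]], columns=df["duration"])

# Any count above zero becomes 1, so repeated durations within a
# group are reported as "present" rather than as a frequency.
flags = counts.gt(0).astype(int)

print(flags)
```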