Python pandas: get the frequency of a column's values in a grouped df
I have a dataframe that looks like this:
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'A', 'B', 'C', 'C', 'C', 'D', 'D'],
                   'ID': ['1', '1', '2', '3', '3', '4', '4', '4'],
                   'duration': ['600', '3000', '3000', '600', '3000', '3000', '600', '3000']})
I would like to get something like this:
Name  ID  600  3000
A     1   1    1
B     2   0    1
C     3   1    1
C     4   0    1
D     4   1    1
I tried using groupby, but I seem to be missing a step.

You can use pd.crosstab to do this:
counts = pd.crosstab(index=[df["Name"], df["ID"]], columns=df["duration"])
# Remove the name of the column array. It throws some people off to look at
counts = counts.rename_axis(columns=None).reset_index()
print(counts)
  Name ID  3000  600
0    A  1     1    1
1    B  2     1    0
2    C  3     1    1
3    C  4     1    0
4    D  4     1    1
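The output above lists 3000 before 600 because the duration values are strings and therefore sort lexicographically. If the 600-then-3000 order from the question matters, one option is to reorder the columns explicitly. This is a minimal sketch rather than part of the original answer; it assumes the two duration labels are known in advance:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'A', 'B', 'C', 'C', 'C', 'D', 'D'],
                   'ID': ['1', '1', '2', '3', '3', '4', '4', '4'],
                   'duration': ['600', '3000', '3000', '600', '3000', '3000', '600', '3000']})

counts = pd.crosstab(index=[df["Name"], df["ID"]], columns=df["duration"])

# Assumption: the wanted column order is known up front.
# fill_value=0 supplies a zero column for a label that never occurs in the data.
counts = counts.reindex(columns=["600", "3000"], fill_value=0)

# Drop the "duration" label on the column axis and turn Name/ID back into columns
counts = counts.rename_axis(columns=None).reset_index()
print(counts)
```

Converting duration to integers before calling crosstab would also give a numeric sort order, at the cost of integer column labels.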
You can also use pivot_table as an alternative:
counts = df.pivot_table(
    index=["Name", "ID"], columns=["duration"], aggfunc="size", fill_value=0
)
counts = counts.rename_axis(columns=None)
print(counts)
         3000  600
Name ID
A    1      1    1
B    2      1    0
C    3      1    1
     4      1    0
D    4      1    1
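Since the question mentions trying groupby and missing a step, here is a sketch of how the same table might be built that way. It is not part of the answers above; the idea is that the missing step is probably unstack, which moves the duration level of the grouped counts into columns:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'A', 'B', 'C', 'C', 'C', 'D', 'D'],
                   'ID': ['1', '1', '2', '3', '3', '4', '4', '4'],
                   'duration': ['600', '3000', '3000', '600', '3000', '3000', '600', '3000']})

# Count rows per (Name, ID, duration), then pivot the duration level into columns.
# fill_value=0 replaces the NaN that would otherwise mark a missing combination.
grouped_counts = (
    df.groupby(["Name", "ID", "duration"])
      .size()
      .unstack("duration", fill_value=0)
)

# Drop the "duration" label on the column axis and turn Name/ID back into columns
grouped_counts = grouped_counts.rename_axis(columns=None).reset_index()
print(grouped_counts)
```

For this data the counts should match the crosstab and pivot_table results, so the choice between the three is mostly a matter of readability.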
I am trying to group the data by Name and ID, and then simplify the dataframe by counting whether 600 and/or 3000 occur, summarizing that in one grouped df.
@G.Anderson Thanks for pointing out that answer! Good to know.
Works perfectly, thanks a lot!