Python: group by columns and get the frequency of 0
I have a DataFrame that I want to group by Col1, Col2, and Col3 to get the frequency of 0 in the Value column:
df=
How can I apply groupby to get this result?
Col1  Col2  Col3  Percentage_of_0
Val1  Val2  A     0.2
Val1  Val2  B     0.8
...
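Thanks, everyone!
A simple lambda function can do this for you. Build a list of the entries where Value == 0, then divide the length of that list by the number of items in the group. That gives you the fraction of zeros.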
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})
# Fraction of zeros per group: number of zeros divided by group size
df.groupby(["Col1", "Col2", "Col3"]).agg(
    {"Value": lambda x: len([v for v in x if v == 0]) / len(x)}
)
Output
                Value
Col1 Col2 Col3
Val1 Val2 A       0.4
          B       0.8
Use groupby on the DataFrame, then apply the size() method to the result. For example, suppose you have created a DataFrame named df with these values:
df = pd.DataFrame({'Col1': ['Val1','Val1','Val1','Val1','Val1','Val1','Val1','Val1'],
'Col2': ['Val2','Val2','Val2','Val2','Val2','Val2','Val2','Val2'],
'Col3': ['A','A','A','A','B','B','B','B'],
'Value':[0,1,2,0,0,0,0,1]})
Then you can count how often each Value occurs within each group using
df.groupby(['Col1','Col2','Col3','Value']).size()
Col1  Col2  Col3  Value
Val1  Val2  A     0        2
                  1        1
                  2        1
            B     0        3
                  1        1
dtype: int64
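Note that size() returns counts, not the fraction the question asks for. One way to turn those counts into per-group shares is sketched below, assuming the eight-row df from this answer; the names keys, counts, totals, shares, and zero_share are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 8,
    "Col2": ["Val2"] * 8,
    "Col3": ["A"] * 4 + ["B"] * 4,
    "Value": [0, 1, 2, 0, 0, 0, 0, 1],
})

keys = ["Col1", "Col2", "Col3"]

# Count of each Value within each group (same as the size() call above)
counts = df.groupby(keys + ["Value"]).size()

# Total rows per group, broadcast back onto the index of counts
totals = counts.groupby(level=keys).transform("sum")

# Share of each Value within its group
shares = counts / totals

# Keep only the entries where Value == 0
zero_share = shares.xs(0, level="Value")
print(zero_share)
```

A group that contains no zeros at all would simply be missing from zero_share rather than showing 0.0, so the boolean-mean approaches in the other answers are probably the safer choice if that case can occur.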
Here is another approach that doesn't use a lambda, which seems easier to understand to me:
df['is_zero'] = df['Value'] == 0
df.groupby(['Col1', 'Col2', 'Col3'])['is_zero'].mean()
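The reason mean() works here is that pandas treats True as 1 and False as 0, so the mean of a boolean column is the number of True values divided by the number of rows. A small sketch that makes this visible, assuming the ten-row df from the first answer (the output column names zeros, rows, and fraction are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

df["is_zero"] = df["Value"] == 0
grouped = df.groupby(["Col1", "Col2", "Col3"])["is_zero"]

# True counts as 1 and False as 0, so sum() is the number of zeros,
# count() is the number of rows, and mean() is their ratio.
summary = grouped.agg(zeros="sum", rows="count", fraction="mean")
print(summary)
```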
Create a boolean column that is True where Value equals 0, then group by the Col columns:
(
df.assign(Percentage_Of_0=lambda x: x.Value.eq(0))
.groupby(["Col1", "Col2", "Col3"], as_index=False)
.Percentage_Of_0.mean()
)
   Col1  Col2 Col3  Percentage_Of_0
0  Val1  Val2    A              0.4
1  Val1  Val2    B              0.8
df['Value'].eq(0).groupby([df['Col1'], df['Col2'], df['Col3']]).mean()?
@QuangHoang Thanks! Where did you learn that?
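For anyone comparing the variants, the one-liner from the comment groups the boolean Series directly by the key columns, so no helper column is added to df. A minimal sketch checking it against a lambda-based aggregation, assuming the ten-row df from the first answer (keys, direct, and via_lambda are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

keys = ["Col1", "Col2", "Col3"]

# Version from the comment: group the boolean Series by the key columns
direct = df["Value"].eq(0).groupby([df[k] for k in keys]).mean()

# Lambda-based version: zeros in the group divided by the group size
via_lambda = df.groupby(keys)["Value"].agg(lambda x: (x == 0).sum() / len(x))

print(direct)
print(via_lambda)
```

Both should give the same per-group fractions; the direct version leaves df unchanged and avoids calling a Python function once per group.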