Python Pandas GroupBy: show only groups with more than one unique feature value

python pandas


I have a dataframe df_things that looks like this, and I want to estimate the quality of a classification before training:

A    B     C      CLASS
-----------------------
al1  bal1  cal1   Ship
al1  bal1  cal1   Ship
al1  bal2  cal2   Ship
al2  bal2  cal2   Cow
al3  bal3  cal3   Car
al1  bal2  cal3   Car
al3  bal3  cal3   Car

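I want to group the rows by class to get an overview of how the features are distributed. Doing this with a groupby on, for example, column "B" gives me the result: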

CLASS  B 
-------------
ship   bal1  2 
       bal2  1
cow    bal2  2
car    bal2  1
       bal3  2

What I want is to show only the groups that have more than one distinct value, so that the result looks like this:

CLASS  B 
-------------
ship   bal1  2 
       bal2  1
car    bal2  1
       bal3  2

I'm a bit stuck. Any ideas?

You can use groupby with nunique to keep only the groups whose count of unique values is greater than 1:

v = df_things.groupby('CLASS').B.value_counts()
v[v.groupby(level=0).transform('nunique').gt(1)]

CLASS  B   
Car    bal3    2
       bal2    1
Ship   bal1    2
       bal2    1
Name: B, dtype: int64
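For anyone who wants to run this end to end, here is a minimal self-contained sketch. The dataframe is rebuilt from the table in the question, and the names `n_unique_b` and `result` are my own. It is a variant rather than the exact snippet above: it computes `nunique` on column `B` itself before counting.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df_things = pd.DataFrame({
    "A": ["al1", "al1", "al1", "al2", "al3", "al1", "al3"],
    "B": ["bal1", "bal1", "bal2", "bal2", "bal3", "bal2", "bal3"],
    "C": ["cal1", "cal1", "cal2", "cal2", "cal3", "cal3", "cal3"],
    "CLASS": ["Ship", "Ship", "Ship", "Cow", "Car", "Car", "Car"],
})

# Number of distinct B values per class, broadcast back to every row.
n_unique_b = df_things.groupby("CLASS")["B"].transform("nunique")

# Keep only rows whose class has more than one distinct B value,
# then count occurrences per (CLASS, B) pair.
result = df_things[n_unique_b > 1].groupby("CLASS")["B"].value_counts()
print(result)
```

One thing to be aware of: `transform('nunique')` applied to `v` counts distinct count values, not distinct `B` values. The two agree on this sample, but a class whose `B` values all occur equally often (say `bal1` once and `bal2` once) would be dropped, which is why the sketch computes `nunique` on `B` directly.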

A solution using crosstab:

s = pd.crosstab(df_things.CLASS, df_things.B)
s[s.ne(0).sum(1) > 1].replace(0, np.nan).stack()
CLASS  B   
Car    bal2    1.0
       bal3    2.0
Ship   bal1    2.0
       bal2    1.0
dtype: float64
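The crosstab idea can also be written as a standalone script. The sketch below is a variant, not the exact snippet above: instead of replacing zeros with NaN and relying on `stack()` to drop them (which, as far as I know, may depend on the pandas version), it stacks first and then filters out the zero counts explicitly, which also keeps the counts as integers. The variable names `s`, `multi` and `result` are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the table in the question
# (only the two columns needed here).
df_things = pd.DataFrame({
    "B": ["bal1", "bal1", "bal2", "bal2", "bal3", "bal2", "bal3"],
    "CLASS": ["Ship", "Ship", "Ship", "Cow", "Car", "Car", "Car"],
})

# Contingency table: one row per class, one column per B value.
s = pd.crosstab(df_things["CLASS"], df_things["B"])

# Keep classes that have a non-zero count in more than one B column.
multi = s[s.ne(0).sum(axis=1) > 1]

# Reshape to a (CLASS, B) indexed Series and drop the zero counts.
result = multi.stack().loc[lambda x: x > 0]
print(result)
```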

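Thank you very much, this is exactly what I was looking for! Thanks also for editing my post; it helped me ask the question more precisely.

I was just being a good citizen. Keep asking!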
