Python: remove rows that are associated with only one value in another column
Suppose I have a DataFrame like this:
item name gender
banana tom male
banana kate female
apple kate female
kiwi jim male
apple tom male
banana kimmy female
kiwi kate female
banana tom male
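Is there a way to drop the rows for names that are associated with (bought) fewer than 2 items? Also, I don't want to drop duplicates. So my desired output is: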
item name gender
banana tom male
banana kate female
apple kate female
apple tom male
kiwi kate female
banana tom male
@sammywemmy's solution:
df.loc[df.groupby('name').item.transform('size').ge(2)]
# Get each group
print(df.groupby('name').apply(lambda s: s.reset_index()))
# Turn each item into the number of rows in its group
df['group_size'] = df.groupby('name')['item'].transform('size')
print(df)
In this case, the same thing can be done on any column:
# Turn Each Item Into The Number of Rows in The Group
df['group_size'] = df.groupby('name')['gender'].transform('size')
print(df)
Note that each row now ends with the size of its group. tom appears 3 times, so every row where name == tom has 3 in group_size.
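To make the role of transform clearer, here is a minimal sketch that compares it with a plain groupby size on the question's data (the variable names per_group and per_row are just illustrative). The plain aggregation returns one value per name, while transform('size') broadcasts that value back onto every original row, which is what makes it usable as a row-by-row filter.

```python
import pandas as pd

df = pd.DataFrame({
    'item': ['banana', 'banana', 'apple', 'kiwi',
             'apple', 'banana', 'kiwi', 'banana'],
    'name': ['tom', 'kate', 'kate', 'jim',
             'tom', 'kimmy', 'kate', 'tom'],
})

# One value per group, indexed by name (4 entries for 4 names)
per_group = df.groupby('name').size()

# One value per original row, sharing df's index (8 entries)
per_row = df.groupby('name')['item'].transform('size')

print(per_group.to_dict())
print(per_row.tolist())
```

Because per_row shares the DataFrame's index, it can be compared against a threshold and passed directly to loc.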
# Add a condition that decides whether the row should be kept
df['should_keep'] = df.groupby('name')['item'].transform('size').ge(2)
print(df)
print(df.groupby('name')['item'].transform('size').ge(2))
loc includes every index where the mask is True and excludes every index where it is False. (Indexes 3 and 5 are False, so they are not included.)
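The masking step can be seen in isolation with a hand-written boolean Series. This is only a sketch: the three-row frame and the mask values are made up for illustration and are not the question's data.

```python
import pandas as pd

# Illustrative toy frame, not the data from the question
df = pd.DataFrame({'name': ['tom', 'jim', 'tom']})

# A boolean mask aligned to df's index: keep rows 0 and 2, drop row 1
mask = pd.Series([True, False, True], index=df.index)

kept = df.loc[mask]
print(kept.index.tolist())
```

The original index labels are preserved, which is why the filtered result keeps gaps where rows were dropped.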
Putting it all together:
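import pandas as pd
df = pd.DataFrame({'item': {0: 'banana', 1: 'banana', 2: 'apple',
                            3: 'kiwi', 4: 'apple', 5: 'banana',
                            6: 'kiwi', 7: 'banana'},
                   'name': {0: 'tom', 1: 'kate', 2: 'kate',
                            3: 'jim', 4: 'tom', 5: 'kimmy',
                            6: 'kate', 7: 'tom'},
                   'gender': {0: 'male', 1: 'female',
                              2: 'female', 3: 'male',
                              4: 'male', 5: 'female',
                              6: 'female', 7: 'male'}})
# Keep only the rows whose name occurs in at least 2 rows
print(df.loc[df.groupby('name')['name'].transform('size').ge(2)])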
Thank you, it works! Could you explain df.loc[df.groupby('name').item.transform('size').ge(2)]? It would be great if you could explain it a little.
Thank you very much! This is much clearer to me now.
Output of print(df) after adding group_size:
item name gender group_size
0 banana tom male 3
1 banana kate female 3
2 apple kate female 3
3 kiwi jim male 1
4 apple tom male 3
5 banana kimmy female 1
6 kiwi kate female 3
7 banana tom male 3
Output of print(df) after adding should_keep:
item name gender group_size should_keep
0 banana tom male 3 True
1 banana kate female 3 True
2 apple kate female 3 True
3 kiwi jim male 1 False
4 apple tom male 3 True
5 banana kimmy female 1 False
6 kiwi kate female 3 True
7 banana tom male 3 True
Output of print(df.groupby('name')['item'].transform('size').ge(2)):
0 True
1 True
2 True
3 False
4 True
5 False
6 True
7 True
Name: item, dtype: bool
Final output:
item name gender
0 banana tom male
1 banana kate female
2 apple kate female
4 apple tom male
6 kiwi kate female
7 banana tom male
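One caveat: transform('size') counts rows per name, not distinct items. If "fewer than 2 items" is read as fewer than 2 different items (an assumption, since the question does not say which is meant), a variant using transform('nunique') could look like the sketch below. The names distinct_items and result are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'item': ['banana', 'banana', 'apple', 'kiwi',
             'apple', 'banana', 'kiwi', 'banana'],
    'name': ['tom', 'kate', 'kate', 'jim',
             'tom', 'kimmy', 'kate', 'tom'],
    'gender': ['male', 'female', 'female', 'male',
               'male', 'female', 'female', 'male'],
})

# Number of distinct items per name, broadcast back to every row
distinct_items = df.groupby('name')['item'].transform('nunique')

# Keep rows whose name bought at least 2 different items;
# duplicate rows are kept because nothing is de-duplicated here
result = df.loc[distinct_items.ge(2)]
print(result)
```

On the sample data both approaches keep the same rows. They would differ for a name that appears several times but always with the same item: the row count would pass the threshold, while the distinct count would not.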