Python: How to delete rows in a Pandas DataFrame using two columns as the condition?
Basically, I have a table like this:
Name Sport Frequency
Jonas Soccer 3
Jonas Tennis 5
Jonas Boxing 4
Mathew Soccer 2
Mathew Tennis 1
John Boxing 2
John Boxing 3
John Soccer 1
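To make the snippets in this thread reproducible, the table above can be built as a DataFrame. This is only a sketch: the variable name df matches the one used in the rest of the thread, and the default 0..7 index is an assumption that agrees with the row labels shown in the outputs further down.

```python
import pandas as pd

# Sample data copied from the table above; the default RangeIndex 0..7 is assumed.
df = pd.DataFrame(
    {
        "Name": ["Jonas", "Jonas", "Jonas", "Mathew", "Mathew", "John", "John", "John"],
        "Sport": ["Soccer", "Tennis", "Boxing", "Soccer", "Tennis", "Boxing", "Boxing", "Soccer"],
        "Frequency": [3, 5, 4, 2, 1, 2, 3, 1],
    }
)
print(df)
```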
Assume this is a regular table that I turn into a grouped DF with the groupby function, as follows:
table = df.groupby(['Name'])
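After creating the DataFrame, I want to delete, for each Name, every row of a sport other than Soccer whose frequency is greater than that Name's Soccer frequency.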
So I need to apply this condition within each group (the Name used in the groupby function). Expected result:
Name Sport Frequency
Jonas Soccer 3
Mathew Soccer 2
Mathew Tennis 1
John Soccer 1
Thanks for your support.
Here is one approach, iterating over the groups:
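pd.concat(
    [
        # temp holds the group's Soccer frequency (NaN on the other rows until filled)
        value.assign(temp=lambda x: x.loc[x.Sport == "Soccer", "Frequency"])
        .bfill()
        .ffill()
        # keep rows whose Frequency does not exceed the Soccer frequency
        .query("Frequency <= temp")
        .drop("temp", axis=1)
        for key, value in df.groupby("Name")
    ]
)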
Name Sport Frequency
7 John Soccer 1
0 Jonas Soccer 3
3 Mathew Soccer 2
4 Mathew Tennis 1
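The temp column in the snippet above works because assign aligns a Series on the index: only the Soccer row receives a value, and the other rows stay NaN until bfill/ffill spreads it through the group. A small sketch for a single group (the Jonas rows; the names jonas, soccer_only, with_temp, filled and kept are made up for illustration):

```python
import pandas as pd

# One group from the question's table (rows 0..2, all for Jonas).
jonas = pd.DataFrame(
    {
        "Name": ["Jonas", "Jonas", "Jonas"],
        "Sport": ["Soccer", "Tennis", "Boxing"],
        "Frequency": [3, 5, 4],
    }
)

# Selecting only the Soccer row yields a one-element Series with index [0].
soccer_only = jonas.loc[jonas.Sport == "Soccer", "Frequency"]

# assign aligns on the index, so rows 1 and 2 get NaN in temp.
with_temp = jonas.assign(temp=soccer_only)

# bfill/ffill copy the Soccer frequency to every row of the group.
filled = with_temp.bfill().ffill()

# Keep rows whose Frequency does not exceed the group's Soccer frequency.
kept = filled.query("Frequency <= temp").drop(columns="temp")
print(kept)
```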
Very nice!
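# Ordered categorical so that Soccer sorts first within each Name.
# Note: this relies on Soccer being the first value in df.Sport.unique().
sport_dtype = pd.api.types.CategoricalDtype(categories=df.Sport.unique(), ordered=True)
df = df.astype({"Sport": sport_dtype})
(
    df.sort_values(["Name", "Sport"], ascending=[False, True])
    .assign(temp=lambda x: x.loc[x.Sport == "Soccer", "Frequency"])
    .ffill()  # carry each Soccer frequency down to the rows of the same Name
    .query("Frequency <= temp")
    .drop("temp", axis=1)
)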
Name Sport Frequency
3 Mathew Soccer 2
4 Mathew Tennis 1
0 Jonas Soccer 3
7 John Soccer 1
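The sort above puts Soccer first within each Name only because Soccer happens to be the first value returned by df.Sport.unique() for this data. A sketch that makes the order explicit instead; the names others and ordered are hypothetical, and the DataFrame is rebuilt here only so the block runs on its own:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Jonas", "Jonas", "Jonas", "Mathew", "Mathew", "John", "John", "John"],
        "Sport": ["Soccer", "Tennis", "Boxing", "Soccer", "Tennis", "Boxing", "Boxing", "Soccer"],
        "Frequency": [3, 5, 4, 2, 1, 2, 3, 1],
    }
)

# Put Soccer first explicitly instead of relying on the row order of the data.
others = [s for s in df["Sport"].unique() if s != "Soccer"]
sport_dtype = pd.api.types.CategoricalDtype(categories=["Soccer", *others], ordered=True)

# After sorting, the first row of every Name is its Soccer row (if it has one).
ordered = df.astype({"Sport": sport_dtype}).sort_values(["Name", "Sport"])
print(ordered)
```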
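index = (
    df.assign(temp=lambda x: x.loc[x.Sport == "Soccer", "Frequency"])
    # fill within each Name only, so a value never leaks into another group
    .assign(temp=lambda x: x.groupby("Name")["temp"].transform(lambda s: s.ffill().bfill()))
    .query("Frequency <= temp")
    .index
)
df.loc[index]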
Name Sport Frequency
0 Jonas Soccer 3
3 Mathew Soccer 2
4 Mathew Tennis 1
7 John Soccer 1
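# Caution: temp2 is the share of Soccer rows per Name, not a frequency comparison,
# so this matches the expected output for the sample data only.
(
    df.assign(
        temp=df.Sport == "Soccer",
        temp2=lambda x: x.groupby("Name").temp.transform("mean"),
    )
    .query('Sport=="Soccer" or temp2>=0.5')
    .iloc[:, :3]
)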
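For comparison, the rule stated in the question can also be written directly: look up each Name's Soccer frequency and keep the rows that do not exceed it. This is a sketch, not one of the posted answers. It assumes each Name has at most one Soccer row, and the names soccer_freq, limit and result are made up for illustration. A Name with no Soccer row gets NaN as its limit and is dropped entirely, which may or may not be what is wanted.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Jonas", "Jonas", "Jonas", "Mathew", "Mathew", "John", "John", "John"],
        "Sport": ["Soccer", "Tennis", "Boxing", "Soccer", "Tennis", "Boxing", "Boxing", "Soccer"],
        "Frequency": [3, 5, 4, 2, 1, 2, 3, 1],
    }
)

# Soccer frequency per Name, as a Series indexed by Name
# (assumes at most one Soccer row per Name, so the index is unique).
soccer_freq = df.loc[df["Sport"] == "Soccer"].set_index("Name")["Frequency"]

# Map every row's Name to that Name's Soccer frequency.
limit = df["Name"].map(soccer_freq)

# Keep rows whose Frequency does not exceed the Soccer frequency of the same Name.
result = df[df["Frequency"] <= limit]
print(result)
```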