Python pandas: trying to drop rows based on a for loop?

python pandas for-loop filter

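I have a dataframe made up of several columns, followed by two columns, x and y, both filled with numbers from 1 to 3. I want to drop all rows where the number in x is less than the number in y. For example, if x=1 and y=3 in a row, I want to drop that entire row. This is the code I have written so far: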

for num1 in df.x:
    for num2 in df.y:
        if (num1< num2):
            df.drop(df.iloc[num1], inplace = True)

Any help would be much appreciated. Thanks!

I would avoid loops in your scenario and just use .drop:

df.drop(df[df['x'] < df['y']].index, inplace=True)

For example:
import numpy as np
import pandas as pd

df = pd.DataFrame({'x':np.random.randint(0,4,5), 'y':np.random.randint(0,4,5)})

>>> df
   x  y
0  1  2
1  2  1
2  3  1
3  2  1
4  1  3

df.drop(df[df['x'] < df['y']].index, inplace = True)

>>> df
   x  y
1  2  1
2  3  1
3  2  1
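
Since the example above relies on random data, here is a small deterministic sketch of the same drop call, using the values from the sample output written as fixed lists. The name rows_to_drop is only illustrative and not part of the original answer.

```python
import pandas as pd

# Fixed data matching the sample output shown above (not random).
df = pd.DataFrame({'x': [1, 2, 3, 2, 1], 'y': [2, 1, 1, 1, 3]})

# Index labels of the rows where x is less than y.
rows_to_drop = df[df['x'] < df['y']].index

# drop removes rows by index label; without inplace=True it returns a new frame.
result = df.drop(rows_to_drop)
print(result)
```

Note that drop works on index labels, so if the index contained duplicate labels, every row sharing a dropped label would be removed as well.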

[Edit]: or, more simply, without using drop:
df=df[~(df['x'] < df['y'])]
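
For the data in the question (whole numbers from 1 to 3), negating x < y keeps the same rows as testing x >= y. As a hedged aside, the two forms can differ when a column contains missing values, because comparisons against NaN evaluate to False. A minimal sketch with made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in x.
df = pd.DataFrame({'x': [1.0, np.nan, 3.0], 'y': [2.0, 1.0, 1.0]})

# Keeps rows where "x < y" is False, which includes the NaN row.
negated = df[~(df['x'] < df['y'])]

# Keeps rows where "x >= y" is True, which excludes the NaN row.
direct = df[df['x'] >= df['y']]

print(negated.index.tolist())  # [1, 2]
print(direct.index.tolist())   # [2]
```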

Writing two nested for loops is very inefficient. Instead, you can compare the two columns directly:
df['x'] >= df['y']

This returns a boolean array (one True/False value per row) that you can use to filter the dataframe:
df[df['x'] >= df['y']]
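
To make the mask idea concrete, here is a minimal sketch with made-up values; mask and filtered are illustrative names, not something from the original answer.

```python
import pandas as pd

# Hypothetical three-row example.
df = pd.DataFrame({'x': [1, 3, 2], 'y': [3, 1, 2]})

# Element-wise comparison: one True/False value per row.
mask = df['x'] >= df['y']
print(mask.tolist())  # [False, True, True]

# Indexing with the mask keeps only the rows marked True.
filtered = df[mask]
print(filtered)
```

The comparison is done row by row, which is what the question asks for. The nested loops in the question instead compare every value of x against every value of y.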

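I think the best option is boolean indexing, or query, with the condition changed to >=:
df = df[df['x'] >= df['y']]

Or:
df = df.query('x >= y')

Sample:
df = pd.DataFrame({'x':[1,2,3,2], 'y':[0,4,5,1]})
print (df)
   x  y
0  1  0
1  2  4
2  3  5
3  2  1

df = df[df['x'] >= df['y']]
print (df)
   x  y
0  1  0
3  2  1

Can you post a small reproducible data set (as text/CSV/Python code) along with your desired output?

But why not just filter? I don't think drop is needed here.

Yes, that's right. I was just following the OP's general approach, but you can do df[~(df['x'] < df['y'])], which achieves the same thing (or use one of the other filtering methods outlined in the other answers).

Hmm, I think drop is an overcomplicated solution; ~ with < is better, but best is the >= condition.

@jezrael, I just timed these 4 methods (drop, filter with ~, filter with >=, and query) on a large random df. drop and query were both very slow, while filtering with ~ and with >= were very close, with >= slightly faster (so you were right). The keyword is "slightly" (10e-4 seconds faster on average).

@sacul - yes, it is slower here because more functions are called :)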
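
The comments above compare the speed of four approaches. The following is a rough sketch of how such a comparison could be set up with timeit. The frame size, the seed, the repeat count, and the names big and methods are all assumptions made for illustration; the commenter's actual benchmark is not shown in the thread, and timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: size and seed are arbitrary choices.
rng = np.random.default_rng(0)
big = pd.DataFrame({'x': rng.integers(1, 4, 100_000),
                    'y': rng.integers(1, 4, 100_000)})

# The four approaches discussed in the thread; none of them modifies big.
methods = {
    'drop': lambda: big.drop(big[big['x'] < big['y']].index),
    'filter with ~': lambda: big[~(big['x'] < big['y'])],
    'filter with >=': lambda: big[big['x'] >= big['y']],
    'query': lambda: big.query('x >= y'),
}

# Average wall-clock time per call over 20 runs of each approach.
for name, func in methods.items():
    seconds = timeit.timeit(func, number=20) / 20
    print(f'{name:15s} {seconds:.6f} s per call')
```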