Python: validating DataFrame column data


I have the pseudocode below, and I need to write it with pandas:

if group_min_size && group_max_size
  if group_min_size == 0 && group_max_size > 0
    if group_max_size >= 2
      errors.add(:group_min_size, "must be greater than or equal to 2 and less than or equal to group_max_size (#{group_max_size})")
    end

    if group_max_size < 2
      errors.add(:group_min_size, "must be greater than 2")
      errors.add(:group_max_size, "must be greater than 2")
    end
  end

  if group_min_size > 0 && group_max_size == 0
    if group_min_size >= 2
      errors.add(:group_max_size, "must be greater than or equal to #{group_min_size}")
    end

    if group_min_size < 2
      errors.add(:group_min_size, "must be greater than 2")
      errors.add(:group_max_size, "must be greater than 2")
    end
  end
end

I tried this for the following part:

if group_min_size == 0 && group_max_size > 0
  if group_max_size >= 2
    errors.add(:group_min_size, "must be greater than or equal to 2 and less than or equal to group_max_size (#{group_max_size})")
  end
end

But it does not work as expected.

Here is my test data:

   group_min_size  group_max_size
0             0.0             1.0
1            10.0            20.0
2             0.0             3.0
3             3.0             0.0
4             NaN             NaN
5             2.0             2.0
6             2.0             2.0
7             2.0             2.0
8             2.0             2.0

According to the pseudocode logic, the output should be:

False
True 
False
False
True
True
True
True
True

How can I write this logic with pandas?

Work through the problem step by step. Start by creating the boolean masks:

min_equal_0 = df['group_min_size'] == 0
min_above_0 = df['group_min_size'] > 0
min_above_equal_2 = df['group_min_size'] >= 2
min_below_2 = df['group_min_size'] < 2

max_equal_0 = df['group_max_size'] == 0
max_above_0 = df['group_max_size'] > 0
max_above_equal_2 = df['group_max_size'] >= 2
max_below_2 = df['group_max_size'] < 2
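
The `first_mask` and `second_mask` names used in the next step are not defined in the text that survives here. The following is a minimal sketch, assuming each one negates one of the two outer branches of the pseudocode. That assumption rests on the observation that, inside each outer branch, either the `>= 2` or the `< 2` sub-branch always adds an error, so a row is invalid whenever the outer condition holds. The definitions are a reconstruction, not necessarily the answer's original code.

```python
import numpy as np
import pandas as pd

# Test data copied from the question
df = pd.DataFrame({
    "group_min_size": [0, 10, 0, 3, np.nan, 2, 2, 2, 2],
    "group_max_size": [1, 20, 3, 0, np.nan, 2, 2, 2, 2],
})

min_equal_0 = df["group_min_size"] == 0
min_above_0 = df["group_min_size"] > 0
max_equal_0 = df["group_max_size"] == 0
max_above_0 = df["group_max_size"] > 0

# Assumed definitions (hypothetical reconstruction):
# a row passes the first check unless min == 0 while max > 0 ...
first_mask = ~(min_equal_0 & max_above_0)
# ... and passes the second check unless min > 0 while max == 0.
second_mask = ~(min_above_0 & max_equal_0)

# Comparisons against NaN evaluate to False, so the NaN row passes both checks.
result = first_mask & second_mask
print(result)
```
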

If we combine the two masks:

>> first_mask & second_mask

0    False
1     True
2    False
3    False
4     True
5     True
6     True
7     True
8     True
dtype: bool

If you want NaN to be treated as False, just add the not-null checks:

min_is_not_null = df['group_min_size'].notnull()
max_is_not_null = df['group_max_size'].notnull()
>> min_is_not_null & max_is_not_null & first_mask & second_mask
0    False
1     True
2    False
3    False
4    False
5     True
6     True
7     True
8     True
dtype: bool
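
To put the pieces together, here is a hedged sketch that wraps the checks in one function. The function name `validate_group_sizes` and the `nan_is_valid` parameter are hypothetical, introduced only for illustration. The invalid-row conditions are the same assumption as above: a row fails when exactly one of the two sizes is 0 and the other is positive.

```python
import numpy as np
import pandas as pd

def validate_group_sizes(df, nan_is_valid=False):
    """Return a boolean Series: True where a row passes the size checks."""
    min_size = df["group_min_size"]
    max_size = df["group_max_size"]

    # Every sub-branch of the pseudocode adds an error in these two cases.
    only_max_set = (min_size == 0) & (max_size > 0)
    only_min_set = (min_size > 0) & (max_size == 0)
    valid = ~(only_max_set | only_min_set)

    if not nan_is_valid:
        # Treat rows with a missing value in either column as invalid.
        valid = valid & min_size.notna() & max_size.notna()
    return valid

# Usage with the test data from the question
df = pd.DataFrame({
    "group_min_size": [0, 10, 0, 3, np.nan, 2, 2, 2, 2],
    "group_max_size": [1, 20, 3, 0, np.nan, 2, 2, 2, 2],
})
strict = validate_group_sizes(df)
lenient = validate_group_sizes(df, nan_is_valid=True)
print(strict)
```

Keeping the NaN handling behind a flag makes both outputs shown above reproducible from the same function, and further validation rules can be added as extra masks combined with `&`.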

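@MohamedThasinah I have already mentioned what I tried.
Break the logic down into the separate conditions. A code implementation is provided; the fourth one will also be True.
Thanks for the explanation. I was trying to put everything together at once, and that was too complicated for me. I will start from this approach for any further validation.
np! Writing pseudocode is good, but I think the nested if statements are what made it confusing.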