Python: cannot get a result with np.where: if two of three columns have the same value, keep that value, otherwise keep another value depending on a condition

Tags: python, python-3.x, pandas, numpy

I have a DataFrame with the following values:

AF_SC       TB_SC       VS_SC   
negative    negative    negative
positive    positive    positive
neutral     negative    negative
negative    negative    positive
positive    positive    neutral
negative    negative    positive
neutral     positive    neutral
negative    positive    positive
negative    positive    neutral
What I am trying to do is get a result column whose values are based on the following conditions:

1. if values in col AF_SC and TB_SC are same, then 'result' col will have values of AF_SC (or TB_SC, as both are same)

2. if values in col TB_SC and VS_SC are same, then 'result' col will have values of TB_SC (or VS_SC, as both are same)

3. if values in col AF_SC and VS_SC are same, then 'result' col will have values of AF_SC (or VS_SC, as both are same)

4. otherwise 'result' col will have values as 'neutral'
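In other words, if two of the three columns share the value 'negative', the 'result' column should be 'negative'; likewise, if two of the three columns share the value 'positive', the 'result' column should be 'positive'. If one column is 'positive', another is 'negative', and the third is 'neutral' (i.e. all three values differ), the 'result' column should be 'neutral'.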

The resulting DataFrame should look like this:

AF_SC       TB_SC       VS_SC       Result
negative    negative    negative    negative
positive    positive    positive    positive
neutral     negative    negative    negative
negative    negative    positive    negative
positive    positive    neutral     positive
negative    negative    positive    negative
neutral     positive    neutral     neutral
negative    positive    positive    positive
negative    positive    neutral     neutral
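
As a cross-check on the rules above, the expected Result column can be reproduced with a plain-Python majority vote over each row. This is only an illustrative sketch: the helper name `majority_or_neutral` and the `rows` list are hypothetical and not part of the original question, and no attempt is made to vectorize it.

```python
from collections import Counter

def majority_or_neutral(af, tb, vs):
    """Return the value shared by at least two of the three inputs, else 'neutral'."""
    value, count = Counter([af, tb, vs]).most_common(1)[0]
    return value if count >= 2 else "neutral"

# The nine (AF_SC, TB_SC, VS_SC) rows from the question
rows = [
    ("negative", "negative", "negative"),
    ("positive", "positive", "positive"),
    ("neutral", "negative", "negative"),
    ("negative", "negative", "positive"),
    ("positive", "positive", "neutral"),
    ("negative", "negative", "positive"),
    ("neutral", "positive", "neutral"),
    ("negative", "positive", "positive"),
    ("negative", "positive", "neutral"),
]

results = [majority_or_neutral(*row) for row in rows]
print(results)
```

Note that a row with two 'neutral' values also yields 'neutral' under this rule, which agrees with the seventh row of the expected table.
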
I tried to do this with the np.where method:

df['result'] = np.where((df['AF_SC'] == df['TB_SC']) or (df['AF_SC'] == df['VS_SC']), df['AF_SC'], 
                         np.where((df['TB_SC'] == df['VS_SC']), df['TB_SC'], "neutral"))
Unfortunately, it gives me an error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I don't know what mistake I am making.

Apart from fixing that, is there any other way to achieve the result I want?

One possible approach is numpy.select, chaining the conditions with the bitwise OR operator (|):

m1 = df['AF_SC'] == df['TB_SC']
m2 = df['AF_SC'] == df['VS_SC']
m3 = df['TB_SC'] == df['VS_SC']
df['result'] = np.select([m1 | m2, m3], [df['AF_SC'], df['TB_SC']], "neutral")
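
When several conditions are passed to `np.select`, the first one that matches wins for each element, which is why the order of the condition list matters. A small sketch on a toy array (the names `x` and `labels` are made up for illustration and do not come from the answer) may make this clearer:

```python
import numpy as np

x = np.array([-1, 2, 5])

# 5 satisfies both conditions; np.select takes the choice
# belonging to the first condition that is True.
labels = np.select([x > 3, x > 0], ["big", "positive"], default="other")
print(labels)
```

In the answer's code, a row where all three columns agree matches both `m1 | m2` and `m3`; it takes the value from `AF_SC` because that condition is listed first.
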
Your own solution should be changed to use | instead of or:

df['result'] = np.where((df['AF_SC'] == df['TB_SC']) | 
                         (df['AF_SC'] == df['VS_SC']), df['AF_SC'], 
               np.where((df['TB_SC'] == df['VS_SC']), df['TB_SC'], "neutral"))
print (df)
      AF_SC     TB_SC     VS_SC    result
0  negative  negative  negative  negative
1  positive  positive  positive  positive
2   neutral  negative  negative  negative
3  negative  negative  positive  negative
4  positive  positive   neutral  positive
5  negative  negative  positive  negative
6   neutral  positive   neutral   neutral
7  negative  positive  positive  positive
8  negative  positive   neutral   neutral
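
The reason the original attempt fails is that Python's `or` needs a single True/False value from its left operand, and pandas refuses to reduce a multi-element Series to one boolean. The element-wise operator `|` avoids that. The following sketch reproduces both behaviours on three tiny Series (`a`, `b`, `c` are hypothetical stand-ins for the columns):

```python
import pandas as pd

a = pd.Series(["negative", "positive"])
b = pd.Series(["negative", "neutral"])
c = pd.Series(["positive", "positive"])

# `or` calls bool() on the Series (a == b), which raises ValueError
try:
    mask = (a == b) or (a == c)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# `|` combines the two boolean Series element by element.
# The parentheses are required because | binds tighter than ==.
mask = (a == b) | (a == c)

print(error_message)
print(mask.tolist())
```
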
Using pandas' native where():
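
A minimal sketch of how `Series.where` could express the same logic, assuming the same three comparisons used in the other answer. This is an assumption about the approach, not the answerer's actual code; the mask names `m1`, `m2`, `m3` are borrowed from the `np.select` answer.

```python
import pandas as pd

df = pd.DataFrame({
    "AF_SC": ["negative", "positive", "neutral", "negative", "positive",
              "negative", "neutral", "negative", "negative"],
    "TB_SC": ["negative", "positive", "negative", "negative", "positive",
              "negative", "positive", "positive", "positive"],
    "VS_SC": ["negative", "positive", "negative", "positive", "neutral",
              "positive", "neutral", "positive", "neutral"],
})

m1 = df["AF_SC"] == df["TB_SC"]
m2 = df["AF_SC"] == df["VS_SC"]
m3 = df["TB_SC"] == df["VS_SC"]

# Series.where keeps the original value where the condition is True
# and takes the replacement elsewhere.
fallback = df["TB_SC"].where(m3, "neutral")
df["result"] = df["AF_SC"].where(m1 | m2, fallback)
print(df)
```

The fallback is built first so that rows matching neither `m1 | m2` nor `m3` end up as "neutral".
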

You should really avoid using numpy.where like this and use a more idiomatic alternative instead.

@Alexander Cécile - But why? It's an attempt at "lean" code, isn't it? Maybe you could help me with a better example, since I'm still learning Python :)

What do you mean by a lean attempt?

Lean as in "better-performing code" or "less complicated code".

Ah, OK, that makes sense. I'll try to find a solution tomorrow :)

What is the benefit of this compared with the first approach proposed by jezrael?

It is clearly cleaner (in terms of logic) and shorter. Performance-wise, I suppose it won't make much difference (but I haven't checked). In addition, there is a precedence issue in the assignments (i.e. conditions may overwrite each other's results). This way we guarantee the correct order of assignment.