Python: assigning values to a new column based on multiple string conditions
What I have: What I want to create: What I have tried:

Currently, I get a blank "Big" column.

For multiple if/elif conditions like this, you can use np.select:

import numpy as np

# Labels, listed in the same order as the conditions below
choices = ['True Positive', 'False Negative', 'False Positive']
conditions = [
    df['Actual_Size'].isin(['BIG']) & df['Possible_Size'].isin(['BIG']),
    df['Actual_Size'].isin(['BIG']) & df['Possible_Size'].isin(['MEDIUM', 'SMALL']),
    df['Actual_Size'].isin(['MEDIUM', 'SMALL']) & df['Possible_Size'].isin(['BIG']),
]
# Rows that match none of the conditions get an empty string
df['Big'] = np.select(conditions, choices, default='')

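If you want to keep your original solution, the problem is that your function does not return anything when it is applied row by row, so you can try this: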

def sizes(row):
    # Return the label so that apply() can collect it for each row
    if row['Actual_Size'] in ['BIG'] and row['Possible_Size'] in ['BIG']:
        return 'True Positive'
    elif row['Actual_Size'] in ['BIG'] and row['Possible_Size'] in ['MEDIUM', 'SMALL']:
        return 'False Negative'
    elif row['Actual_Size'] in ['MEDIUM', 'SMALL'] and row['Possible_Size'] in ['BIG']:
        return 'False Positive'
    else:
        return ''

df['Big'] = df.apply(sizes, axis=1)

Both approaches give the same output:

df
     ID Possible_Size Actual_Size             Big
0  1234           BIG         BIG   True Positive
1  5678        MEDIUM         BIG  False Negative
2  9876           BIG       SMALL  False Positive
3  1092        MEDIUM      MEDIUM                
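
To check that the two approaches agree, here is a minimal self-contained sketch. The DataFrame is an assumption: it is reconstructed from the output table above, since the question's original data is not shown. The names result_select and result_apply are only illustrative.

```python
import numpy as np
import pandas as pd

# Assumed sample data, rebuilt from the output table above
df = pd.DataFrame({
    'ID': [1234, 5678, 9876, 1092],
    'Possible_Size': ['BIG', 'MEDIUM', 'BIG', 'MEDIUM'],
    'Actual_Size': ['BIG', 'BIG', 'SMALL', 'MEDIUM'],
})

# Vectorized version: one boolean mask per label, evaluated in order
choices = ['True Positive', 'False Negative', 'False Positive']
conditions = [
    df['Actual_Size'].isin(['BIG']) & df['Possible_Size'].isin(['BIG']),
    df['Actual_Size'].isin(['BIG']) & df['Possible_Size'].isin(['MEDIUM', 'SMALL']),
    df['Actual_Size'].isin(['MEDIUM', 'SMALL']) & df['Possible_Size'].isin(['BIG']),
]
result_select = np.select(conditions, choices, default='').tolist()

# Row-wise version: every branch returns a value
def sizes(row):
    if row['Actual_Size'] in ['BIG'] and row['Possible_Size'] in ['BIG']:
        return 'True Positive'
    elif row['Actual_Size'] in ['BIG'] and row['Possible_Size'] in ['MEDIUM', 'SMALL']:
        return 'False Negative'
    elif row['Actual_Size'] in ['MEDIUM', 'SMALL'] and row['Possible_Size'] in ['BIG']:
        return 'False Positive'
    else:
        return ''

result_apply = df.apply(sizes, axis=1).tolist()

# Compare the two label lists element by element
print(result_select == result_apply)
```

Comparing plain Python lists sidesteps any dtype difference between the NumPy array returned by np.select and the Series returned by apply.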

= is not ==
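Yes, thanks, I have tried both = and ==. After the change, I now just get a blank "Big" column.
Could you try printing row['Actual_Size'] to see what the variable contains? My best guess is that the strings have trailing whitespace. If so, you may need to call strip() before doing the comparison.
Yes, thank you very much! Just out of interest, is there any reason to use one solution over the other?
For multiple conditions like this I would use np.select because, as you may know, numpy offers better performance, whereas the other option (apply) can sometimes be slow.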
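
The trailing-whitespace diagnosis from the comments can be sketched as follows. The padded values here are hypothetical, invented only to reproduce the symptom. The sketch assumes both columns hold strings, so the pandas .str.strip() accessor applies.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the same sizes, but with stray spaces around some values
df = pd.DataFrame({
    'Possible_Size': ['BIG ', 'MEDIUM', ' BIG', 'MEDIUM '],
    'Actual_Size': ['BIG ', 'BIG', 'SMALL ', 'MEDIUM'],
})

# repr() makes otherwise invisible whitespace visible when printing
print(df['Actual_Size'].map(repr).tolist())

# Before cleaning, 'BIG ' does not match 'BIG'
before = df['Actual_Size'].isin(['BIG']).tolist()

# Strip leading and trailing whitespace from both string columns
for col in ['Possible_Size', 'Actual_Size']:
    df[col] = df[col].str.strip()

after = df['Actual_Size'].isin(['BIG']).tolist()

# With clean values the conditions match as intended
choices = ['True Positive', 'False Negative', 'False Positive']
conditions = [
    df['Actual_Size'].isin(['BIG']) & df['Possible_Size'].isin(['BIG']),
    df['Actual_Size'].isin(['BIG']) & df['Possible_Size'].isin(['MEDIUM', 'SMALL']),
    df['Actual_Size'].isin(['MEDIUM', 'SMALL']) & df['Possible_Size'].isin(['BIG']),
]
df['Big'] = np.select(conditions, choices, default='')
print(df)
```

Cleaning the columns once, before building the conditions, keeps the comparison logic unchanged for both np.select and apply.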