Python: check whether an element is contained in one or more lists

python pandas

I need help combining the following two steps into a single command:

df['Col2'] = df['Col1'].apply(part_is_in, values = list_1)
df['Col2'] = df['Col1'].apply(part_is_in, values = list_2)

where list_1 and list_2 are lists of strings, and

def part_is_in(x, values):
    # Return 'Yes' if any string in values is a substring of x, otherwise 'No'
    for val in values:
        if val in x:
            return 'Yes'
    return 'No'

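I want to check whether the elements in Col1 are in list_1 and/or list_2. Right now I am using sequential updates, but the second assignment overwrites the result of the first, so I would like to change the definition so that a value can be checked against more than one list. I also use the function above to check elements in other columns, so the single-list case needs to keep working.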

Any help would be greatly appreciated. Thanks.

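pandas has a function for this:

df[df['Col1'].isin(list_1 + list_2)]['Col1']

This returns the elements of column 'Col1' that are equal to an item in list_1 or list_2. Note that isin tests exact equality, not the substring match that part_is_in performs.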
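Since part_is_in in the question matches substrings while isin matches whole values, the two can disagree on the same data. The following is a minimal sketch of that difference, using made-up sample data (the column contents and list items are assumptions for illustration). It uses str.contains with an escaped regex alternation as one possible vectorized substring check, not necessarily what the answer's author had in mind.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the original question
df = pd.DataFrame({'Col1': ['red apple', 'banana', 'green pear', 'kiwi']})
list_1 = ['apple', 'kiwi']
list_2 = ['pear']

# Exact match: a value counts only if it equals a list item
exact = df['Col1'].isin(list_1 + list_2)

# Substring match: one regex alternation built from the escaped items
pattern = '|'.join(re.escape(s) for s in list_1 + list_2)
partial = df['Col1'].str.contains(pattern, regex=True)

# Same 'Yes'/'No' labels as part_is_in
df['Col2'] = np.where(partial, 'Yes', 'No')
```

If both lists are empty the pattern becomes an empty string, which matches every row, so that case would need a separate guard.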

Try this:

df['Col2'] = df['Col1'].apply(part_is_in, values = list_1 + list_2)
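Concatenating the lists works with the existing function unchanged. If a single definition that accepts one list or several is preferred, a variadic variant is one option. The sketch below is an assumption about what such a variant could look like. The name part_is_in_any and the sample data are hypothetical.

```python
import pandas as pd

def part_is_in_any(x, *value_lists):
    """Return 'Yes' if any string from any given list is a substring of x."""
    for values in value_lists:
        if any(val in x for val in values):
            return 'Yes'
    return 'No'

# Hypothetical sample data for illustration
df = pd.DataFrame({'Col1': ['red apple', 'banana', 'green pear']})
list_1 = ['apple']
list_2 = ['pear']

# Single-list case: pass a one-element tuple through args
df['one_list'] = df['Col1'].apply(part_is_in_any, args=(list_1,))

# Two lists checked in one pass
df['both_lists'] = df['Col1'].apply(part_is_in_any, args=(list_1, list_2))
```

Series.apply forwards the args tuple as extra positional arguments after each element, so the same function covers both the one-list and the multi-list call.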

If you are working with a very large dataset, you should avoid custom functions:

# Assume the columns you want to look at are listed in cols
new_cols = [f'{item}_1' for item in cols]
for old_col, new_col in zip(cols, new_cols):
    # values is any iterable of items to check against
    # isin flags each value in the column as True if it is in values
    # overwrite old_col if you can; otherwise this adds one new column per input column
    df[new_col] = df[old_col].isin(values)

# any(axis=1) reduces each row of flags to a single boolean:
# True if at least one of the checked columns holds a value found in values
df['result'] = df[new_cols].any(axis=1)
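The per-column membership check followed by a row-wise any can also be done without intermediate columns. Here is a small sketch under assumed data, where the column names, cols, and values are all hypothetical. Like the code above, it tests exact membership rather than substrings.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    'A': ['x', 'y', 'z'],
    'B': ['q', 'x', 'w'],
})
cols = ['A', 'B']
values = ['x']

# DataFrame.isin gives one boolean flag per cell
flags = df[cols].isin(values)

# Reduce across the columns: True if any checked column matched
df['result'] = flags.any(axis=1)
```

This keeps the work inside pandas rather than calling a Python function per element, which is usually what matters on large frames.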

Hi Cristian, thanks for your answer. I need to check against both list 1 and list 2.

I edited the answer for that. This way uses vectorized numpy, which is faster than a loop.