Python: check whether an element is contained in one or more lists


I need help combining the following two steps into a single command:

df['Col2'] = df['Col1'].apply(part_is_in, values = list_1)
df['Col2'] = df['Col1'].apply(part_is_in, values = list_2)
where list_1 and list_2 are lists of strings, and

def part_is_in(x, values):
    output = 'No'
    for val in values:
        if val in x:
            return 'Yes'
    return output
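I want to check whether the elements of Col1 are in list_1 and/or list_2. Right now I am using sequential updates, but I would like to change the definition so that a value can be checked against more than one list. I also use the function above to check elements in other columns, so I need to keep the single-list case working as well.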


Any help would be greatly appreciated. Thanks.

pandas has a function for this:

df[df['Col1'].isin(list1+list2)]['Col1']

This returns the elements of column 'Col1' that are in list1 or list2.
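One thing to keep in mind: isin tests for exact equality, whereas part_is_in in the question checks whether a list entry occurs as a substring of the cell. A minimal sketch of the difference, assuming made-up sample data (the frame and list contents below are hypothetical), with Series.str.contains as one possible vectorized substring check:

```python
import re

import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({'Col1': ['apple pie', 'banana', 'cherry tart', 'kiwi']})
list_1 = ['apple', 'kiwi']
list_2 = ['cherry']

# Exact match: True only when the whole cell equals a list entry.
exact = df['Col1'].isin(list_1 + list_2)

# Substring match: build one regex alternation from the escaped entries.
pattern = '|'.join(re.escape(s) for s in list_1 + list_2)
partial = df['Col1'].str.contains(pattern, regex=True)

# Map the boolean flags to the 'Yes'/'No' labels used in the question.
df['Col2'] = partial.map({True: 'Yes', False: 'No'})
```

Which of the two is appropriate depends on whether Col1 holds whole tokens or longer strings that merely contain them.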

Try this:

df['Col2'] = df['Col1'].apply(part_is_in, values = list_1 + list_2)
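If more than two lists are involved, or the single-list call has to keep working unchanged, one option is to let the function accept any number of lists. A sketch only: part_is_in_any is a hypothetical name, and the sample data is made up for illustration.

```python
from itertools import chain

import pandas as pd


def part_is_in_any(x, *value_lists):
    """Return 'Yes' if any string from any of the lists occurs in x."""
    for val in chain.from_iterable(value_lists):
        if val in x:
            return 'Yes'
    return 'No'


# Hypothetical sample data.
df = pd.DataFrame({'Col1': ['apple pie', 'banana', 'cherry tart']})
list_1 = ['apple']
list_2 = ['cherry']

# One list (the original use case) ...
df['Col2'] = df['Col1'].apply(part_is_in_any, args=(list_1,))
# ... or several lists in a single call.
df['Col3'] = df['Col1'].apply(part_is_in_any, args=(list_1, list_2))
```

Passing the lists through args avoids concatenating them first, which may matter little for short lists but keeps each list intact for other uses.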

If you are working with a very large dataset, you should avoid custom functions.

# Assume the columns you want to look at are listed in cols
new_cols = [f'{item}_1' for item in cols]
for old_col, new_col in zip(cols, new_cols):
    # values (an iterable) is whatever you want to check against;
    # this checks whether each value in the column is in values.
    # Overwrite old_col if you can; otherwise this adds many new columns.
    df[new_col] = df[old_col].map(lambda item: item in values)

# any(axis=1) combines the flags row by row:
# each element in row x is True or False, indicating whether
# the original value (from old_col at row x) is in values
df['result'] = df[new_cols].any(axis=1)
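For reference, the same idea can be sketched with built-in pandas operations only, without helper columns. The column names and values below are hypothetical and chosen just to make the example runnable:

```python
import pandas as pd

# Hypothetical data: two columns to check against one collection of values.
df = pd.DataFrame({
    'a': ['x', 'y', 'z'],
    'b': ['q', 'x', 'r'],
})
cols = ['a', 'b']
values = ['x', 'q']

# DataFrame.isin tests every cell for exact membership in values,
# and any(axis=1) reduces each row to a single True/False flag.
df['result'] = df[cols].isin(values).any(axis=1)
```

This performs exact matching; a substring check like the one in the question would need a different test per cell.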

Hi Cristian, thanks for your answer. I need to check in both list 1 and list 2.

I edited the answer for that. This way uses vectorized numpy, which is faster than a loop.