Python: using if, elif and else to create a new column in a DataFrame

python regex pandas if-statement


I am trying to create a new column called "Method" in a DataFrame. The current DataFrame is shown in the attached image:

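I am trying to use if/elif/else together with regex to create the new column, but when I run this code I only get the value from the else branch. Why doesn't this work, and how can I fix it?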

if 'posted' in df2.Full.astype(str) and '/ Outbound' in df2.TPrev.astype(str):
    df2['Method']='Classifieds Homepage Button'
elif 'ad posted' in df2.Full.astype(str) and 'thanks' in df2.TPrev.astype(str):
    df2['Method']='Header after Post'
elif 'ad posted' in df2.Full.astype(str) and '/myaccount/listing-classified Outbound' in df2.TPrev.astype(str):
    df2['Method']='My Listings Button'    
elif 'ad posted' in df2.Full.astype(str) and '/s/' in df2.TPrev.astype(str):
    df2['Method']='SRP'  
elif 'ad posted' in df2.Full.astype(str) and '/myaccount/listing-classified nan' in df2.TPrev.astype(str):
    df2['Method']='My Listings Button'
elif 'ad posted' in df2.Full.astype(str) and '/sell nan nan' in df2.TPrev and '/myaccount/listing-classified nan nan' in df2.Prev.astype(str):
    df2['Method']='My Listings Header'
elif 'ad posted' in df2.Full.astype(str) and '/listing/' in df2.TPrev.astype(str):
    df2['Method']='Detail Page Header'
elif 'ad posted' in df2.Full.astype(str) and '/search/' in df2.TPrev.astype(str):
    df2['Method']='SRP'
else:
    df2['Method']='Ignore'

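As people suggested in the comments, the problem is that assigning a single value to a column overwrites every row of that column with that value. What you want to do is: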

Instead of casting to str inside every condition, convert the whole DataFrame once:

df2 = df2.astype(str)

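You need logic that runs on each row of the DataFrame to decide the value of the "Method" column. The simplest way is to put your if/elif/else chain in a function and call it with apply: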
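A minimal sketch of the apply approach, under stated assumptions: the sample rows below are made up (the real df2 is only shown in the question's image), the function name `classify` is hypothetical, and only a subset of the question's rules is included to keep it short.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df2
df2 = pd.DataFrame({
    "Full": ["ad posted", "ad posted", "ad posted", "page view"],
    "TPrev": ["/ Outbound", "/thanks nan", "/s/cars nan", "/ Outbound"],
}).astype(str)


def classify(row):
    """Return the Method label for one row (subset of the question's rules)."""
    full, tprev = row["Full"], row["TPrev"]
    # Here `in` is a plain substring test, because full and tprev are strings
    if "posted" in full and "/ Outbound" in tprev:
        return "Classifieds Homepage Button"
    elif "ad posted" in full and "thanks" in tprev:
        return "Header after Post"
    elif "ad posted" in full and "/s/" in tprev:
        return "SRP"
    return "Ignore"


# axis=1 passes each row to classify as a Series
df2["Method"] = df2.apply(classify, axis=1)
print(df2)
```

Inside the function each field is an ordinary Python string, so the original `in` checks behave as substring tests, which is what the question intended.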

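That would be the most direct conversion, but I think a more elegant solution is np.select: build a list of boolean conditions and a matching list of choices. Example for the first 3 conditions: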

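import numpy as np

full = df2.Full.astype(str)
tprev = df2.TPrev.astype(str)

# One boolean Series per rule; for each row np.select uses the first condition that is True
conditions = [
    full.str.contains('posted', regex=False) & tprev.str.contains('/ Outbound', regex=False),
    full.str.contains('ad posted', regex=False) & tprev.str.contains('thanks', regex=False),
    full.str.contains('ad posted', regex=False) & tprev.str.contains('/myaccount/listing-classified Outbound', regex=False),
]
choices = ['Classifieds Homepage Button', 'Header after Post', 'My Listings Button']
df2['Method'] = np.select(conditions, choices, default='Ignore')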
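The same idea can be extended to all eight rules from the question. This is a sketch, not the answerer's code: the sample rows are invented, the helper `has` is a hypothetical convenience wrapper around `Series.str.contains`, and `regex=False` is an assumption that the patterns are meant as literal substrings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's df2
df2 = pd.DataFrame({
    "Full": ["ad posted", "ad posted", "ad posted", "ad posted", "page view"],
    "TPrev": ["/ Outbound", "/thanks nan", "/listing/123 nan",
              "/search/x nan", "/ Outbound"],
    "Prev": ["nan"] * 5,
}).astype(str)


def has(col, text):
    """Boolean Series: True where df2[col] contains text as a literal substring."""
    return df2[col].str.contains(text, regex=False)


ad = has("Full", "ad posted")
# Order matters, as with elif: the first True condition wins for each row
conditions = [
    has("Full", "posted") & has("TPrev", "/ Outbound"),
    ad & has("TPrev", "thanks"),
    ad & has("TPrev", "/myaccount/listing-classified Outbound"),
    ad & has("TPrev", "/s/"),
    ad & has("TPrev", "/myaccount/listing-classified nan"),
    ad & has("TPrev", "/sell nan nan")
       & has("Prev", "/myaccount/listing-classified nan nan"),
    ad & has("TPrev", "/listing/"),
    ad & has("TPrev", "/search/"),
]
choices = [
    "Classifieds Homepage Button",
    "Header after Post",
    "My Listings Button",
    "SRP",
    "My Listings Button",
    "My Listings Header",
    "Detail Page Header",
    "SRP",
]
df2["Method"] = np.select(conditions, choices, default="Ignore")
print(df2[["Full", "TPrev", "Method"]])
```

Because np.select evaluates the conditions in list order, keeping them in the same order as the original elif chain preserves its precedence.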

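Your logical statements evaluate to a single truth value for the whole DataFrame, so you can only ever set the entire column to one of the values. Instead, you should use np.select to apply the conditional logic to each row. You may also want to switch your statements to df2.Full.astype(str).str.contains('ad posted'), since those return boolean Series.

As the comments above say, every branch that executes overwrites the whole of df2['Method']. Besides np.select, you could first create an empty df2['Method'] column and then fill it in with a loop over your conditions.

Haven't you read the pandas documentation?

Does this answer your question?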
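The points made in these comments can be checked with a small sketch. The sample rows and the `rules` list are hypothetical, and the loop is only one possible reading of "fill it in with a loop"; it uses boolean masks with `.loc` and walks the rules in reverse so that earlier rules take precedence, as they would in an elif chain.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df2
df2 = pd.DataFrame({
    "Full": ["ad posted", "page view", "ad posted"],
    "TPrev": ["/thanks nan", "/ Outbound", "/search/x nan"],
}).astype(str)

# `in` on a Series tests the index labels (0, 1, 2 here), not the values,
# which is why every if/elif test in the question was False
in_values = "ad posted" in df2["Full"]
in_index = 0 in df2["Full"]

# str.contains returns one boolean per row instead
mask = df2["Full"].str.contains("ad posted", regex=False)

# Alternative to np.select: start from the default, then fill by mask.
# Hypothetical (substring, label) pairs, highest priority first.
rules = [("thanks", "Header after Post"), ("/search/", "SRP")]
df2["Method"] = "Ignore"
for needle, label in reversed(rules):
    hit = mask & df2["TPrev"].str.contains(needle, regex=False)
    df2.loc[hit, "Method"] = label  # only the matching rows are overwritten

print(in_values, in_index)
print(df2)
```

Assigning through `df2.loc[hit, "Method"]` changes only the rows where the mask is True, unlike `df2['Method'] = ...`, which replaces the whole column.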