Python: counting across multiple columns using a wildcard

python pandas


I want to replicate a data set from Excel.

My Python code looks like this:

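from functools import reduce
import pandas as pd
data_frames = [df_mainstore, df_store_A, df_store_B]
# outer-merge the frames one after another on Id_number
df_merged = reduce(lambda left, right: pd.merge(left, right, on=["Id_number"], how='outer'), data_frames)
print(df_merged)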
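For readers who want to run the merge, here is a minimal sketch. The three sample frames and their contact column names are made up for illustration; only the Id_number key and the reduce/merge call come from the question. It also shows that an outer merge leaves NaN wherever an id is missing from one of the frames, which matters later when testing the contact columns.

```python
from functools import reduce

import pandas as pd

# Hypothetical sample frames; only the Id_number key comes from the question.
df_mainstore = pd.DataFrame({"Id_number": [1, 2, 3],
                             "Main Contact": ["a@b.cc", "", "x"]})
df_store_A = pd.DataFrame({"Id_number": [2, 3, 4],
                           "Store Contact A": ["", "c@d.ee", "y"]})
df_store_B = pd.DataFrame({"Id_number": [1, 4],
                           "Store B Contact": ["z", "f@g.hh"]})

data_frames = [df_mainstore, df_store_A, df_store_B]

# reduce folds the list pairwise: merge(merge(mainstore, A), B)
df_merged = reduce(
    lambda left, right: pd.merge(left, right, on="Id_number", how="outer"),
    data_frames,
)

print(df_merged)
# Cells with no matching id in a given frame are NaN after the outer merge.
print(df_merged.isna().sum())
```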

Because I merge several data frames, the number of columns and their names can vary, so writing out every column by hand would be tedious.

I also have trouble understanding the expression: isY = lambda x: int(x == '@')
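A short sketch of what that expression does, assuming it reads `int(x == '@')` (the parentheses appear to have been lost on this page). It compares the whole value with the string '@' and converts the resulting boolean to 1 or 0, so it does not test whether '@' occurs somewhere inside a longer string. The helper `contains_at` is a hypothetical name added here only for contrast.

```python
# Assumed reconstruction of the expression from the question.
isY = lambda x: int(x == '@')

print(isY('@'))       # the value is exactly '@'
print(isY('a@b.cc'))  # the value merely contains '@'


def contains_at(x):
    """Hypothetical helper: 1 if '@' occurs anywhere in the value, else 0."""
    return int('@' in str(x))


print(contains_at('a@b.cc'))
print(contains_at(''))
```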

How can I add the Contact has Email column in a way similar to Excel?

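You can use filter to select the columns that have Contact in their name, then use str.contains with the right pattern, and finally, since you want to know whether any contact matches per row, use any along the columns: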

import pandas as pd

# data sample
df_merged = pd.DataFrame({'id': [0, 1, 2, 3],
                          'Store A': list('abcd'),
                          'Store Contact A': ['aa@bb.cc', '', 'e', 'f'],
                          'Store B': list('ghij'),
                          'Store B Contact': ['kk@ll.m', '', 'nn@ooo.pp', '']})

# define the pattern as in the link
pat = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
# create the column as wanted; axis must be passed by keyword in recent pandas
df_merged['Contact has Email'] = df_merged.filter(like='Contact')\
                                          .apply(lambda x: x.str.contains(pat))\
                                          .any(axis=1)

print(df_merged)
   id Store A Store Contact A Store B Store B Contact  Contact has Email
0   0       a        aa@bb.cc       g         kk@ll.m               True
1   1       b                       h                              False
2   2       c               e       i       nn@ooo.pp               True
3   3       d               f       j                              False
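If a count is wanted rather than a True/False flag (closer to Excel's COUNTIF with a wildcard), the same boolean table can be summed instead of reduced with any. This is only a sketch under assumptions: the column name Contact Email Count and the variable names are invented here, and na=False is added so that NaN cells count as "no email".

```python
import pandas as pd

df_merged = pd.DataFrame({'id': [0, 1, 2, 3],
                          'Store A': list('abcd'),
                          'Store Contact A': ['aa@bb.cc', '', 'e', 'f'],
                          'Store B': list('ghij'),
                          'Store B Contact': ['kk@ll.m', '', 'nn@ooo.pp', '']})

pat = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"

# One boolean column per contact column: True where the cell looks like an email.
# Build this BEFORE adding any new column whose name also contains "Contact",
# otherwise filter(like='Contact') would pick that new column up as well.
email_flags = (df_merged.filter(like='Contact')
                        .apply(lambda col: col.str.contains(pat, na=False)))

# Per-row count of contact cells holding an email (hypothetical column name).
df_merged['Contact Email Count'] = email_flags.sum(axis=1)

# Grand total over all contact columns.
total = int(email_flags.to_numpy().sum())

print(df_merged)
print(total)
```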

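You can also check the two contact columns directly with str.contains('@', na=False) and combine the results with |.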

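Thank you so much!!! To make it work 100% I had to adjust things so that every word Contact starts with a capital C. I also used the regex [a-zA-Z0-9-ȫ]+@[a-zA-Z0-9-ȫ]+ to make it more dynamic. I really appreciate all the help and input :)

@Wizhi if you don't always have a capital C, you can also use filter(regex='Contact|contact') as an example, but I'm sure there are more flexible ways. Glad it helped :)

@Wizhi you can try df_merged['Store Contact A'].str.extract(pat), where pat is slightly different: pat = r'([a-zA-Z0-9-\.]+@[a-zA-Z0-9-\.]+)'. Be careful with the regex you wrote; if you want a working solution, a new question might be better.

Thank you for your answer!! I will try to take your suggestion into account. If I can't solve it, I'll ask a new question. Thanks again!! :)

@Wizhi not sure how you want the result, but df_merged.filter(like='Contact').apply(lambda x: x.str.extract(pat)[0]).agg(list, axis=1) gives a list of all emails per row.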
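The two ideas from the comments can be put together roughly as follows. This is a sketch, not the commenters' exact code: the sample data (with one lowercase "contact" column) is made up, the (?i) flag is one possible way to make the column-name match case-insensitive, and the email pattern is a deliberately loose one with a single capture group, which str.extract requires. Empty results are dropped here with dropna, which differs from the agg(list, axis=1) variant that keeps NaN entries.

```python
import pandas as pd

# Hypothetical sample; note the lowercase "contact" in the first column name.
df_merged = pd.DataFrame({
    'id': [0, 1, 2],
    'Store contact A': ['aa@bb.cc', '', 'e'],
    'Store B Contact': ['kk@ll.m', '', 'nn@ooo.pp'],
})

# (?i) makes the regex on the column names case-insensitive.
contact_cols = df_merged.filter(regex=r'(?i)contact')

# Loose email pattern (an assumption); the parentheses form the capture group.
pat = r'([a-zA-Z0-9_.+-]+@[a-zA-Z0-9.-]+)'

# One column of extracted emails (or NaN) per contact column.
emails = contact_cols.apply(lambda col: col.str.extract(pat)[0])

# Collect the non-missing emails of each row into a plain list.
emails_per_row = [row.dropna().tolist() for _, row in emails.iterrows()]

print(emails_per_row)
```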

df_merged['Contact has Email'] = (
    df_merged['Store Contact A'].str.contains('@', na=False)
    | df_merged['Store B Contact'].str.contains('@', na=False)
)
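The na=False argument matters after an outer merge, because ids missing from one store leave NaN in that store's contact column. Below is a small sketch of the same '@' check written for any number of contact columns; the sample data and the names contact_cols and has_at are assumptions for illustration, and regex=False simply treats '@' as a literal string.

```python
import numpy as np
import pandas as pd

# Hypothetical merged frame with NaN where a store had no matching id.
df_merged = pd.DataFrame({
    'id': [0, 1, 2],
    'Store Contact A': ['aa@bb.cc', np.nan, 'e'],
    'Store B Contact': [np.nan, np.nan, 'nn@ooo.pp'],
})

# Pick the contact columns by name instead of listing them by hand.
contact_cols = [c for c in df_merged.columns if 'Contact' in c]

# na=False turns missing cells into False instead of propagating NaN.
has_at = pd.concat(
    [df_merged[c].str.contains('@', na=False, regex=False) for c in contact_cols],
    axis=1,
)

df_merged['Contact has Email'] = has_at.any(axis=1)

print(df_merged)
```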