Python pandas: check for any unmatched records
I am trying to match records between two DataFrames under the following conditions:
If df1['Cntr_No'] == df2['Cntr_No']
    check if df1['Total_Amount'] == df2['Total_Amount']
    else check if df1['Total_Amount'] == df2['Amount_2'] or df1['Total_Amount'] == df2['Amount_3']
If there is a match, create a new column "Match" with the value "Yes", or "No" if unmatched.
Sample data:
import pandas as pd

df1 = pd.DataFrame({'Cntr_No': ['HLBU 1234567'], 'Total_Amount': 100})
df2 = pd.DataFrame({'Cntr_No': ['HLBU 1234567'], 'Total_Amount': 50, 'Amount_2': 40, 'Amount_3': 100})
Sample output in one row:
df1: HLBU 1234567 | df1: Total Amount: 100 | df2: HLBU 1234567 | df2: Total Amount: 50 | df2: Amount 2: 40 | df2: Amount 3: 100 | Matched
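One approach is to build a dictionary mapping and then use a list comprehension:
cols = ['Amount_2', 'Amount_3', 'Total_Amount']
# Map each container number in df2 to the set of its amounts
d = {k: set(v.values()) for k, v in
     df2.set_index('Cntr_No')[cols].to_dict(orient='index').items()}
# For each df1 row, test whether its amount is in the set for the same container
df1['Check'] = [j in d.get(i, set()) for i, j in zip(df1['Cntr_No'], df1['Total_Amount'])]
df1['Check'] = df1['Check'].map({True: 'Match', False: 'No'})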
Result:
        Cntr_No  Total_Amount  Check
0  HLBU 1234567           100  Match
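The question also asks for the df1 and df2 data side by side in one row. A minimal sketch of one way to get that, assuming `Cntr_No` is unique in df2, is a left merge followed by a row-wise comparison. The extra rows, the `_df1`/`_df2` suffixes, and the names `merged`, `amount_cols`, and `hit` are illustrative and not part of the original post.

```python
import pandas as pd

# Sample data extended with two hypothetical rows for illustration
df1 = pd.DataFrame({'Cntr_No': ['HLBU 1234567', 'ABCD 7654321', 'ZZZZ 0000000'],
                    'Total_Amount': [100, 75, 100]})
df2 = pd.DataFrame({'Cntr_No': ['HLBU 1234567', 'ABCD 7654321'],
                    'Total_Amount': [50, 60],
                    'Amount_2': [40, 10],
                    'Amount_3': [100, 20]})

# Left-join on the container number; the suffixes tell the two
# Total_Amount columns apart.
merged = df1.merge(df2, on='Cntr_No', how='left', suffixes=('_df1', '_df2'))

amount_cols = ['Total_Amount_df2', 'Amount_2', 'Amount_3']
# Compare the df1 amount with each df2 amount column, row by row.
# Containers missing from df2 have NaN here and compare as False.
hit = merged[amount_cols].eq(merged['Total_Amount_df1'], axis=0).any(axis=1)
merged['Match'] = hit.map({True: 'Match', False: 'No'})

print(merged)
```

Each row of `merged` then carries the df1 values, the df2 values for the same container, and the `Match` flag in the last column.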
I think using isin is straightforward:
In [504]: df2['Check'] = ((df1.Cntr_No.isin(df2.Cntr_No))&((df1.Total_Amount.isin(df2.Amount_2))|(df1.Total_Amount.isin(df2.Amount_3))|(df1.Total_Amount.isin(df2.Total_Amount)))).map({True:'Match',False:'No'})
In [505]: df2
Out[505]:
Amount_2 Amount_3 Cntr_No Total_Amount Check
0 40 100 HLBU 1234567 50 Match
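One thing to keep in mind with this approach: `isin` tests membership against the whole column, not against the row with the same `Cntr_No`. With a single row the two are equivalent, but with several containers an amount belonging to a different container can produce a false match. The sketch below uses made-up data to show the difference. The names `isin_mask`, `aligned`, and `per_key` are illustrative, and `Cntr_No` is assumed to be unique in df2.

```python
import pandas as pd

# Hypothetical data: 100 appears in df2, but only under container BBBB
df1 = pd.DataFrame({'Cntr_No': ['AAAA 1111111', 'BBBB 2222222'],
                    'Total_Amount': [100, 60]})
df2 = pd.DataFrame({'Cntr_No': ['AAAA 1111111', 'BBBB 2222222'],
                    'Total_Amount': [50, 60],
                    'Amount_2': [40, 10],
                    'Amount_3': [30, 100]})

cols = ['Total_Amount', 'Amount_2', 'Amount_3']

# Column-wide membership, as in the isin answer
isin_mask = df1.Cntr_No.isin(df2.Cntr_No) & (
    df1.Total_Amount.isin(df2.Amount_2)
    | df1.Total_Amount.isin(df2.Amount_3)
    | df1.Total_Amount.isin(df2.Total_Amount)
)

# Per-container comparison: line up df2's amounts with df1's rows by Cntr_No
aligned = df2.set_index('Cntr_No')[cols].reindex(df1['Cntr_No'])
per_key = (aligned.to_numpy() == df1['Total_Amount'].to_numpy()[:, None]).any(axis=1)

print(isin_mask.tolist())  # [True, True]
print(per_key.tolist())    # [False, True]
```

The first container is flagged by `isin` only because 100 occurs under the second container, so the per-container comparison is the safer choice when the frames hold more than one row.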
Comments:
What is your desired output?
@zipa I updated the post with a sample of the desired output: all df1 and df2 data in one row, with "Match" or "No match" in the last column.
Great answer! Could the result also list the df2 Cntr_No and the matched/unmatched amounts?
Probably possible, yes, but I think that is a separate question.
Thanks, this works well too, and the code is easier to understand at my level.