Python: how do I match and merge two dataframes whose values are completely different except for one word?


I have a dataframe ABC with the following values:

    0                     | 1       | 2
0   sun is rising         | UNKNOWN | 1465465
1   micheal has arrived   | UNKNOWN | 324654
2   goal has been scored  | UNKNOWN | 547854

and another dataframe XYZ with these values:

    0        | 1
0   sun      | password1
1   goal     | password2
2   micheal  | password3
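
How can I map XYZ onto ABC through the shared word (sun, goal or micheal), so that UNKNOWN in column 1 of ABC is replaced by the matching password, for example password1 for the sun row?

The output I need: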

    0                     | 1         | 2
0   sun is rising         | password1 | 1465465
1   micheal has arrived   | password3 | 324654
2   goal has been scored  | password2 | 547854
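
To make the question reproducible, the two frames can be rebuilt as below. This is only a sketch: it assumes the default integer column labels (0, 1, 2) shown in the tables, which the later comments suggest may not match the asker's real data.

```python
import pandas as pd

# Rebuild the two frames from the question, with default integer column labels.
ABC = pd.DataFrame([
    ["sun is rising", "UNKNOWN", 1465465],
    ["micheal has arrived", "UNKNOWN", 324654],
    ["goal has been scored", "UNKNOWN", 547854],
])
XYZ = pd.DataFrame([
    ["sun", "password1"],
    ["goal", "password2"],
    ["micheal", "password3"],
])

print(ABC)
print(XYZ)
```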

Here is a way to pick the matching passwords between the two dataframes using str.contains and boolean indexing:

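from itertools import chain
# For each key in xyz[0], flag the abc rows whose text contains it,
# then use that boolean mask to select from xyz[1].
# Assumes abc and xyz share the same index and every key matches exactly one row.
abc.loc[:, 1] = list(chain(*[xyz.loc[abc[0].str.contains(i), 1] for i in xyz[0]]))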

         0                  1         2
0  sun is rising         password1  1465465
1  goal has been scored  password2   324654
2  micheal has arrived   password3   547854

Create a dictionary and match the first value with get and next:

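# Map each key word in XYZ[0] to its password in XYZ[1]
d = dict(zip(XYZ[0], XYZ[1]))
# For each sentence, take the password of the first word that is a key in d
ABC[1] = [next(d.get(y) for y in x.split() if y in d) for x in ABC[0]]
print(ABC)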
                      0          1        2
0         sun is rising  password1  1465465
1   micheal has arrived  password3   547854
2  goal has been scored  password2   324654
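
With the dictionary approach, `next` raises StopIteration for any row in which no word is a key of `d`. A hedged variant, assuming that unmatched rows should simply keep their existing value, passes a default to `next`. The row "rain has stopped" is hypothetical and was added only to show the fallback.

```python
import pandas as pd

ABC = pd.DataFrame([
    ["sun is rising", "UNKNOWN", 1465465],
    ["micheal has arrived", "UNKNOWN", 324654],
    ["rain has stopped", "UNKNOWN", 111111],  # hypothetical row with no key in XYZ
])
XYZ = pd.DataFrame([
    ["sun", "password1"],
    ["goal", "password2"],
    ["micheal", "password3"],
])

d = dict(zip(XYZ[0], XYZ[1]))

# next(..., old) falls back to the current value when no word of the sentence is a key
ABC[1] = [
    next((d[w] for w in str(text).split() if w in d), old)
    for text, old in zip(ABC[0], ABC[1])
]
print(ABC)
```

Because the list always has one entry per row of ABC, the assignment cannot fail with a length mismatch.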


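So you only want to replace the values in column 1 with the actual passwords?

Edited. I want password1 to appear in column 1 of ABC for the row "sun is rising"; the desired output is now shown.

Yes. Edited to clarify, @nisheetpatel.

Should the rows be mapped simply by their order, or is it necessary to search all possible rows for a match? I ask because of what my solution assumes.

Edited, please check. It is not mapped by order; it is mapped by the value in column 0 of XYZ.

Change 0 to the actual column name in your dataframe.

ValueError: Length of values does not match length of index

Should I open a new question about this?

@Hukkemaru - I think the problem is in the data. So you need to create a new question, or change the data in this question so that it raises the error... because the data in the question works fine.

Getting this error: NameError: name 'chain' is not defined

from itertools import chain

ValueError: Length of values does not match length of index

Getting this error.. IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match)

@hukkemaaru - what do print(ABC.columns) and print(XYZ.columns) show?

ValueError: Length of values does not match length of index. print(ABC.columns) gives Int64Index([0, 1], dtype='int64')

@Hukkemaru, your dataframe shows that you also have a column 2, but there appear to be only 2 columns. Please try ABC[0].astype(str).str.extract(pat, expand=False).map(d)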
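
The last suggestion in the comments leaves `pat` undefined. One possible reading, offered as a sketch and not necessarily what the commenter had in mind, is to build `pat` as an alternation of the keys in XYZ[0] and to keep the old value wherever nothing matches. The whole-word `\b` boundaries and the `fillna` fallback are assumptions.

```python
import re
import pandas as pd

ABC = pd.DataFrame([
    ["sun is rising", "UNKNOWN", 1465465],
    ["micheal has arrived", "UNKNOWN", 324654],
    ["goal has been scored", "UNKNOWN", 547854],
])
XYZ = pd.DataFrame([
    ["sun", "password1"],
    ["goal", "password2"],
    ["micheal", "password3"],
])

d = dict(zip(XYZ[0], XYZ[1]))

# Assumed pattern: any XYZ key as a whole word; re.escape guards regex metacharacters
pat = r"\b(" + "|".join(re.escape(k) for k in d) + r")\b"

# extract returns the first matching key per row (NaN if none); map looks up its password
matched = ABC[0].astype(str).str.extract(pat, expand=False).map(d)

# Keep the existing value wherever no key was found
ABC[1] = matched.fillna(ABC[1])
print(ABC)
```

Since the result is a Series aligned on ABC's index, this does not depend on the row order of XYZ or on both frames having the same length.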