Python pandas: fill an empty column in a dataframe with values from multiple dataframes, based on matching values in one column
I have a large dataframe with two columns but many rows, so this is just a sample:
df1 = {"text":["see you in five minutes.", "she is my friend.", "she goes to school in five minutes.","he is my friend.","that is right.","sky is blue.","sky is yellow."],
"goal":[" "," "," "," "," "," "," "]}
I also have three other dataframes of different sizes, each containing some of the rows found in the text column of df1:
df2= {"text":["see you in five minutes.", "he is my friend."],
"second":["num","friend"]}
df3 = {"text":["she goes to school in five minutes.","she is my friend.","that is right."],
"third":["num","friend","correct"]}
df4 = {"text":["sky is blue.","sky is yellow."],
"fourth":["color","color"]}
What I want to do is merge the second, third and fourth columns into df1, so that they fill the empty goal column in df1.
Desired output:
df1 = {"text":["see you in five minutes.", "she is my friend.", "she goes to school in five minutes.","he is my friend.","that is right.","sky is blue.","sky is yellow."],
"goal":["num","friend","num","friend","correct","color","color"]}
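I tried doing multiple left merges, one per dataframe, but the output ends up in separate columns. Is there a way to do this in one step and put the result in the goal column? Thanks.
Use pd.concat to concatenate the dataframes df2, df3 and df4 into a mapping series m, then use Series.map with this mapping series to map the text column in df1: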
m = pd.concat([df.set_index('text').iloc[:, 0] for df in (df2, df3, df4)])
df1['goal'] = df1['text'].map(m)
Result:
# print(df1)
                                  text     goal
0             see you in five minutes.      num
1                    she is my friend.   friend
2  she goes to school in five minutes.      num
3                     he is my friend.   friend
4                       that is right.  correct
5                         sky is blue.    color
6                       sky is yellow.    color
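For completeness, here is a self-contained sketch of the approach above. The snippets in the question define plain dicts, so this version assumes each one is wrapped in pd.DataFrame first. The duplicate-dropping step and the name lookups are additions for illustration, not part of the original answer: Series.map can fail when the mapping series has a non-unique index, which would happen if the same text appeared in more than one of the lookup dataframes.

```python
import pandas as pd

# Assumption: the dicts from the question are wrapped in pd.DataFrame.
df1 = pd.DataFrame({
    "text": ["see you in five minutes.", "she is my friend.",
             "she goes to school in five minutes.", "he is my friend.",
             "that is right.", "sky is blue.", "sky is yellow."],
    "goal": [" "] * 7,
})
df2 = pd.DataFrame({"text": ["see you in five minutes.", "he is my friend."],
                    "second": ["num", "friend"]})
df3 = pd.DataFrame({"text": ["she goes to school in five minutes.",
                             "she is my friend.", "that is right."],
                    "third": ["num", "friend", "correct"]})
df4 = pd.DataFrame({"text": ["sky is blue.", "sky is yellow."],
                    "fourth": ["color", "color"]})

lookups = (df2, df3, df4)  # hypothetical name, used only in this sketch

# Index each lookup frame by 'text' and take its single remaining column,
# then stack the pieces into one Series: text -> label.
m = pd.concat([df.set_index("text").iloc[:, 0] for df in lookups])

# Optional safeguard (not in the original answer): keep only the first
# label for any text that occurs more than once, so the index is unique.
m = m[~m.index.duplicated(keep="first")]

# Look up each text in the mapping; texts with no match become NaN.
df1["goal"] = df1["text"].map(m)

print(df1)
```

Any text in df1 that is missing from all three lookup dataframes ends up as NaN in goal; chaining .fillna(" ") after .map(m) would restore the original blank placeholder if that is preferred.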