Python: compare two columns across two pandas DataFrames and, if they match, get the value of another column
I have two DataFrames like this:
df1
Entry Sequence
0 A0A024QZ18 MSGLEMADHMMAMNHGRFPDGTNGLHHHPAHRMGMGQFPSPHHHQQ
1 A0A024QZ42 MAALSGGGGGGAEPGQALFNGDMEPEAGAGAGAAASSAADPAIPf
2 A0A024QZB8 MLWWEEVEDCYEREDVQKKTFTKWVNAQFSKFGKQHIENLFSDLQD
3 A0A024QZP7 MARFGDEMPARYGGGGSGAAAGVVVGSGGGRGAGGSRQGGQPGAQR
4 A0A024QZX5 MRPDRAEAPGPPAMAAGGPGAGSAAPVSSTSSLPLAALNMRVRRRL
5 A0A024QZ33 MNSPGGRGKKKGSGGASNPVPPRPPPPCLAPAPPAAGPAPPPESPH
df2
Seq_id number
0 A0A024QZ18 67
1 A0A024QZ33 45
2 A0A024QZ42 252
3 A0A024QZB8 35
4 A0A024QZP7 34
5 A0A024QZX5 54
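I want to check which Entry values in df1 are present in Seq_id in df2. If an entry is present, I want to add the Sequence from df1 as a new column in df2 for the matching id. If it is not present, print "nan".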
Example answer:
Seq_id number Sequence
0 A0A024QZ18 67 MSGLEMADHMMAMNHGRFPDGTNGLHHHPAHRMGMGQFPSPHHHQQ
1 A0A024QZ33 45 MNSPGGRGKKKGSGGASNPVPPRPPP
2 A0A024QZ42 252 MAALSGGGGGGAEPGQALFNGDMEPEAG
3 A0A024QZB8 35 MLWWEEVEDCYEREDVQKKTFTKWVNAQFSKFGKQHIENLFSDLQD...
4 A0A024QZP7 34 MARFGDEMPARYGGGGSGAAAGVVVGSGG
5 A0A024QZX5 54 MRPDRAEAPGPPAMAAGGPGAGSAAPVSS
I tried checking whether they are in the column with the following:
df2.Seq_id.isin(df1.Entry)
But I don't know how to print the other column where they match and give nan where they don't.
I think a simple left join will do what you need:
df1.merge(df2, how='left', left_on='Entry', right_on='Seq_id')
This will give you the output:
Entry Sequence Seq_id number
A0A024QZ18 MSGLEMADHMMAMNHGRFPDGTNGLHHHPAHRMGMGQFPSPHHHQQ A0A024QZ18 67
A0A024QZ42 MAALSGGGGGGAEPGQALFNGDMEPEAGAGAGAAASSAADPAIPf A0A024QZ42 252
A0A024QZB8 MLWWEEVEDCYEREDVQKKTFTKWVNAQFSKFGKQHIENLFSDLQD A0A024QZB8 35
A0A024QZP7 MARFGDEMPARYGGGGSGAAAGVVVGSGGGRGAGGSRQGGQPGAQR A0A024QZP7 34
A0A024QZX5 MRPDRAEAPGPPAMAAGGPGAGSAAPVSSTSSLPLAALNMRVRRRL A0A024QZX5 54
A0A024QZ33 MNSPGGRGKKKGSGGASNPVPPRPPPPCLAPAPPAAGPAPPPESPH A0A024QZ33 45
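The expected output in the question starts from df2 (Seq_id, number, then Sequence), so it may be more convenient to merge in the other direction. The following is only a sketch under that assumption: the sequences are shortened placeholders, and renaming Entry to Seq_id is one possible way to avoid ending up with two key columns.

```python
import pandas as pd

# Small stand-ins for the question's frames; sequences are shortened placeholders.
df1 = pd.DataFrame({
    "Entry": ["A0A024QZ18", "A0A024QZ42", "A0A024QZ33"],
    "Sequence": ["MSGLEMADHM", "MAALSGGGGG", "MNSPGGRGKK"],
})
df2 = pd.DataFrame({
    "Seq_id": ["A0A024QZ18", "A0A024QZ33", "A0A024QZ42"],
    "number": [67, 45, 252],
})

# Keep every row of df2 (left side) and pull Sequence from df1.
# Renaming Entry to Seq_id lets us merge on a single shared key column.
result = df2.merge(
    df1.rename(columns={"Entry": "Seq_id"}),
    on="Seq_id",
    how="left",
)
print(result)
```

With how='left', the row order of df2 is preserved and the result has the columns Seq_id, number, Sequence.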
What happens when there is no match?
@Luiggi I just want it to be "nan".
Thank you for your help, this is great. Sorry, my DataFrames are large and I only showed a small part here; there will be rows without a match. I want "nan" for those.
If there are unmatched rows, they will automatically be filled with NaN.
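To illustrate the point about unmatched rows, here is a small sketch. The id "X00000" is a made-up value that does not occur in df1, and the sequences are shortened placeholders. The indicator=True argument of merge is optional; it adds a _merge column that makes the unmatched rows easy to find.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Entry": ["A0A024QZ18", "A0A024QZ42"],
    "Sequence": ["MSGLEMADHM", "MAALSGGGGG"],  # shortened placeholder sequences
})
df2 = pd.DataFrame({
    "Seq_id": ["A0A024QZ18", "X00000"],  # "X00000" is a hypothetical id missing from df1
    "number": [67, 1],
})

# Left join from df2: rows with no partner in df1 get NaN in the df1 columns.
merged = df2.merge(
    df1, how="left", left_on="Seq_id", right_on="Entry", indicator=True
)

# _merge is "both" for matched rows and "left_only" for unmatched ones.
unmatched = merged[merged["_merge"] == "left_only"]
print(merged)
print(unmatched["Seq_id"].tolist())
```

The row for "X00000" has NaN in both Entry and Sequence, so no extra step is needed to fill in the missing values.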