Python: how to flag multiple DataFrame rows based on column values
I have a dataframe that looks like this:
ID  Reviews               Sorted  pairwise          scores
A   This is great         0       [(0, 1)]          [0.26386763883335373]
A   works well            1       []                []
B   can this be changed   0       [(0, 1), (0, 2)]  [0.1179287227608669, 0.36815020951152794]
B   how to perform that   1       [(1, 2)]          [0.03299057711398918]
B   summarize it          2       []                []
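Sorted is the order of the duplicate entries within each ID. pairwise holds the pairwise combinations, grouped by ID. I used those pairwise combinations to obtain the scores column. Now I need to create a flag column that marks "yes" according to the pairwise column whenever a score is > 0.15. For example, when grouping by ID, ID B has one score > 0.15 (0.36), and the matching pairwise entry is (0, 2), i.e. rows 0 and 2 should be flagged "yes".
My expected output is: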
ID  Reviews               Sorted  pairwise          scores                                      Flag
A   This is great         0       [(0, 1)]          [0.26386763883335373]                       yes
A   works well            1       []                []                                          yes
B   can this be changed   0       [(0, 1), (0, 2)]  [0.1179287227608669, 0.36815020951152794]   yes
B   how to perform that   1       [(1, 2)]          [0.03299057711398918]                       No
B   summarize it          2       []                []                                          yes
I tried using np.where on the scores, but it did not work for me.
Can anyone suggest a solution or an idea?
Thanks in advance.

We can use explode, then merge the result back:
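# Explode the score lists: one score per row (empty lists become NaN)
s = df['scores'].explode()
# Keep only the pairs whose score exceeds 0.15, then explode each pair into its row positions
s = df.set_index('ID')['pairwise'].explode()[(s > 0.15).values].explode()
# Merge the flagged (ID, Sorted) combinations back onto the original frame
df = df.merge(s.to_frame('Sorted').reset_index().assign(flag='Yes'), how='left')
# Rows that matched no flagged pair get 'No' (assign back instead of a chained inplace fillna)
df['flag'] = df['flag'].fillna('No')
df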
   scores                                      pairwise          Sorted  ID  flag
0  [0.26386763883335373]                       [(0, 1)]          0       A   Yes
1  []                                          []                1       A   Yes
2  [0.1179287227608669, 0.36815020951152794]   [(0, 1), (0, 2)]  0       B   Yes
3  [0.03299057711398918]                       [(1, 2)]          1       B   No
4  []                                          []                2       B   Yes
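
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same explode-then-merge idea, built on the sample data from the question. It assumes each row's pairwise and scores lists have the same length. The helper name add_flag and the THRESHOLD constant are made up for illustration. It also adds a drop_duplicates step, which the snippet above does not have, so that a row taking part in more than one high-scoring pair is not repeated by the merge.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": ["A", "A", "B", "B", "B"],
    "Reviews": ["This is great", "works well", "can this be changed",
                "how to perform that", "summarize it"],
    "Sorted": [0, 1, 0, 1, 2],
    "pairwise": [[(0, 1)], [], [(0, 1), (0, 2)], [(1, 2)], []],
    "scores": [[0.26386763883335373], [],
               [0.1179287227608669, 0.36815020951152794],
               [0.03299057711398918], []],
})

THRESHOLD = 0.15  # cut-off taken from the question


def add_flag(frame, threshold=THRESHOLD):
    """Hypothetical helper: flag rows that belong to a pair scoring above threshold."""
    # One score per row; empty lists become NaN, which never passes the comparison
    scores = frame["scores"].explode().astype(float)
    # One pair per row, indexed by ID; same length as scores (assumed, see above)
    pairs = frame.set_index("ID")["pairwise"].explode()
    keep = (scores > threshold).to_numpy()
    hits = (
        pairs[keep]
        .explode()                  # split each kept pair into its two row positions
        .rename("Sorted")
        .reset_index()              # columns: ID, Sorted
        .drop_duplicates()          # avoid duplicated rows after the merge
        .astype({"Sorted": frame["Sorted"].dtype})
        .assign(flag="Yes")
    )
    out = frame.merge(hits, on=["ID", "Sorted"], how="left")
    out["flag"] = out["flag"].fillna("No")
    return out


result = add_flag(df)
print(result[["ID", "Sorted", "flag"]])
```

Passing on=["ID", "Sorted"] explicitly keeps the merge from silently picking up other shared column names, which seemed like the safer choice here.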
I have updated my answer, please check the update.
This edit makes more sense to me. Thank you very much. The only issue I noticed is that if the scores above ...
@gamyanaidu Just redo it, changing s>0.15 to s...