Python: How to flag multiple DataFrame rows based on column values


I have a DataFrame that looks like this:

ID Reviews              Sorted  pairwise         scores
A   This is great         0     [(0, 1)]         [0.26386763883335373]
A   works well            1     []               []
B   can this be changed   0     [(0, 1), (0, 2)] [0.1179287227608669, 0.36815020951152794]
B   how to perform that   1     [(1, 2)]         [0.03299057711398918]
B   summarize it          2     []               []
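Sorted is the position of each row among the rows that share the same ID. pairwise holds the pairwise combinations of those positions, grouped by ID, and I used these combinations to compute the scores column. Now I need to create a Flag column that marks "yes", based on the pairwise column, wherever the score is > 0.15. For example, within group B the only score > 0.15 is 0.36, and the matching entry in the pairwise column is (0, 2), i.e. rows 0 and 2 should be flagged "yes".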
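
The rule described above can be sketched in plain Python for a single ID group. This is only an illustration of the intended logic: the helper name flagged_positions and its threshold parameter are assumptions made for the example and do not appear in the original post.

```python
def flagged_positions(pairwise, scores, threshold=0.15):
    """Collect every Sorted position that occurs in a pair scoring above threshold.

    pairwise and scores hold one list per row, as in the example table.
    (Hypothetical helper, for illustration only.)
    """
    flagged = set()
    for pairs, pair_scores in zip(pairwise, scores):
        for pair, score in zip(pairs, pair_scores):
            if score > threshold:
                flagged.update(pair)  # both members of the pair get flagged
    return flagged


# Group B from the example table.
pairwise_b = [[(0, 1), (0, 2)], [(1, 2)], []]
scores_b = [[0.1179287227608669, 0.36815020951152794], [0.03299057711398918], []]

hits_b = flagged_positions(pairwise_b, scores_b)
flags_b = ["yes" if position in hits_b else "No" for position in range(3)]
print(flags_b)  # ['yes', 'No', 'yes']
```

Only the pair (0, 2) exceeds 0.15 in group B, so positions 0 and 2 are flagged and position 1 is not.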

My expected output is:

ID Reviews              Sorted  pairwise         scores                                    Flag
A   This is great         0     [(0, 1)]         [0.26386763883335373]                      yes
A   works well            1     []               []                                         yes
B   can this be changed   0     [(0, 1), (0, 2)] [0.1179287227608669, 0.36815020951152794]  yes
B   how to perform that   1     [(1, 2)]         [0.03299057711398918]                      No
B   summarize it          2     []               []                                         yes
I tried using np.where on the scores, but it did not work for me.

Can anyone suggest a solution or an idea?
Thanks in advance.

We explode the list columns, then merge the result back:

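s = df.scores.explode()  # one score per row; empty lists become NaN
# keep only the pairs scoring above 0.15, then split each pair into its two positions
s = df.set_index('ID').pairwise.explode()[(s > 0.15).values].explode()
# drop repeated (ID, Sorted) keys so the left merge cannot duplicate rows
keys = s.to_frame('Sorted').reset_index().drop_duplicates().assign(flag='Yes')
df = df.merge(keys, how='left')  # merges on the shared columns ID and Sorted
df['flag'] = df['flag'].fillna('No')  # rows that are in no qualifying pair
df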
                                      scores          pairwise Sorted ID flag
0                      [0.26386763883335373]          [(0, 1)]      0  A  Yes
1                                         []                []      1  A  Yes
2  [0.1179287227608669, 0.36815020951152794]  [(0, 1), (0, 2)]      0  B  Yes
3                      [0.03299057711398918]          [(1, 2)]      1  B   No
4                                         []                []      2  B  Yes
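
For readers who want to run the idea end to end, here is a minimal self-contained sketch. It rebuilds the example frame and uses the same explode step, but replaces the merge with a set lookup on (ID, Sorted) keys. That swap is made only for this illustration, and the function name add_flag and its threshold parameter are assumptions, not part of the answer above.

```python
import pandas as pd


def add_flag(df, threshold=0.15):
    """Return a copy of df with a 'flag' column ('Yes'/'No'). Hypothetical helper."""
    # One score per row; rows whose list is empty become NaN.
    scores = df["scores"].explode()
    # One pair per row, indexed by ID; same order and length as scores.
    pairs = df.set_index("ID")["pairwise"].explode()
    # NaN compares as False, so rows that had empty lists are never kept.
    keep = pd.to_numeric(scores, errors="coerce").gt(threshold).to_numpy()
    # Split each kept tuple into its two positions.
    hits = pairs[keep].explode()
    # A set counts a position once even if it occurs in several kept pairs.
    keys = set(zip(hits.index, hits.astype(int)))
    out = df.copy()
    out["flag"] = [
        "Yes" if key in keys else "No" for key in zip(out["ID"], out["Sorted"])
    ]
    return out


df = pd.DataFrame({
    "ID": ["A", "A", "B", "B", "B"],
    "Reviews": ["This is great", "works well", "can this be changed",
                "how to perform that", "summarize it"],
    "Sorted": [0, 1, 0, 1, 2],
    "pairwise": [[(0, 1)], [], [(0, 1), (0, 2)], [(1, 2)], []],
    "scores": [[0.26386763883335373], [],
               [0.1179287227608669, 0.36815020951152794],
               [0.03299057711398918], []],
})

result = add_flag(df)
print(result["flag"].tolist())  # ['Yes', 'Yes', 'Yes', 'No', 'Yes']
```

The sketch assumes that each row's pairwise and scores lists have the same length, as they do in the example data.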


I have updated my answer, please check the update.
This edit makes more sense to me. Thank you very much. The only issue I noticed is that if the scores have ...
@gamyanaidu Redone; just change s>0.15 to s ...