Python DataFrame: How to compare the values in two columns of a row with the values in the same columns of the following row?


Suppose I have a DataFrame like this:

Fruit  Color  Weight
apple   red    50
apple   red    75
apple  green   45
orange orange  80
orange orange  90
orange  red    90
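I want to add a column of True or False values that says whether the Fruit and Color in row x are equal to the Fruit and Color in row x+1, like this: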

Fruit  Color  Weight Validity
apple   red    50      True
apple   red    75      False
apple  green   45      False
orange orange  80      True
orange orange  90      False
orange  red    90      False
I tried the following, but I guess there is a mistake somewhere, because I get the wrong result:

g['Validity'] = (g[['Fruit', 'Color']] == g[['Fruit', 'Color']].shift()).any(axis=1) 


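You have the right idea with the shift comparison, but you need to shift backwards, with shift(-1), so that the current row is compared with the next one. Finally, use an all condition instead of any to require that every column matches within a row: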

df['Validity'] = df[['Fruit', 'Color']].eq(df[['Fruit', 'Color']].shift(-1)).all(axis=1)

df
    Fruit   Color  Weight  Validity
0   apple     red      50      True
1   apple     red      75     False
2   apple   green      45     False
3  orange  orange      80      True
4  orange  orange      90     False
5  orange     red      90     False
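
For reference, here is a self-contained sketch that rebuilds the example frame from the question and contrasts the original attempt with the corrected expression. The names cols, attempt and fixed are only illustrative and do not come from the answer.

```python
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame({
    'Fruit': ['apple', 'apple', 'apple', 'orange', 'orange', 'orange'],
    'Color': ['red', 'red', 'green', 'orange', 'orange', 'red'],
    'Weight': [50, 75, 45, 80, 90, 90],
})

cols = df[['Fruit', 'Color']]

# Original attempt: shift() moves rows down, so each row is compared with the
# previous row, and any() is True as soon as a single column matches.
attempt = (cols == cols.shift()).any(axis=1)

# Corrected: shift(-1) moves rows up, so each row is compared with the next
# row, and all() requires both columns to match.
fixed = cols.eq(cols.shift(-1)).all(axis=1)

print(attempt.tolist())  # [False, True, True, False, True, True]
print(fixed.tolist())    # [True, False, False, True, False, False]
```

The last row has no following row, so shift(-1) fills it with NaN and the comparison comes out False there.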
Another option:

import numpy as np

# Join Fruit and Color into one string per row, then compare with the next row
subset_df = df[['Fruit','Color']].apply(''.join, axis=1)
df['Validity'] = np.where(subset_df == subset_df.shift(-1), True,False)
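
One caveat with joining the columns into a single string: without a separator, different pairs of values can produce the same key. The sketch below uses made-up values (not the fruit data) to show the collision, and assumes a separator character, here '|', that never occurs in the data.

```python
import pandas as pd

# Hypothetical data where plain concatenation collides: 'ab'+'c' == 'a'+'bc'.
df = pd.DataFrame({'Fruit': ['ab', 'a'], 'Color': ['c', 'bc']})

plain = df[['Fruit', 'Color']].apply(''.join, axis=1)   # 'abc', 'abc'
keyed = df[['Fruit', 'Color']].apply('|'.join, axis=1)  # 'ab|c', 'a|bc'

collides = (plain == plain.shift(-1)).tolist()
separated = (keyed == keyed.shift(-1)).tolist()

print(collides)   # [True, False]  (false positive in the first row)
print(separated)  # [False, False]
```

For the fruit data in the question the two variants give the same result; the separator only matters when values can run into each other.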

Similar to the other answers:

df['Validity']=(df[['Fruit', 'Color']]==pd.concat([df['Fruit'].shift(-1), df['Color'].shift(-1)], axis=1)).all(axis=1)

>>> print(df)
    Fruit   Color  Weight  Validity
0   apple     red      50      True
1   apple     red      75     False
2   apple   green      45     False
3  orange  orange      80      True
4  orange  orange      90     False
5  orange     red      90     False
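
The same check can also be spelled out column by column, which some readers may find easier to follow. This is only a sketch of an equivalent formulation; the intermediate names same_fruit and same_color are hypothetical.

```python
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame({
    'Fruit': ['apple', 'apple', 'apple', 'orange', 'orange', 'orange'],
    'Color': ['red', 'red', 'green', 'orange', 'orange', 'red'],
    'Weight': [50, 75, 45, 80, 90, 90],
})

# Compare each column with its own next-row values, then combine with &.
same_fruit = df['Fruit'] == df['Fruit'].shift(-1)
same_color = df['Color'] == df['Color'].shift(-1)
df['Validity'] = same_fruit & same_color

print(df['Validity'].tolist())  # [True, False, False, True, False, False]
```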