Python 2.7: replace values based on the previous row


I'm new to pandas and hope you can help me solve this problem. I have the following DataFrame:

import pandas as pd

df = pd.DataFrame({'A': ["me","you","you","me","me","me","me"],
                   'B': ["Y","X","X","X","X","X","Z"],
                   'C': ["1","2","3","4","5","6","7"]})
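I need to transform it based on the row values in columns A and B. The logic is: whenever the values in columns A and B are both identical across consecutive rows, the first row of that run should be kept as is, and column B of every following row in the run should be set to "A".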

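For example, the rows with index 1 and 2 have the same values in columns A and B, so the value in column B of row 2 should be replaced with "A". This is my expected output: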

df2 = pd.DataFrame({'A': ["me","you","you","me","me","me","me"],
                    'B': ["Y","X","A","X","A","A","Z"],
                    'C': ["1","2","3","4","5","6","7"]})

You can first add columns A and B together (for strings, this concatenates them):

a = df.A + df.B
Then compare the result with its shifted version:

print (a != a.shift())
0     True
1     True
2    False
3     True
4    False
5    False
6     True
dtype: bool
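
One caveat worth keeping in mind: concatenating strings can make different (A, B) pairs look identical, for instance "ab" + "c" and "a" + "bc". The sketch below uses a small made-up frame (`tricky` is hypothetical, not from the question) to show the collision, and compares the two columns separately as one possible way around it.

```python
import pandas as pd

# Hypothetical data where concatenation collides: "ab" + "c" == "a" + "bc".
tricky = pd.DataFrame({'A': ["ab", "a"], 'B': ["c", "bc"]})

# Approach from the answer: concatenate, then compare with the shifted series.
concat = tricky.A + tricky.B
changed_concat = concat != concat.shift()

# Alternative: compare each column with its own shifted version.
# A row starts a new run if at least one column differs from the row above.
keys = tricky[['A', 'B']]
changed_cols = (keys != keys.shift()).any(axis=1)

print(changed_concat.tolist())  # [True, False] -> second row wrongly looks like a repeat
print(changed_cols.tolist())    # [True, True]  -> second row correctly starts a new run
```

For the data in the question both approaches give the same mask, so this only matters when the column values can overlap in this way.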
Create a unique group ID for each run with cumsum:

print ((a != a.shift()).cumsum())
0    1
1    2
2    2
3    3
4    3
5    3
6    4
dtype: int32
Get a boolean mask of the rows whose group ID is duplicated:

print ((a != a.shift()).cumsum().duplicated())
0    False
1    False
2     True
3    False
4     True
5     True
6    False
dtype: bool
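
As a side note, if the group IDs are not needed for anything else, the same mask can probably be obtained more directly: a row is a repeat exactly when it equals the row above. A minimal sketch, assuming the same `a` as above (the names `mask_cumsum` and `mask_direct` are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': ["me","you","you","me","me","me","me"],
                   'B': ["Y","X","X","X","X","X","Z"]})
a = df.A + df.B

# Mask built via group IDs, as in the steps above.
mask_cumsum = (a != a.shift()).cumsum().duplicated()

# Direct version: True where a row equals its predecessor.
# The first row is compared with NaN, which is never equal, so it stays False.
mask_direct = a == a.shift()

print(mask_direct.tolist())
# [False, False, True, False, True, True, False]
print(mask_cumsum.tolist() == mask_direct.tolist())  # True
```

The cumsum step remains useful when the runs themselves are needed, for example to group by them.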
Finally, set column B to A in the rows where the mask is True:

df.loc[(a != a.shift()).cumsum().duplicated(), 'B'] = 'A'
print (df)
     A  B  C
0   me  Y  1
1  you  X  2
2  you  A  3
3   me  X  4
4   me  A  5
5   me  A  6
6   me  Z  7


Thank you very much, that's great.
An alternative that uses mask instead of loc:

df.B = df.B.mask((a != a.shift()).cumsum().duplicated(), 'A')
print (df)
     A  B  C
0   me  Y  1
1  you  X  2
2  you  A  3
3   me  X  4
4   me  A  5
5   me  A  6
6   me  Z  7

print (df2.equals(df))
True
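
For reuse, the steps could be wrapped in a small function. This is only a sketch: `mark_consecutive_repeats` and its parameter names are hypothetical, and it compares the key columns directly rather than concatenating them. It returns a copy and leaves the input frame unchanged.

```python
import pandas as pd

def mark_consecutive_repeats(frame, keys, target, marker):
    """Hypothetical helper: return a copy of `frame` in which `target` is set
    to `marker` on every row whose `keys` values all equal the previous row's."""
    k = frame[keys]
    # True only where every key column matches the row directly above.
    # Missing values never compare equal, so such rows are never marked.
    repeated = (k == k.shift()).all(axis=1)
    out = frame.copy()
    out.loc[repeated, target] = marker
    return out

df = pd.DataFrame({'A': ["me","you","you","me","me","me","me"],
                   'B': ["Y","X","X","X","X","X","Z"],
                   'C': ["1","2","3","4","5","6","7"]})

result = mark_consecutive_repeats(df, ['A', 'B'], 'B', 'A')
print(result.B.tolist())  # ['Y', 'X', 'A', 'X', 'A', 'A', 'Z']
```

The mask is computed from the original values before any replacement, so writing the marker into one of the key columns does not affect which rows are detected.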