Python: How to delete or fill with 0 the cell values that are equal between rows in a df?

python pandas

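I have two DataFrames that I compare, and I save the result to an xlsx file (JSON will be needed later). After the comparison, both rows of each changed record are still present, and it is hard to see what actually changed.

How can I use pandas to delete, or fill with 0, the values that are equal in both rows? The real df has 77 columns.

With this code I concatenate the two DataFrames and drop the duplicate rows: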

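import pandas as pd

# Load the two exports that should be compared
df4 = pd.read_excel("output 24.07.2020.xlsx", sheet_name="sheet1")
df5 = pd.read_excel("output 25.07.2020.xlsx", sheet_name="sheet1")

# Stack both frames and drop every row that has an exact twin,
# so only rows that differ between the two files remain
df_diff = pd.concat([df4, df5], keys=["s1", "s2"]).drop_duplicates(keep=False)
df_diff.sort_values("tnom", inplace=True)

# sheet_name is passed by keyword; newer pandas versions deprecate the positional form
df_diff.to_excel("different2.xlsx", sheet_name="sheet1", index=True)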
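
To see what the concat and drop_duplicates(keep=False) step leaves behind, here is a minimal sketch on two small in-memory frames. The frames `old` and `new` and their values are made up for illustration and stand in for the two Excel sheets; only the `tnom` column name is taken from the question.

```python
import pandas as pd

# Hypothetical stand-ins for df4 and df5 (the two Excel sheets in the question)
old = pd.DataFrame({"tnom": [1, 2, 3], "balance": [333, 444, 555]})
new = pd.DataFrame({"tnom": [1, 2, 3], "balance": [222, 444, 555]})

# keys= adds an outer index level that records which frame each row came from
both = pd.concat([old, new], keys=["s1", "s2"])

# keep=False drops every row that has an exact twin, so only the rows
# that differ between the two frames survive
df_diff = both.drop_duplicates(keep=False).sort_values("tnom")
print(df_diff)
```

Only the two versions of the `tnom == 1` record remain, one labelled s1 and one labelled s2, which is the "both rows are still present" situation described above.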

Thank you.

Example of the df after concatenating df4 and df5:

df = pd.DataFrame({
    'ID':['01','01','33','33','44','44'],
    'user': ['Bob', 'Bob', 'Jane', 'Jane', 'Alice', 'Anna'],
    'income': [40000, 40000, 80000, 80000, 77777, 77777],
    'balance':[333, 222, 444, 444, 444, 444],
    'plus':[123,123,321,311,200,200],
    'minus':[15,15,61,61,77,77]})

>>> df
   ID   user  income  balance  plus  minus
0  01    Bob   40000      333   123     15
1  01    Bob   40000      222   123     15
2  33   Jane   80000      444   321     61
3  33   Jane   80000      444   311     61
4  44  Alice   77777      444   200     77
5  44   Anna   77777      444   200     77

Desired df after filling the equal cells with 0:

df = pd.DataFrame({
    'ID':['01','01','33','33','44','44'],
    'user': ['0', '0', '0', '0', 'Alice', 'Anna'],
    'income': [0, 0, 0, 0, 0, 0],
    'balance':[333, 222, 0, 0, 0, 0],
    'plus':[0,0,321,311,0,0],
    'minus':[0,0,0,0,0,0]})

>>> df
    ID   user  income  balance  plus  minus
 0  01      0       0      333     0      0
 1  01      0       0      222     0      0
 2  33      0       0        0   321      0
 3  33      0       0        0   311      0
 4  44  Alice       0        0     0      0
 5  44   Anna       0        0     0      0

Use:

df = df.set_index('ID')
mask = (
    df.groupby(level=0).transform('count').gt(1) &  
    df.groupby(level=0).transform('nunique').eq(1)
)
df = df.where(~mask, 0).reset_index()

Details:

Use DataFrame.groupby on level=0 and transform with count and nunique, then combine the two results with & to create a boolean mask:

print(mask)
     user  income  balance   plus  minus
ID                                      
01   True    True    False   True   True
01   True    True    False   True   True
33   True    True     True  False   True
33   True    True     True  False   True
44  False    True     True   True   True
44  False    True     True   True   True
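
The two halves of the mask can also be inspected separately. The following is a minimal sketch on a reduced version of the sample data; the names `has_pair` and `is_constant` are introduced here for illustration and are not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["01", "01", "44", "44"],
    "user": ["Bob", "Bob", "Alice", "Anna"],
    "balance": [333, 222, 444, 444],
}).set_index("ID")

grouped = df.groupby(level=0)

# True where the ID group has more than one (non-null) value in that column
has_pair = grouped.transform("count").gt(1)

# True where all values of that column are identical within the ID group
is_constant = grouped.transform("nunique").eq(1)

# A cell is masked only when both conditions hold
mask = has_pair & is_constant
print(mask)
```

For ID 01 the user column is masked and balance is not; for ID 44 it is the other way round.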

Use DataFrame.where to replace the values in the DataFrame with 0 based on this mask:

print(df)

   ID   user  income  balance  plus  minus
0  01      0       0      333     0      0
1  01      0       0      222     0      0
2  33      0       0        0   321      0
3  33      0       0        0   311      0
4  44  Alice       0        0     0      0
5  44   Anna       0        0     0      0
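
The question also asks about deleting the equal values rather than filling them with 0. One possible variant, sketched here as an assumption and not as part of the answer above, is DataFrame.mask, which turns the masked cells into NaN. Those cells then appear empty in Excel and as null in JSON. The data is a reduced version of the sample, and the names `blanked` and `as_json` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["01", "01", "33", "33"],
    "user": ["Bob", "Bob", "Jane", "Jane"],
    "balance": [333, 222, 444, 444],
    "plus": [123, 123, 321, 311],
}).set_index("ID")

grouped = df.groupby(level=0)
mask = grouped.transform("count").gt(1) & grouped.transform("nunique").eq(1)

# DataFrame.mask is the inverse of where: cells where the mask is True become NaN
blanked = df.mask(mask).reset_index()
print(blanked)

# NaN cells are serialised as null when exporting to JSON
as_json = blanked.to_json(orient="records")
print(as_json)
```

Note that integer columns containing NaN are upcast to float, so 333 is shown as 333.0 in the result.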

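Shubham Sharma, thank you very much! One question: if the first df has 3000 rows and the second df has 3005 rows, those 5 new rows will be in my new concatenated df. After using your excellent solution I will get 0, because pandas cannot create the boolean mask. How can I avoid that and still see these 5 rows in the new df? Thank you very much. Let us continue this discussion. For instance, in this example: df = pd.DataFrame({'ID': ['01', '01', '33', '44', '777'], 'user': ['Bob', 'Bob', 'Jane', 'Jane', 'Alice', 'Anna', 'NewOPne'], 'income': [40000, 40000, 80000, 77777, 777, 865], 'balance': [3332224444444, 444, 855], 'plus': [12312332131120070], 'minus': [15, 15, 61, 61, 77, 77, 99]})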
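
With the mask as written in the answer, an ID that occurs only once fails the count > 1 test, so its row should pass through unchanged. Below is a minimal sketch to check this. The helper `zero_equal_cells` is a hypothetical wrapper around the answer's steps, and the sample values are made up for illustration, since the example in the comment is not fully recoverable.

```python
import pandas as pd

def zero_equal_cells(frame, key="ID"):
    """Hypothetical helper: zero out cells that are identical within each key group."""
    indexed = frame.set_index(key)
    grouped = indexed.groupby(level=0)
    mask = (
        grouped.transform("count").gt(1)       # the key occurs more than once
        & grouped.transform("nunique").eq(1)   # and the column does not vary
    )
    return indexed.where(~mask, 0).reset_index()

# Illustrative data: ID 01 appears in both files, ID 777 only in the newer one
df = pd.DataFrame({
    "ID": ["01", "01", "777"],
    "user": ["Bob", "Bob", "NewOne"],
    "income": [40000, 40000, 865],
    "balance": [333, 222, 855],
})

result = zero_equal_cells(df)
print(result)
```

In this sketch the row for ID 777 keeps all of its original values, while the paired rows for ID 01 keep only the differing balance column.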