Python: How do I get cell locations from a pandas diff?

python excel python-3.x pandas openpyxl

I am comparing a student workbook against a master workbook:

df1 = pd.read_excel(mxln)  # Loads master xlsx for comparison
df2 = pd.read_excel(sfcn)  # Loads student xlsx for comparison
difference = df2[df2 != df1]  # Scans for differences

Wherever there is a difference, I want to store those cell locations in a list. They need to be in 'A1' format (not something like [1, 1]) so that I can pass them through this:

redFill = PatternFill(start_color='FFEE1111', end_color='FFEE1111', fill_type='solid')
lsws['A1'].fill = redFill
lsfh.save(sfcn)

I have looked at solutions like this, but I could not get them to work or did not understand them. For example, the following does not work:

def highlight_cells():
    df1 = pd.read_excel(mxln)  # Loads master xlsx for comparison
    df2 = pd.read_excel(sfcn)  # Loads student xlsx for comparison
    difference = df2[df2 != df1]  # Scans for differences
    return ['background-color: yellow']

df2.style.apply(highlight_cells)

To get the cells that differ between two DataFrames as Excel coordinates, you can do the following:

Code:

import numpy as np

def diff_cell_indices(dataframe1, dataframe2):
    from openpyxl.utils import get_column_letter as column_letter

    # Offsets from 0-based frame positions to 1-based sheet positions.
    # Assumes the sheet's index column(s) and header row(s) were read
    # as the frame's index and column labels (e.g. index_col=0).
    x_ofs = dataframe1.index.nlevels + 1    # columns taken by the index
    y_ofs = dataframe1.columns.nlevels + 1  # rows taken by the header
    # np.where yields (row, column) positions of every unequal cell.
    return [column_letter(x + x_ofs) + str(y + y_ofs) for
            y, x in zip(*np.where(dataframe1 != dataframe2))]

Test code:

import pandas as pd

df1 = pd.read_excel('test.xlsx')
print(df1)

df2 = df1.copy()  # work on a copy so df1 stays unchanged
df2.C['R2'] = 1   # force a difference: column C, row R2 of the test data
print(df2)

print(diff_cell_indices(df1, df2))
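
One thing to watch for with this kind of comparison: in pandas, NaN != NaN evaluates to True, so cells that are blank in both sheets get reported as differences. The following is a minimal sketch of a variant that skips such cells, not the code from the answer. The function name diff_cell_coords and the parameters header_rows and index_cols are hypothetical; they state how many sheet rows and columns are taken up by the header and by index columns, and the defaults assume pd.read_excel was called without index_col.

```python
import numpy as np
import pandas as pd
from openpyxl.utils import get_column_letter


def diff_cell_coords(master, student, header_rows=1, index_cols=0):
    """Return Excel 'A1'-style coordinates of cells that differ.

    header_rows and index_cols are hypothetical parameters: the number of
    sheet rows / columns occupied by the header and by index columns.
    """
    # A cell counts as changed when the values differ and the two cells
    # are not both empty (NaN != NaN is True, so blanks are excluded).
    changed = (master != student) & ~(master.isna() & student.isna())
    rows, cols = np.where(changed.to_numpy())
    return [
        f"{get_column_letter(int(c) + 1 + index_cols)}{int(r) + 1 + header_rows}"
        for r, c in zip(rows, cols)
    ]


# Small in-memory frames standing in for the two workbooks (assumption).
master = pd.DataFrame({"A": [1, 2, 3], "B": [4.0, np.nan, 6.0]})
student = master.copy()
student.loc[2, "A"] = 30  # force one real difference

print(diff_cell_coords(master, student))  # ['A4']
```

Each returned string can then be used the way the question intends, for example lsws[coord].fill = redFill inside a loop over the list.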

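@查理·云雀, thanks for your edit. For reference, I copied and pasted the code. I don't understand what "df2.C['R2'] = 1" is doing. Also, when I run the function, I get a huge list of seemingly random cell coordinates instead of the ones that differ.

That line just forces the two frames to differ so that the test code has a difference to show. It literally sets column C, row R2 to 1. That is pandas. I would have to see your actual data to help with why it is not showing the correct differences. I suggest trying my test first and building from there.

Should it be df2 = df1.copy()? If I use just df.copy(), it says the name df is not defined. If I change it to df1.copy(), it says the DataFrame object has no attribute 'C'.

C is the column name in the test data, just as R2 is the row name. If you want to use pandas effectively, you will probably need to study it a bit.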
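
To make the exchange above concrete, here is a small sketch using a hypothetical in-memory stand-in for test.xlsx (the real test file is not shown, so the column names A to C and row labels R1 to R3 are assumptions). It shows a label-based assignment with the same intent as df2.C['R2'] = 1, and why a frame without a column named C raises the "no attribute 'C'" error.

```python
import pandas as pd

# Hypothetical stand-in for test.xlsx: columns A..C, rows labelled R1..R3.
df1 = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]},
    index=["R1", "R2", "R3"],
)

df2 = df1.copy()
# Same intent as df2.C['R2'] = 1, written as one label-based assignment.
df2.loc["R2", "C"] = 1

# Exactly one cell now differs between the two frames.
changed = df1 != df2
print(changed.loc["R2", "C"])

# Attribute access such as frame.C only works when a column named C exists.
no_c = df1.drop(columns="C")
try:
    no_c.C
except AttributeError as err:
    print("AttributeError:", err)
```

If your own workbook has different column names or was read without row labels, the forced difference has to target a column and row that actually exist in your frame.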