Python: How do I compare two DataFrames row by row?


I have two DataFrames of shape 152431 x 15, and I want the difference between the two frames.



If your DataFrames are stored in two files, I would read every line of each file and build a list of the differences:

# Placeholders: replace with the paths to the two files.
old_file_path = 'INSERT_FILE_PATH_OF_FILE_A'
new_file_path = 'INSERT_FILE_PATH_OF_FILE_B'

with open(old_file_path, 'r', encoding='utf-8') as old, open(new_file_path, 'r', encoding='utf-8') as new:
    old_lines = set(old.readlines())  # a set makes each membership test O(1)
    new_lines = new.readlines()

# Collect every line of the new file that does not appear in the old file.
total_of_changes = [line for line in new_lines if line not in old_lines]

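Is the difference based on all 4 columns? Is this what you want: pd.concat([df1, df2]).drop_duplicates(keep=False)?

Can you provide the expected output for the DataFrames?

No, please don't do this! Especially when using pandas, there are better options than reading and comparing each file line by line. With 152k rows this is definitely inefficient, as well as unpythonic and clumsy.

Fair enough, a more Pythonic approach would help me too. Do you have a specific function in mind? :)

Yes, Chris A posted a good solution in his comment:
pd.concat([df1, df2]).drop_duplicates(keep=False)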
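
To make the one-liner from the comments concrete, here is a minimal sketch on toy data. The frames df1 and df2 and their columns id and value are made up for illustration; they only stand in for the real 152431 x 15 frames. The second half uses merge with indicator=True, which is not from the thread but a common alternative when you also need to know which frame each differing row came from.

```python
import pandas as pd

# Hypothetical toy frames standing in for the two large DataFrames.
df1 = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value": ["b", "x", "d"]})

# Symmetric difference: rows that occur in only one of the two frames.
# Caveat: rows duplicated inside a single frame are dropped as well.
diff = pd.concat([df1, df2]).drop_duplicates(keep=False)

# Alternative (an assumption, not from the thread): outer merge on all
# shared columns; the _merge column records where each row came from.
merged = df1.merge(df2, how="outer", indicator=True)
only_in_df1 = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
only_in_df2 = merged[merged["_merge"] == "right_only"].drop(columns="_merge")

print(diff)
print(only_in_df1)
print(only_in_df2)
```

Note that the file-based loop above only reports lines that are new in the second file, while the concat approach returns rows unique to either frame. The merge variant lets you pick one direction explicitly.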