Python pandas: DataFrame difference function


I am trying to solve the following problem with pandas:

DataFrame 1:

Apple  Banana Orange
Orange Banana Apple
Kiwi   Lime   Apple
Banana Apple  Orange
DataFrame 2:

Orange Banana Apple
Apple  Banana Orange
Apple  Orange Apple
Kiwi   Apple  Apple
Operation:

DataFrame 1 - DataFrame 2
Output:

Kiwi   Lime   Apple
Banana Apple  Orange
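Essentially, I am working with categorical variables spread across several columns, and I want to find the rows that are in DataFrame 1 but not in DataFrame 2. I also want to preserve the row order shown in the output, i.e. the result should not be:

Banana Apple  Orange
Kiwi   Lime   Apple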
Consider using a merge, then dropping from the original DataFrame every row that the merge matched.

#!/usr/bin/python
import pandas as pd

df1 = pd.DataFrame({'Categ1':['Apple', 'Orange', 'Kiwi', 'Banana'],
                    'Categ2':['Banana', 'Banana', 'Lime', 'Apple'],
                    'Categ3':['Orange', 'Apple', 'Apple', 'Orange']})

df2 = pd.DataFrame({'Categ1':['Orange', 'Apple', 'Apple', 'Kiwi'],
                    'Categ2':['Banana', 'Banana', 'Orange', 'Apple'],
                    'Categ3':['Apple', 'Orange', 'Apple', 'Apple']})

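# MERGE BOTH DATA FRAMES (INNER JOIN KEEPS ONLY ROWS FOUND IN BOTH)
keys = ['Categ1', 'Categ2', 'Categ3']
merged = pd.merge(df1, df2, on=keys)

# DROP FROM ORIGINAL DF1 ANY ROWS IN MERGED; MERGED HAS A FRESH 0..N-1 INDEX,
# SO MATCH ON ROW VALUES RATHER THAN ON INDEX LABELS
in_merged = pd.MultiIndex.from_frame(df1[keys]).isin(pd.MultiIndex.from_frame(merged[keys]))
df1 = df1[~in_merged]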
DataFrame output:

ORIGINAL DF1
   Categ1  Categ2  Categ3
0   Apple  Banana  Orange
1  Orange  Banana   Apple
2    Kiwi    Lime   Apple
3  Banana   Apple  Orange

MERGED DF
   Categ1  Categ2  Categ3
0   Apple  Banana  Orange
1  Orange  Banana   Apple

FINAL DF1
   Categ1 Categ2  Categ3
2    Kiwi   Lime   Apple
3  Banana  Apple  Orange
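
Note that an inner merge returns a fresh 0..n-1 index, so its index labels only line up with df1 by coincidence when the matching rows happen to come first. As an alternative, here is a minimal sketch of the same "rows in df1 but not in df2" idea using a left merge with `indicator=True`. The helper name `rows_not_in` is hypothetical (not part of pandas), and the sketch assumes both frames share the same column names.

```python
import pandas as pd

df1 = pd.DataFrame({'Categ1': ['Apple', 'Orange', 'Kiwi', 'Banana'],
                    'Categ2': ['Banana', 'Banana', 'Lime', 'Apple'],
                    'Categ3': ['Orange', 'Apple', 'Apple', 'Orange']})

df2 = pd.DataFrame({'Categ1': ['Orange', 'Apple', 'Apple', 'Kiwi'],
                    'Categ2': ['Banana', 'Banana', 'Orange', 'Apple'],
                    'Categ3': ['Apple', 'Orange', 'Apple', 'Apple']})


def rows_not_in(left, right):
    """Hypothetical helper: rows of `left` that have no identical row in `right`.

    Keeps the original order and index labels of `left`.
    """
    cols = list(left.columns)
    # De-duplicate `right` so every row of `left` appears exactly once in the result.
    flagged = left.merge(right[cols].drop_duplicates(),
                         how='left', on=cols, indicator=True)
    # A left merge preserves the order of the left rows, so a positional mask is safe.
    keep = (flagged['_merge'] == 'left_only').to_numpy()
    return left[keep]


result = rows_not_in(df1, df2)
print(result)
```

Because the mask is applied to the original df1, the surviving rows keep their index labels (2 and 3 here) and their original order, which is what the question asks for.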