Python pandas: DataFrame difference function
I am trying to solve the following problem with pandas:
DataFrame 1:
Apple Banana Orange
Orange Banana Apple
Kiwi Lime Apple
Banana Apple Orange
DataFrame 2:
Orange Banana Apple
Apple Banana Orange
Apple Orange Apple
Kiwi Apple Apple
Operation:
DataFrame 1 - DataFrame 2
Output:
Kiwi Lime Apple
Banana Apple Orange
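Essentially, I am dealing with categorical variables spread across multiple columns, and I want to find the rows that are in DataFrame 1 but not in DataFrame 2. I also want to preserve the row order, as shown in the output, i.e. the result should not be: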
Banana Apple Orange
Kiwi Lime Apple
Consider using a merge, then dropping from df1 every row that shows up in the merged result:
#!/usr/bin/python
import pandas as pd

df1 = pd.DataFrame({'Categ1': ['Apple', 'Orange', 'Kiwi', 'Banana'],
                    'Categ2': ['Banana', 'Banana', 'Lime', 'Apple'],
                    'Categ3': ['Orange', 'Apple', 'Apple', 'Orange']})
df2 = pd.DataFrame({'Categ1': ['Orange', 'Apple', 'Apple', 'Kiwi'],
                    'Categ2': ['Banana', 'Banana', 'Orange', 'Apple'],
                    'Categ3': ['Apple', 'Orange', 'Apple', 'Apple']})
cols = ['Categ1', 'Categ2', 'Categ3']

# MERGE BOTH DATA FRAMES (inner join keeps only rows present in both)
merged = pd.merge(df1, df2, on=cols)

# DROP FROM ORIGINAL DF1 ANY ITEMS IN MERGED
# merged gets a fresh 0..n-1 index, so match on row values, not index labels
in_merged = pd.MultiIndex.from_frame(df1[cols]).isin(pd.MultiIndex.from_frame(merged[cols]))
df1 = df1[~in_merged]
DataFrame output:
ORIGINAL DF1
Categ1 Categ2 Categ3
0 Apple Banana Orange
1 Orange Banana Apple
2 Kiwi Lime Apple
3 Banana Apple Orange
MERGED DF
Categ1 Categ2 Categ3
0 Apple Banana Orange
1 Orange Banana Apple
FINAL DF1
Categ1 Categ2 Categ3
2 Kiwi Lime Apple
3 Banana Apple Orange
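As an alternative, the same "in df1 but not in df2" result can be sketched as a left merge with indicator=True, which labels every row of df1 as either 'both' or 'left_only'. This is only a minimal sketch: the helper name rows_only_in_left is hypothetical, and it assumes both frames share the same column names and that a row counts as a match only when every column is equal.

```python
import pandas as pd


def rows_only_in_left(left, right):
    """Return the rows of `left` with no exact match in `right`, in original order.

    Hypothetical helper for illustration; assumes both frames share column names.
    """
    cols = list(left.columns)
    # Deduplicate `right` so the left join cannot multiply rows of `left`.
    marked = left.merge(right[cols].drop_duplicates(), on=cols,
                        how='left', indicator=True)
    # A left join keeps the left row order, so the mask lines up by position.
    keep = (marked['_merge'] == 'left_only').to_numpy()
    return left[keep]


df1 = pd.DataFrame({'Categ1': ['Apple', 'Orange', 'Kiwi', 'Banana'],
                    'Categ2': ['Banana', 'Banana', 'Lime', 'Apple'],
                    'Categ3': ['Orange', 'Apple', 'Apple', 'Orange']})
df2 = pd.DataFrame({'Categ1': ['Orange', 'Apple', 'Apple', 'Kiwi'],
                    'Categ2': ['Banana', 'Banana', 'Orange', 'Apple'],
                    'Categ3': ['Apple', 'Orange', 'Apple', 'Apple']})

result = rows_only_in_left(df1, df2)
print(result)
```

Because the boolean mask is applied to df1 itself, the surviving rows keep both their original order and their original index labels (2 and 3 for the data above).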