Python pandas: merging two DataFrames with different dates and columns


I need to merge two DataFrames as shown below. I tried inner and left joins, but got duplicated values. My keys are date and categories.

df1
    date         categories      cost   clicks  impression  conversion
    02-11-20    categories 5    153999   12         80        2
    03-11-20    categories 1    9366463  31        135        4
    03-11-20    categories 2    2738528  21        167        2
    03-11-20    Others          4177461  19         94        1
    03-11-20    categories 3    1747084   4         21        2
    04-11-20    categories 4    5812003  35        220        1
    04-11-20    categories 5    8490241  41        225        2    



df2
  date          categories       sales      deal
  02-11-20      categories 5     117810       1
  04-11-20      categories 4    1487500       3
  04-11-20      categories 6     299999       1
  04-11-20      Others           79106        1



desired output 
  date      categories      cost      clicks    impression  conversion  sales deal
  02-11-20  categories 5    153999      12          80          2      117810   1
  03-11-20  categories 1    9366463     31         135          4       na     na
  03-11-20  categories 2    2738528     21         167          2       na     na
  03-11-20  Others          4177461     19          94          1       na     na
  03-11-20  categories 3    1747084      4          21          2       na     na
  04-11-20  categories 4    5812003     35         220          1     1487500   3
  04-11-20  categories 5    8490241     41         225          2       na     na
  04-11-20  Others            na        na          na          na      79106   1
  04-11-20  categories 6      na        na          na          na     299999   1

Thanks

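You should use an outer join and specify the two columns the merge should be based on. Note that the columns must be passed as a list.

An outer join uses the keys from both frames and inserts NaN wherever a row is missing from one of the two DataFrames.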

new = df1.merge(df2, on=['date','categories'], how='outer')
which prints:

        date    categories       cost  ...  conversion      sales  deal
0 2020-02-11  categories 5   153999.0  ...         2.0   117810.0   1.0
1 2020-03-11  categories 1  9366463.0  ...         4.0        NaN   NaN
2 2020-03-11  categories 2  2738528.0  ...         2.0        NaN   NaN
3 2020-03-11        Others  4177461.0  ...         1.0        NaN   NaN
4 2020-03-11  categories 3  1747084.0  ...         2.0        NaN   NaN
5 2020-04-11  categories 4  5812003.0  ...         1.0  1487500.0   3.0
6 2020-04-11  categories 5  8490241.0  ...         2.0        NaN   NaN
7 2020-04-11  categories 6        NaN  ...         NaN   299999.0   1.0
8 2020-04-11        Others        NaN  ...         NaN    79106.0   1.0
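
For readers who want to reproduce the result above, here is a minimal self-contained sketch. It assumes the frames are built directly from the tables in the question and that the dates are kept as plain strings (the printed output above evidently used parsed dates, so the date column will look different). Row order in the result may also vary between pandas versions.

```python
import pandas as pd

# Data copied from the question; dates are left as strings (an assumption).
df1 = pd.DataFrame({
    "date": ["02-11-20", "03-11-20", "03-11-20", "03-11-20",
             "03-11-20", "04-11-20", "04-11-20"],
    "categories": ["categories 5", "categories 1", "categories 2", "Others",
                   "categories 3", "categories 4", "categories 5"],
    "cost": [153999, 9366463, 2738528, 4177461, 1747084, 5812003, 8490241],
    "clicks": [12, 31, 21, 19, 4, 35, 41],
    "impression": [80, 135, 167, 94, 21, 220, 225],
    "conversion": [2, 4, 2, 1, 2, 1, 2],
})
df2 = pd.DataFrame({
    "date": ["02-11-20", "04-11-20", "04-11-20", "04-11-20"],
    "categories": ["categories 5", "categories 4", "categories 6", "Others"],
    "sales": [117810, 1487500, 299999, 79106],
    "deal": [1, 3, 1, 1],
})

# Outer join on both key columns: keeps every (date, categories) pair
# found in either frame and fills the missing side with NaN.
new = df1.merge(df2, on=["date", "categories"], how="outer")
print(new)
```

Because the missing cells become NaN, the numeric columns are upcast to float, which is why the values above print as 153999.0 rather than 153999.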

Use pd.merge, passing the keys, with how="outer".

Here is an example:

import pandas as pd

# left and right are the two DataFrames to merge;
# key1 and key2 are the key columns they share
pd.merge(left, right, on=["key1","key2"], how="outer")
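
Since the question mentions duplicated values, it may help to check the keys before and after merging. The sketch below is only an illustration: the toy frames and the column names x and y are made up, and whether repeated keys were the actual cause in the question is an assumption. It uses two real pd.merge parameters: validate, which raises a MergeError if a key pair is repeated, and indicator, which adds a _merge column showing where each row came from.

```python
import pandas as pd

# Hypothetical toy frames, not the data from the question.
left = pd.DataFrame({
    "key1": ["a", "a", "b"],
    "key2": [1, 2, 1],
    "x": [10, 20, 30],
})
right = pd.DataFrame({
    "key1": ["a", "b", "c"],
    "key2": [1, 1, 1],
    "y": [100, 300, 500],
})

# Count repeated (key1, key2) pairs in each frame; repeated pairs
# are a common source of multiplied rows after a merge.
left_dupes = int(left.duplicated(subset=["key1", "key2"]).sum())
right_dupes = int(right.duplicated(subset=["key1", "key2"]).sum())

# validate="one_to_one" raises pandas.errors.MergeError if either side
# repeats a key pair; indicator=True adds a "_merge" column with the
# values "left_only", "right_only" or "both".
merged = pd.merge(
    left, right,
    on=["key1", "key2"],
    how="outer",
    validate="one_to_one",
    indicator=True,
)
print(merged)
```

If validate raises an error on real data, deduplicating or aggregating the offending frame on the key columns before merging is one possible way forward.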