Python 3.x: Merging DataFrames based on specific conditions

I have the dfs shown below.

df1:

df2:

df3:

Explanation of the data:

ID is the primary key of df1.

ID is the primary key of df2.

df3 does not have any primary key.

From the above, I would like to prepare the dfs below.

1. IDs which are in df1 and df2.

Expected output1:

ID   Job   Salary   
1    A     100
2    B     200
4    C     150
8    B     150

2. IDs which are there in df1 and not in df2.

Expected output2:

ID   Job   Salary
3    B     20
5    A     500
6    A     600
7    A     200

3. IDs which are there in df1 and df3.

Expected output3:

ID   Job   Salary
1    A     100
2    B     200
3    B     20
4    C     150

4. IDs which are there in df1 and not in df3.

Expected output4:

ID   Job   Salary
5    A     500
6    A     600
7    A     200
8    B     150

You can do this using two masks:

    mask1 = df1.ID.isin(df2.ID)
    mask2 = df1.ID.isin(df3.ID)

Your four frames are then:

    df1[mask1]
       ID Job  Salary
    0   1   A     100
    1   2   B     200
    3   4   C     150
    7   8   B     150

    df1[~mask1]
       ID Job  Salary
    2   3   B      20
    4   5   A     500
    5   6   A     600
    6   7   A     200

    df1[mask2]
       ID Job  Salary
    0   1   A     100
    1   2   B     200
    2   3   B      20
    3   4   C     150

    df1[~mask2]
       ID Job  Salary
    4   5   A     500
    5   6   A     600
    6   7   A     200
    7   8   B     150
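
For readers who want to run the mask approach end to end, here is a minimal sketch. df1 is reconstructed from the expected outputs in the question. The contents of df2 and df3 are not shown in the post, so the ID lists used for them below are assumptions, chosen only to be consistent with those outputs.

```python
import pandas as pd

# df1 as implied by the expected outputs in the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8],
    "Job": ["A", "B", "B", "C", "A", "A", "A", "B"],
    "Salary": [100, 200, 20, 150, 500, 600, 200, 150],
})

# ASSUMED sample data: only the ID columns matter for the masks.
df2 = pd.DataFrame({"ID": [1, 2, 4, 8]})        # ID is unique (primary key)
df3 = pd.DataFrame({"ID": [1, 1, 2, 3, 3, 4]})  # no primary key, IDs may repeat

# Boolean Series aligned with df1: True where the ID also occurs in the other frame.
mask1 = df1["ID"].isin(df2["ID"])
mask2 = df1["ID"].isin(df3["ID"])

in_df2, not_in_df2 = df1[mask1], df1[~mask1]
in_df3, not_in_df3 = df1[mask2], df1[~mask2]

print(in_df2["ID"].tolist())      # [1, 2, 4, 8]
print(not_in_df2["ID"].tolist())  # [3, 5, 6, 7]
print(in_df3["ID"].tolist())      # [1, 2, 3, 4]
print(not_in_df3["ID"].tolist())  # [5, 6, 7, 8]
```

Each mask is computed once and reused for both the matching and the non-matching frame, which is the main convenience of naming the masks.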
    
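Actually, your expected results are not a merge of any kind but a selection, based on whether df1.ID is present in the ID column of the second DataFrame.

To get your expected results, run: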

    result_1 = df1[df1.ID.isin(df2.ID)]
    result_2 = df1[~df1.ID.isin(df2.ID)]
    result_3 = df1[df1.ID.isin(df3.ID)]
    result_4 = df1[~df1.ID.isin(df3.ID)]
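
To illustrate why a selection may fit better than a merge here, the sketch below compares the two on sample data. The question says df3 has no primary key, so its IDs may repeat; the repeated IDs used below are an assumption, since df3's actual rows are not shown in the post. With repeated keys, an inner merge returns one row per matching pair, while `isin` keeps each df1 row at most once.

```python
import pandas as pd

# df1 as implied by the expected outputs in the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8],
    "Job": ["A", "B", "B", "C", "A", "A", "A", "B"],
    "Salary": [100, 200, 20, 150, 500, 600, 200, 150],
})
# ASSUMED sample data: IDs 1 and 3 appear twice because df3 has no primary key.
df3 = pd.DataFrame({"ID": [1, 1, 2, 3, 3, 4]})

# Inner merge: rows of df1 are duplicated for every repeated ID in df3.
merged = df1.merge(df3, on="ID", how="inner")

# Selection: each row of df1 is kept once if its ID occurs anywhere in df3.
selected = df1[df1["ID"].isin(df3["ID"])]

print(len(merged))    # 6
print(len(selected))  # 4

# If a merge is still preferred, de-duplicating the keys first and using
# indicator=True gives the "not in df3" rows as an anti-join.
flagged = df1.merge(df3[["ID"]].drop_duplicates(), on="ID",
                    how="left", indicator=True)
not_in_df3 = flagged.loc[flagged["_merge"] == "left_only", df1.columns]
print(not_in_df3["ID"].tolist())  # [5, 6, 7, 8]
```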
    

Thank you very much. I would like to accept all the answers, but unfortunately there is no such option.
    
    >>> # 1. IDs which are in df1 and df2.
    >>> df1[df1['ID'].isin(df2['ID'])]
       ID Job  Salary
    0   1   A     100
    1   2   B     200
    3   4   C     150
    7   8   B     150
    
    >>> # 2. IDs which are there in df1 and not in df2    
    >>> df1[~df1['ID'].isin(df2['ID'])]
       ID Job  Salary
    2   3   B      20
    4   5   A     500
    5   6   A     600
    6   7   A     200
    
    >>> # 3. IDs which are there in df1 and df3
    >>> df1[df1['ID'].isin(df3['ID'])]
       ID Job  Salary
    0   1   A     100
    1   2   B     200
    2   3   B      20
    3   4   C     150
    
    >>> # 4. IDs which are there in df1 and not in df3.
    >>> df1[~df1['ID'].isin(df3['ID'])]
       ID Job  Salary
    4   5   A     500
    5   6   A     600
    6   7   A     200
    7   8   B     150
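
Note that the results above keep df1's original row labels (0, 1, 3, 7 and so on). If renumbered frames are preferred, one option is to wrap the selection in a small helper, sketched below. `split_by_membership` is a hypothetical name, not a pandas function, and the df2 IDs are assumed sample data consistent with the expected outputs.

```python
import pandas as pd


def split_by_membership(df, other, key="ID"):
    """Split df into (rows whose key occurs in other, rows whose key does not).

    Both parts get a fresh 0..n-1 index.
    """
    mask = df[key].isin(other[key])
    return df[mask].reset_index(drop=True), df[~mask].reset_index(drop=True)


# df1 as implied by the expected outputs in the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8],
    "Job": ["A", "B", "B", "C", "A", "A", "A", "B"],
    "Salary": [100, 200, 20, 150, 500, 600, 200, 150],
})
df2 = pd.DataFrame({"ID": [1, 2, 4, 8]})  # ASSUMED sample data

present, absent = split_by_membership(df1, df2)
print(present["ID"].tolist())   # [1, 2, 4, 8]
print(present.index.tolist())   # [0, 1, 2, 3]
print(absent["ID"].tolist())    # [3, 5, 6, 7]
```

Whether to reset the index is a design choice: keeping the original labels makes it easy to trace rows back to df1, while resetting gives clean positional numbering for output.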