Python 3.x: Merging DataFrames based on specific conditions
I have three DataFrames: df1, df2 and df3. Some notes on the data:
ID is the primary key of df1.
ID is the primary key of df2.
df3 does not have any primary key.
From the above, I would like to prepare the following DataFrames:
1. IDs which are in df1 and df2.
Expected output1:
ID Job Salary
1 A 100
2 B 200
4 C 150
8 B 150
2. IDs which are there in df1 and not in df2.
Expected output2:
ID Job Salary
3 B 20
5 A 500
6 A 600
7 A 200
3. IDs which are there in df1 and df3.
Expected output3:
ID Job Salary
1 A 100
2 B 200
3 B 20
4 C 150
4. IDs which are there in df1 and not in df3.
Expected output4:
ID Job Salary
5 A 500
6 A 600
7 A 200
8 B 150
You can do this with two boolean masks:
mask1 = df1.ID.isin(df2.ID)
mask2 = df1.ID.isin(df3.ID)
Then your four frames will be:
df1[mask1]
   ID Job  Salary
0   1   A     100
1   2   B     200
3   4   C     150
7   8   B     150
df1[~mask1]
   ID Job  Salary
2   3   B      20
4   5   A     500
5   6   A     600
6   7   A     200
df1[mask2]
   ID Job  Salary
0   1   A     100
1   2   B     200
2   3   B      20
3   4   C     150
df1[~mask2]
   ID Job  Salary
4   5   A     500
5   6   A     600
6   7   A     200
7   8   B     150
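For anyone who wants to reproduce this, here is a minimal self-contained sketch of the mask approach. df1 is reconstructed from the expected outputs in the question. The contents of df2 and df3 are not shown above, so the ID values used for them here are assumptions chosen to match those outputs, and the `frames` dictionary is just an illustrative name.

```python
import pandas as pd

# df1 as implied by the expected outputs in the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8],
    "Job": ["A", "B", "B", "C", "A", "A", "A", "B"],
    "Salary": [100, 200, 20, 150, 500, 600, 200, 150],
})

# Assumed stand-ins for df2 and df3: only their ID columns matter here.
df2 = pd.DataFrame({"ID": [1, 2, 4, 8]})        # ID is unique (primary key)
df3 = pd.DataFrame({"ID": [1, 1, 2, 3, 3, 4]})  # no primary key, so IDs may repeat

# True where the df1 ID also occurs in the other frame.
mask1 = df1["ID"].isin(df2["ID"])
mask2 = df1["ID"].isin(df3["ID"])

frames = {
    "in_df2": df1[mask1],
    "not_in_df2": df1[~mask1],
    "in_df3": df1[mask2],
    "not_in_df3": df1[~mask2],
}

for name, frame in frames.items():
    print(name, frame["ID"].tolist())
```

Each mask is computed once and reused for both the matching rows and, with `~`, the non-matching rows.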
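Actually, the result you expect is not a merge at all but a selection:
each row of df1 is kept or dropped depending on whether df1.ID occurs in the ID column
of the second DataFrame.
To get the expected results, run the following: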
result_1 = df1[df1.ID.isin(df2.ID)]
result_2 = df1[~df1.ID.isin(df2.ID)]
result_3 = df1[df1.ID.isin(df3.ID)]
result_4 = df1[~df1.ID.isin(df3.ID)]
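To illustrate why a selection can be the safer choice here, the sketch below compares `isin` with an inner merge when the second frame has repeated IDs, as df3 may since it has no primary key. The df3 values are an assumption (the question does not show them), and the names `selected`, `merged` and `merged_unique` are only for illustration.

```python
import pandas as pd

# df1 as implied by the expected outputs in the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8],
    "Job": ["A", "B", "B", "C", "A", "A", "A", "B"],
    "Salary": [100, 200, 20, 150, 500, 600, 200, 150],
})

# Assumed df3 with repeated IDs (it has no primary key).
df3 = pd.DataFrame({"ID": [1, 1, 2, 3, 3, 4]})

# Selection: each df1 row appears at most once.
selected = df1[df1["ID"].isin(df3["ID"])]

# Inner merge: a df1 row is repeated once per matching row in df3.
merged = df1.merge(df3[["ID"]], on="ID", how="inner")

# Dropping duplicate IDs first makes the merge agree with the selection.
merged_unique = df1.merge(df3[["ID"]].drop_duplicates(), on="ID", how="inner")

print(len(selected), len(merged), len(merged_unique))
```

With this assumed data the selection and the de-duplicated merge both give one row per matching ID, while the plain merge returns extra rows for IDs 1 and 3.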
Thank you very much. I would like to accept all of the answers, but unfortunately there is no such option.
>>> # 1. IDs which are in df1 and df2.
>>> df1[df1['ID'].isin(df2['ID'])]
ID Job Salary
0 1 A 100
1 2 B 200
3 4 C 150
7 8 B 150
>>> # 2. IDs which are there in df1 and not in df2
>>> df1[~df1['ID'].isin(df2['ID'])]
ID Job Salary
2 3 B 20
4 5 A 500
5 6 A 600
6 7 A 200
>>> # 3. IDs which are there in df1 and df3
>>> df1[df1['ID'].isin(df3['ID'])]
ID Job Salary
0 1 A 100
1 2 B 200
2 3 B 20
3 4 C 150
>>> # 4. IDs which are there in df1 and not in df3.
>>> df1[~df1['ID'].isin(df3['ID'])]
ID Job Salary
4 5 A 500
5 6 A 600
6 7 A 200
7 8 B 150