Python: Merging the same columns in DataFrames

python pandas dataframe

I have two DataFrames.

First DataFrame:

    UserMasterId    Status  Count
  0 1296.0               5  5
  1 1316.0               5  9
  2 1325.0               5  14
  3 1332.0               5  5
  4 1337.0               5  44
  5 1342.0               5  2
  6 1344.0               5  18

Second DataFrame:

    UserMasterId    Status  Count
  0 1325.0               0  2
  1 1332.0               0  1
  2 1337.0               0  1
  3 1342.0               0  3
  4 1344.0               0  1

Some of the same IDs appear in both DataFrames.

When I use concat:

result = pd.concat([df1, df2], axis=1, sort=True)
result

the result I get is:

    UserMasterId    Status  Count   UserMasterId    Status  Count
0   1296.0               5      5         1325.0       0.0    2.0
1   1316.0               5      9         1332.0       0.0    1.0
2   1325.0               5     14         1337.0       0.0    1.0
3   1332.0               5      5         1342.0       0.0    3.0
4   1337.0               5     44         1344.0       0.0    1.0
5   1342.0               5      2            NaN       NaN    NaN
6   1344.0               5     18            NaN       NaN    NaN

When I use merge:

result = pd.merge(df1,df2[['UserMasterId', 'Count','Status']],on='UserMasterId')
result.head()

the output is:

    UserMasterId    Status_x    Count_x Count_y Status_y
0         1325.0           5         14      2         0
1         1332.0           5          5      1         0
2         1337.0           5         44      1         0
3         1342.0           5          2      3         0
4         1344.0           5         18      1         0

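This drops the IDs that are not common to both df1 and df2.

I don't want to drop the IDs that appear in only one of the DataFrames. I want output like this: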

UserMasterId    Status_x    Count_x Count_y Status_y
0     1296.0           5          5      NA       NA
1     1316.0           5          9      NA       NA
2     1325.0           5         14      2         0
3     1332.0           5          5      1         0
4     1337.0           5         44      1         0
5     1342.0           5          2      3         0
6     1344.0           5         18      1         0

Can anyone help me?

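You can try the append method (note that DataFrame.append was removed in pandas 2.0, where pd.concat([df1, df2]) does the same job; either way the rows of df2 are stacked below df1 rather than matched by ID):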

df3 = df1.append(df2)

pd.concat is the solution here, but you need to add the ignore_index parameter and not set axis to 1, for example:
>>> df1 = pd.DataFrame({'a': [0, 1], 'b': [2, 3]})
>>> df2 = pd.DataFrame({'b': [4, 5], 'a': [5, 6]})
>>> pd.concat([df1, df2], ignore_index=True)
   a  b
0  0  2
1  1  3
2  5  4
3  6  5
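If side-by-side columns are wanted from concat, one option is to make UserMasterId the index first, because concat with axis=1 aligns rows on the index rather than on any column. The following is only a sketch under that assumption: the data is retyped from the question, and the names left, right and aligned are made up for illustration.

```python
import pandas as pd

# Data retyped from the question.
df1 = pd.DataFrame({
    "UserMasterId": [1296.0, 1316.0, 1325.0, 1332.0, 1337.0, 1342.0, 1344.0],
    "Status": [5, 5, 5, 5, 5, 5, 5],
    "Count": [5, 9, 14, 5, 44, 2, 18],
})
df2 = pd.DataFrame({
    "UserMasterId": [1325.0, 1332.0, 1337.0, 1342.0, 1344.0],
    "Status": [0, 0, 0, 0, 0],
    "Count": [2, 1, 1, 3, 1],
})

# Move the key into the index and rename the value columns so they stay distinct.
left = df1.set_index("UserMasterId").add_suffix("_x")
right = df2.set_index("UserMasterId").add_suffix("_y")

# axis=1 places the frames side by side, matching rows by index label.
aligned = pd.concat([left, right], axis=1)
print(aligned)
```

IDs that exist only in df1 keep their row and get NaN in the _y columns, which is close to the output the question asks for, with UserMasterId as the index instead of a column.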

Using merge with an outer join should work fine, right? I haven't tested it, though.
result = pd.merge(df1,df2[['UserMasterId', 'Count','Status']],on='UserMasterId', how='outer')
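To see why the original merge lost rows, it may help to compare the join types directly. merge uses how='inner' when how is not given, so only IDs present in both frames survive. This is a small sketch with the question's data retyped by hand; row_counts is just an illustrative name.

```python
import pandas as pd

# Data retyped from the question.
df1 = pd.DataFrame({
    "UserMasterId": [1296.0, 1316.0, 1325.0, 1332.0, 1337.0, 1342.0, 1344.0],
    "Status": [5, 5, 5, 5, 5, 5, 5],
    "Count": [5, 9, 14, 5, 44, 2, 18],
})
df2 = pd.DataFrame({
    "UserMasterId": [1325.0, 1332.0, 1337.0, 1342.0, 1344.0],
    "Status": [0, 0, 0, 0, 0],
    "Count": [2, 1, 1, 3, 1],
})

# Count the rows each join type keeps when matching on UserMasterId.
row_counts = {
    how: len(pd.merge(df1, df2, on="UserMasterId", how=how))
    for how in ("inner", "left", "right", "outer")
}
print(row_counts)
```

With this particular data every ID in df2 also appears in df1, so how='left' keeps the same seven rows as how='outer'. The two would differ only if df2 had IDs missing from df1.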

Just use how='outer':

result = pd.merge(df1,df2[['UserMasterId', 'Count','Status']],on='UserMasterId', how='outer')
print(result)

   UserMasterId  Status_x  Count_x  Count_y  Status_y
0        1296.0         5        5      NaN       NaN
1        1316.0         5        9      NaN       NaN
2        1325.0         5       14      2.0       0.0
3        1332.0         5        5      1.0       0.0
4        1337.0         5       44      1.0       0.0
5        1342.0         5        2      3.0       0.0
6        1344.0         5       18      1.0       0.0
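The Count_y and Status_y columns come back as floats because NaN cannot be stored in a plain integer column. As an optional follow-up, and only as a sketch, the example below repeats the outer merge with indicator=True to mark where each row came from, then casts the two columns to pandas' nullable Int64 dtype so the missing values print as <NA>. The data is retyped from the question.

```python
import pandas as pd

# Data retyped from the question.
df1 = pd.DataFrame({
    "UserMasterId": [1296.0, 1316.0, 1325.0, 1332.0, 1337.0, 1342.0, 1344.0],
    "Status": [5, 5, 5, 5, 5, 5, 5],
    "Count": [5, 9, 14, 5, 44, 2, 18],
})
df2 = pd.DataFrame({
    "UserMasterId": [1325.0, 1332.0, 1337.0, 1342.0, 1344.0],
    "Status": [0, 0, 0, 0, 0],
    "Count": [2, 1, 1, 3, 1],
})

# Outer join keeps every ID; indicator=True adds a _merge column
# with the values left_only, right_only or both.
result = pd.merge(
    df1,
    df2[["UserMasterId", "Count", "Status"]],
    on="UserMasterId",
    how="outer",
    indicator=True,
)

# Nullable integers keep whole numbers whole and show gaps as <NA>.
result = result.astype({"Count_y": "Int64", "Status_y": "Int64"})
print(result)
```

The _merge column is handy for checking which IDs had no match; it can be dropped afterwards with result.drop(columns="_merge").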

df1.merge(df2, how='outer', on='UserMasterId')?