Python: merging DataFrames

python pandas dataframe

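I have several DataFrames that I want to merge, but they do not all have the same columns, and I only want to merge specific rows. Here is an example to make this clearer:

MAIN_DF, into which I want everything merged: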

key    A    B    C
0001   1    0    0
0002   1    1    1
0003   0    0    1

DF_1:

key    A    B    C   D
0001   1    0    0   1
0003   0    0    1   0
0004   1    1    1   1

DF_2:

key    C    D    E   F
0004   1    1    0   1
0005   0    0    1   0
0006   1    1    1   1

I want to merge all of these into MAIN_DF, so that MAIN_DF becomes:

key    A    B    C    D    E   F
0001   1    0    0    1    0   0
0002   1    1    1    0    0   0
0003   0    0    1    0    0   0
0004   0    0    0    1    0   1
0005   0    0    0    0    1   0
0006   0    0    0    1    1   1

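Note that existing columns were updated and new rows were added.

Is there a way to do this with pandas, without long, slow loops and if statements?

Thanks.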

You can use concat to concatenate any number of DataFrames horizontally:

import pandas as pd
df = pd.concat([df1,df2], axis=1, verify_integrity=True)

The verify_integrity parameter checks the concatenated axis for duplicate labels and raises a ValueError if it finds any.
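As a rough illustration of what that call does with frames like the ones in the question, here is a minimal sketch. The names main_df and df_1 and their contents are assumptions modelled on MAIN_DF and DF_1, with key used as the index. With axis=1 the concatenated axis is the columns, so the shared labels A, B and C are what verify_integrity reacts to.

```python
import pandas as pd

# Hypothetical frames modelled on MAIN_DF and DF_1 from the question
main_df = pd.DataFrame(
    {"A": [1, 1, 0], "B": [0, 1, 0], "C": [0, 1, 1]},
    index=["0001", "0002", "0003"],
)
df_1 = pd.DataFrame(
    {"A": [1, 0, 1], "B": [0, 0, 1], "C": [0, 1, 1], "D": [1, 0, 1]},
    index=["0001", "0003", "0004"],
)

# Without the check: rows are aligned on the index and both copies
# of A, B and C are kept side by side
wide = pd.concat([main_df, df_1], axis=1)
print(list(wide.columns))

# With the check: the duplicated column labels raise a ValueError
try:
    pd.concat([main_df, df_1], axis=1, verify_integrity=True)
    raised = False
except ValueError:
    raised = True
print(raised)
```

So for frames that share column names, horizontal concatenation on its own either duplicates those columns or fails the integrity check; it does not fold the overlapping values together.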

I think you need:

Here is an approach using groupby:

import pandas as pd

df1 = pd.DataFrame([[1, 0, 0],
                    [1, 1, 1],
                    [0, 0, 1]],    columns=['a', 'b', 'c'],      index=[1, 2, 3])
df2 = pd.DataFrame([[1, 0, 0, 1],
                    [0, 0, 1, 0],
                    [1, 1, 1, 1]], columns=['a', 'b', 'c', 'd'], index=[1, 3, 4])
df3 = pd.DataFrame([[1, 1, 0, 1],
                    [0, 0, 1, 0],
                    [1, 1, 1, 1]], columns=['c', 'd', 'e', 'f'], index=[4, 5, 6])

# combine the first and second df
df4 = pd.concat([df1, df2])
grouped = df4.groupby(level=0)
df5 = grouped.first()

# combine (first and second combined), with the third
df6 = pd.concat([df5, df3])
grouped = df6.groupby(level=0)
df7 = grouped.first()

# fill NaN values with 0 and convert back to integers
df7 = df7.fillna(0).astype(int)

print(df7)

    a   b   c   d   e   f
1   1   0   0   1   0   0
2   1   1   1   0   0   0
3   0   0   1   0   0   0
4   1   1   1   1   0   1
5   0   0   0   0   1   0
6   0   0   1   1   1   1
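The two concat/groupby passes above can probably be collapsed into one, since first() takes the first non-null value per index label and column across all stacked rows. The following is a minimal sketch under that assumption; the names frames and merged are illustrative and not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 0, 0],
                    [1, 1, 1],
                    [0, 0, 1]], columns=['a', 'b', 'c'], index=[1, 2, 3])
df2 = pd.DataFrame([[1, 0, 0, 1],
                    [0, 0, 1, 0],
                    [1, 1, 1, 1]], columns=['a', 'b', 'c', 'd'], index=[1, 3, 4])
df3 = pd.DataFrame([[1, 1, 0, 1],
                    [0, 0, 1, 0],
                    [1, 1, 1, 1]], columns=['c', 'd', 'e', 'f'], index=[4, 5, 6])

# Earlier frames in this list take precedence for overlapping cells
frames = [df1, df2, df3]

merged = (
    pd.concat(frames)      # stack all rows; missing columns become NaN
      .groupby(level=0)    # group rows that share an index label
      .first()             # first non-null value per column in each group
      .fillna(0)           # cells present in no frame become 0
      .astype(int)         # restore integer dtype after the NaN round trip
)
print(merged)
```

Because the list order decides which frame wins on overlap, adding more DataFrames only means appending them to frames. Chaining df1.combine_first(df2).combine_first(df3) is another pandas route that should give the same precedence.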

Yes, but note that I don't want duplicate rows.

Are the rows in your expected output correct? The three bottom-left cells should read [[1, 1, 1], [0, 0, 0], [0, 0, 1]] from top to bottom.