Python: adding row values to a DataFrame based on matching column labels

python python-3.x pandas dataframe

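I'm trying to solve the following problem. I have three DataFrames, and I want to merge (join?) two of them based on the values in the third. Here are the DataFrames:

df1:

df2:

The columns of df1 and df2 are different, but the relationship between them is given in df3.

df3:

I want to combine the data from df1 and df2 while keeping the same columns as df1 (because b1, b2, b3 are referenced by a1, a2, a3, a4 and a5). Here is df4, the DataFrame I want:

df4: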

Thanks a lot.

First, unpivot df2 into long form (the result is called df2_melt below):
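A minimal sketch of the unpivot step, assuming df2 is wide with identifier columns index and fields and value columns b1 to b3. The sample values are illustrative only; the names product2 and value are chosen to match the columns used in the merge and pivot steps.

```python
import pandas as pd

# Hypothetical df2: one "clients" row per date, wide columns b1..b3.
df2 = pd.DataFrame({
    "index": ["2018-06-01", "2018-06-02", "2018-06-03"],
    "fields": ["clients"] * 3,
    "b1": [1, 1, 1],
    "b2": [2, 2, 2],
    "b3": [3, 3, 3],
})

# Unpivot: keep "index" and "fields" as identifiers and turn the
# b1..b3 columns into (product2, value) rows.
df2_melt = df2.melt(id_vars=["index", "fields"],
                    var_name="product2", value_name="value")
print(df2_melt)
```

Each original row becomes three rows here, one per b-column, which is the shape a merge on product2 needs.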

Drop the redundant column index from the reference table df3 and merge it with the unpivoted df2:

merged = pd.merge(df2_melt, df3.drop("index", axis=1), on="product2")\
    .drop("product2", axis=1)
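To see what the merge does, here is a small sketch on a single date. The contents of df2_melt and df3 are assumptions made for illustration (in particular the a-to-b mapping); only the merge call itself comes from the answer.

```python
import pandas as pd

# Hypothetical long-form df2 (output of the unpivot step), one date only.
df2_melt = pd.DataFrame({
    "index": ["2018-06-01"] * 3,
    "fields": ["clients"] * 3,
    "product2": ["b1", "b2", "b3"],
    "value": [1, 2, 3],
})

# Hypothetical mapping table; its own "index" column is the redundant one.
df3 = pd.DataFrame({
    "index": [0, 1, 2, 3, 4],
    "product1": ["a1", "a2", "a3", "a4", "a5"],
    "product2": ["b1", "b1", "b2", "b2", "b3"],
})

# Each b-row is repeated once for every a-column that refers to it.
merged = (
    pd.merge(df2_melt, df3.drop("index", axis=1), on="product2")
    .drop("product2", axis=1)
)
print(merged)
```

Dropping df3's index column first matters: if it stayed, the merge would see two columns named index and rename them index_x and index_y.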

Build the new rows by pivoting the merged result:

new_rows = pd.pivot_table(merged, index=["index", "fields"],
                          columns="product1", values="value")\
    .reset_index()
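A sketch of the pivot step in isolation, assuming a merged frame with one row per (index, fields, product1) combination; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical merged frame for a single date.
merged = pd.DataFrame({
    "index": ["2018-06-01"] * 5,
    "fields": ["clients"] * 5,
    "value": [1, 1, 2, 2, 3],
    "product1": ["a1", "a2", "a3", "a4", "a5"],
})

# Spread product1 back into columns; reset_index turns the
# (index, fields) row labels back into ordinary columns.
new_rows = pd.pivot_table(
    merged, index=["index", "fields"], columns="product1", values="value"
).reset_index()
print(new_rows)
```

Note that pivot_table aggregates with the mean by default, so the values come back as floats, and duplicate (index, fields, product1) combinations would be averaged rather than raising an error.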

Append the new rows to df1, sort the rows and reset the index:

pd.concat([df1, new_rows]).sort_values("index").reset_index(drop=True)

Result:

product1    index       fields  a1      a2      a3      a4      a5
0           2018-06-01  price   1.1     2.1     3.1     4.1     5.1
1           2018-06-01  amount  15.0    25.0    35.0    45.0    55.0
2           2018-06-01  clients 1.0     1.0     2.0     2.0     3.0
3           2018-06-02  price   1.2     2.2     3.2     4.2     5.2
4           2018-06-02  amount  16.0    26.0    36.0    46.0    56.0
5           2018-06-02  clients 1.0     1.0     2.0     2.0     3.0
6           2018-06-03  price   1.3     2.3     3.3     4.3     5.3
7           2018-06-03  amount  17.0    27.0    37.0    47.0    57.0
8           2018-06-03  clients 1.0     1.0     2.0     2.0     3.0
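Putting the steps together, the following is a runnable sketch of the whole pipeline. The three input frames are reconstructed from the result table and are assumptions about what df1, df2 and df3 look like, not the asker's actual data. It also passes kind="stable" to sort_values, which is an addition to the answer's code: it keeps the price, amount, clients order within each date, something the default sort does not guarantee.

```python
import pandas as pd

dates = ["2018-06-01", "2018-06-02", "2018-06-03"]

# Assumed df1: price and amount rows with columns a1..a5.
df1 = pd.DataFrame({
    "index": [d for d in dates for _ in range(2)],
    "fields": ["price", "amount"] * 3,
    "a1": [1.1, 15, 1.2, 16, 1.3, 17],
    "a2": [2.1, 25, 2.2, 26, 2.3, 27],
    "a3": [3.1, 35, 3.2, 36, 3.3, 37],
    "a4": [4.1, 45, 4.2, 46, 4.3, 47],
    "a5": [5.1, 55, 5.2, 56, 5.3, 57],
})

# Assumed df2: clients rows with columns b1..b3.
df2 = pd.DataFrame({"index": dates, "fields": "clients",
                    "b1": 1, "b2": 2, "b3": 3})

# Assumed df3: mapping from a-columns (product1) to b-columns (product2).
df3 = pd.DataFrame({
    "index": range(5),
    "product1": ["a1", "a2", "a3", "a4", "a5"],
    "product2": ["b1", "b1", "b2", "b2", "b3"],
})

# 1. Unpivot df2 into long form.
df2_melt = df2.melt(id_vars=["index", "fields"],
                    var_name="product2", value_name="value")

# 2. Attach the matching a-column to every b-value.
merged = (
    pd.merge(df2_melt, df3.drop("index", axis=1), on="product2")
    .drop("product2", axis=1)
)

# 3. Pivot back to one row per (index, fields) with columns a1..a5.
new_rows = pd.pivot_table(
    merged, index=["index", "fields"], columns="product1", values="value"
).reset_index()

# 4. Append to df1; a stable sort keeps df1's rows ahead of the new ones.
df4 = (
    pd.concat([df1, new_rows])
    .sort_values("index", kind="stable")
    .reset_index(drop=True)
)
print(df4)
```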

If you rename the columns of df2:

df2 = df2.rename(columns={'b1':'a1', 'b2':'a2', 'b3':'a3'})

then you can do a simple concat:

fields = [df1, df2]
df4 = pd.concat(fields)

You will get the desired df4.
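A minimal sketch of the rename-then-concat approach on one row per frame, with made-up values. It is worth checking which columns end up populated for the rows that come from df2.

```python
import pandas as pd

# Hypothetical single-row frames, for illustration only.
df1 = pd.DataFrame({"index": ["2018-06-01"], "fields": ["price"],
                    "a1": [1.1], "a2": [2.1], "a3": [3.1],
                    "a4": [4.1], "a5": [5.1]})
df2 = pd.DataFrame({"index": ["2018-06-01"], "fields": ["clients"],
                    "b1": [1], "b2": [2], "b3": [3]})

# Rename b-columns to a-columns, then stack the frames.
df2 = df2.rename(columns={"b1": "a1", "b2": "a2", "b3": "a3"})
df4 = pd.concat([df1, df2], ignore_index=True)

# Columns missing from df2 are filled with NaN by concat.
print(df4)
print(df4.loc[1, ["a4", "a5"]].isna().all())
```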

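However, after the rename df2 only has a1-a3, while df4 has columns a1-a5, so the rows coming from df2 will have NaN for a4 and a5 unless you create those columns somehow. You can do this as follows: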

df2['a4'] = df2['a1']

... etc.
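Rather than copying columns one at a time, the a-columns could be built in one pass from a mapping. This is a sketch of an alternative, not the answerer's code, and the mapping dictionary is an assumption about which a-column refers to which b-column.

```python
import pandas as pd

# Hypothetical df2 with columns b1..b3.
df2 = pd.DataFrame({
    "index": ["2018-06-01", "2018-06-02", "2018-06-03"],
    "fields": ["clients"] * 3,
    "b1": [1, 1, 1],
    "b2": [2, 2, 2],
    "b3": [3, 3, 3],
})

# Assumed relationship between a-columns and b-columns.
mapping = {"a1": "b1", "a2": "b1", "a3": "b2", "a4": "b2", "a5": "b3"}

# Create every a-column from the b-column it refers to.
df2_as_a = df2[["index", "fields"]].assign(
    **{a: df2[b] for a, b in mapping.items()}
)
print(df2_as_a)
```

Because the a-columns are created from the original b-columns, there is no risk of a renamed column being overwritten before it is copied, and the result can be passed to pd.concat together with df1 without producing NaN.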

Have you tried .join() or .merge()? What does your code look like?
