Python: merging DataFrames of different lengths

python dataframe merge

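I am merging two DataFrames of different lengths with the following code:

df1 = pd.merge(df1, df2, on='OFFERING_ID', how='left')

Before the merge there are 400 0000 rows; after the merge there are 600000.

How can I fix this?

Thanks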

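The problem is not the length; it is OFFERING_ID.

In short, OFFERING_ID is not unique in the second DataFrame. Each OFFERING_ID in df1 can therefore match several rows in df2, and a left merge returns one output row per match, so the result has more rows than the original.

I made an example, and the code is also pasted below: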
import pandas as pd

df1 = pd.DataFrame(
    [
        {"OFFERING_ID": 1, "another_field": "whatever"},
        {"OFFERING_ID": 2, "another_field": "whatever"},
        {"OFFERING_ID": 3, "another_field": "whatever"},
        {"OFFERING_ID": 4, "another_field": "whatever"},
    ]
)

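# OFFERING_ID 1 appears three times, so the key is not unique in df2.
# All keys are integers so that they compare equal to the keys in df1.
df2 = pd.DataFrame(
    [
        {"OFFERING_ID": 1, "another_field": "whatever"},
        {"OFFERING_ID": 1, "another_field": "whatever"},
        {"OFFERING_ID": 1, "another_field": "whatever"},
    ]
)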

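# Shapes of the two inputs before merging
print(df1.shape)
print(df2.shape)
# The left merge has more rows than df1, because OFFERING_ID 1
# matches more than one row in df2
print(pd.merge(df1, df2, on="OFFERING_ID", how="left").shape)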
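One way to confirm this diagnosis and keep the row count unchanged is sketched below. It assumes that keeping a single df2 row per OFFERING_ID (here, the first one) is acceptable, which may not be true for the real data; the sample frames and the `extra_field` column are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: df2 repeats OFFERING_ID 1 three times.
df1 = pd.DataFrame({"OFFERING_ID": [1, 2, 3, 4], "another_field": ["a", "b", "c", "d"]})
df2 = pd.DataFrame({"OFFERING_ID": [1, 1, 1], "extra_field": ["x", "y", "z"]})

# Count how often each key occurs in df2; any count above 1 multiplies rows in the merge.
dup_counts = df2["OFFERING_ID"].value_counts()
print(dup_counts[dup_counts > 1])

# Assumption: keeping only the first row per key is acceptable here.
df2_unique = df2.drop_duplicates(subset="OFFERING_ID", keep="first")

# validate="many_to_one" makes pandas raise a MergeError if the right-hand keys
# are still duplicated, so a row explosion cannot go unnoticed.
merged = pd.merge(df1, df2_unique, on="OFFERING_ID", how="left", validate="many_to_one")
print(merged.shape)
```

If every matching row from df2 is actually needed, the larger row count is the correct result of a left merge, and df2 would have to be aggregated per OFFERING_ID (for example with `groupby`) rather than deduplicated.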

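offering_id_dfs = []
# Handle one OFFERING_ID at a time
for offering_id in df1.OFFERING_ID.unique():
    # Rows for this ID from each frame, re-indexed from 0 so they align side by side
    sub_df1 = df1.loc[df1.OFFERING_ID == offering_id, :].reset_index(drop=True)
    sub_df2 = df2.loc[df2.OFFERING_ID == offering_id, :].reset_index(drop=True)
    # Place the two subsets next to each other, column-wise
    concat_df = pd.concat([sub_df1, sub_df2], axis=1)
    # Fill the ID into rows that exist in only one of the subsets
    concat_df["OFFERING_ID"] = offering_id
    offering_id_dfs.append(concat_df)
# Stack the per-ID pieces into one frame
df3 = pd.concat(offering_id_dfs).reset_index(drop=True)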

This may work as long as each DataFrame contains only one column besides OFFERING_ID, and every value in df2.OFFERING_ID.unique() is also in the set of df1.OFFERING_ID.unique().
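The second condition can be checked before running the loop. A minimal sketch with made-up frames (the `field_a` and `field_b` column names are hypothetical):

```python
import pandas as pd

# Hypothetical sample data: OFFERING_ID 5 exists in df2 but not in df1.
df1 = pd.DataFrame({"OFFERING_ID": [1, 2, 3, 4], "field_a": ["a", "b", "c", "d"]})
df2 = pd.DataFrame({"OFFERING_ID": [1, 1, 5], "field_b": ["x", "y", "z"]})

# Keys present in df2 but absent from df1. A loop over df1's keys never
# selects these rows, so they would be left out of the result.
missing_keys = set(df2["OFFERING_ID"].tolist()) - set(df1["OFFERING_ID"].tolist())
print(missing_keys)

# The same check as a single boolean: True only if every df2 key is in df1.
all_present = bool(df2["OFFERING_ID"].isin(df1["OFFERING_ID"]).all())
print(all_present)
```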
Don't you need an indicator=True at the end: df1 = pd.merge(df1, df2, on='OFFERING_ID', how='left', indicator=True)?

Can you post your desired output?
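On the indicator=True suggestion: that option adds a `_merge` column recording whether each row was found in the left frame only or in both frames. It does not change how many rows the merge returns, so it can help inspect the result but would not by itself remove the extra rows. A small sketch with made-up data:

```python
import pandas as pd

# Hypothetical sample data: OFFERING_ID 1 is duplicated in df2, 2 is missing from it.
df1 = pd.DataFrame({"OFFERING_ID": [1, 2], "field_a": ["a", "b"]})
df2 = pd.DataFrame({"OFFERING_ID": [1, 1], "field_b": ["x", "y"]})

plain = pd.merge(df1, df2, on="OFFERING_ID", how="left")
flagged = pd.merge(df1, df2, on="OFFERING_ID", how="left", indicator=True)

# Both merges return the same number of rows; only the _merge column is new.
print(len(plain), len(flagged))
print(flagged["_merge"].value_counts())
```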