Python: merging on DataFrames with multiple values per key


I have the following DataFrames:

import pandas as pd

_data_orig = [
    [1, 3.2],
    [3, 3.9],
    [4, 1.2],
    [5, 2.2]
]
_columns1 = ["ID", "GPA"]

_data_new = [
    [1, "Bob"],
    [2, "Sam"],
    [3, "Jane"],
    [3, "Sanoj"]
]
_columns2 = ["ID", "Name"]

df_orig = pd.DataFrame(data=_data_orig, columns=_columns1)
df_new = pd.DataFrame(data=_data_new, columns=_columns2)
When I do this:

df_merge = pd.merge(df_orig, df_new, how='left')
I get:

    ID  GPA Name
0   1   3.2 Bob
1   3   3.9 Jane
2   3   3.9 Sanoj
3   4   1.2 NaN
4   5   2.2 NaN
You can see that ID 3 is repeated. I would like the result in the following format instead, so that ID 3 from df_orig is not duplicated:

   ID  GPA  Name  Name_1
0   1  3.2   Bob
1   3  3.9  Jane   Sanoj
2   4  1.2   NaN
4   5  2.2   NaN
Try this:

Let's create the following helper DataFrame:

In [279]: x = (df_new.groupby('ID')['Name']
     ...:            .apply(';'.join)
     ...:            .str.split(';', expand=True)
     ...:            .add_prefix('Name_')
     ...:            .reset_index())
     ...:

In [280]: x
Out[280]:
   ID Name_0 Name_1
0   1    Bob   None
1   2    Sam   None
2   3   Jane  Sanoj
Now we can simply merge it with the df_orig DataFrame:

In [281]: pd.merge(df_orig, x, how='left').fillna('')
     ...:
Out[281]:
   ID  GPA Name_0 Name_1
0   1  3.2    Bob
1   3  3.9   Jane  Sanoj
2   4  1.2
3   5  2.2
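
Note that the join/split round trip above relies on ';' never appearing inside a name. As a minimal sketch of the same idea without a delimiter (assuming the df_orig and df_new from the question; the variable names names, wide and result are only illustrative), the names can be collected into lists and expanded into columns directly:

```python
import pandas as pd

df_orig = pd.DataFrame([[1, 3.2], [3, 3.9], [4, 1.2], [5, 2.2]],
                       columns=["ID", "GPA"])
df_new = pd.DataFrame([[1, "Bob"], [2, "Sam"], [3, "Jane"], [3, "Sanoj"]],
                      columns=["ID", "Name"])

# One list of names per ID (a Series indexed by ID).
names = df_new.groupby("ID")["Name"].apply(list)

# Expand the lists into columns; shorter lists are padded with missing values.
wide = (pd.DataFrame(names.tolist(), index=names.index)
          .add_prefix("Name_")
          .reset_index())

# Left merge keeps exactly one row per ID in df_orig.
result = pd.merge(df_orig, wide, on="ID", how="left").fillna("")
print(result)
```

As in the answer above, fillna('') is only cosmetic; drop it if missing values should stay as NaN.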

Consider a pivot based on groupby().cumcount(), followed by a merge:

df_new['IDcount'] = "Name_" + (df_new.groupby("ID").cumcount() + 1).astype(str)
df_wide = df_new.pivot(index="ID", columns="IDcount", values="Name").reset_index()

df_merge = pd.merge(df_orig, df_wide, on='ID', how='left')

#    ID  GPA Name_1 Name_2
# 0   1  3.2    Bob   None
# 1   3  3.9   Jane  Sanoj
# 2   4  1.2    NaN    NaN
# 3   5  2.2    NaN    NaN
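
The snippet above adds an IDcount column to df_new in place. A possible variant, sketched here under the assumption that the same df_orig and df_new are used, wraps the cumcount step in a small function (widen is a hypothetical helper name, not part of pandas) and uses set_index plus unstack so that df_new is left untouched. Passing validate="one_to_one" to pd.merge makes pandas raise an error if either side still has duplicate IDs:

```python
import pandas as pd

def widen(df, key, value):
    """Hypothetical helper: one row per key, with columns value_1..value_n."""
    # Position of each row within its key group: 1, 2, ...
    pos = df.groupby(key).cumcount() + 1
    # Move the position into the columns without modifying df.
    wide = df.set_index([key, pos])[value].unstack()
    wide.columns = [f"{value}_{i}" for i in wide.columns]
    return wide.reset_index()

df_orig = pd.DataFrame([[1, 3.2], [3, 3.9], [4, 1.2], [5, 2.2]],
                       columns=["ID", "GPA"])
df_new = pd.DataFrame([[1, "Bob"], [2, "Sam"], [3, "Jane"], [3, "Sanoj"]],
                      columns=["ID", "Name"])

df_wide = widen(df_new, "ID", "Name")

# validate="one_to_one" checks that ID is unique on both sides of the merge.
df_merge = pd.merge(df_orig, df_wide, on="ID", how="left",
                    validate="one_to_one")
print(df_merge)
```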

Thanks, that worked very smoothly.
@Sanoj, glad it could help.