Python: merging/combining two datasets with duplicate names (Python 3.x, pandas)


I am trying to merge two datasets (DataFrames) defined as follows:

D1 = pd.DataFrame({'Village':['Ampil','Ampil','Ampil','Bachey','Bachey','Center','Center','Center','Center'], 'Code':[123,324,190,453,321,786,456,234,987]})

D2 = pd.DataFrame({'Village':['Ampil','Ampil','Bachey','Bachey','Center','Center'],'Lat':[11.563,13.278,12.637,11.356,12.736,13.456], 'Long':[102.234,103.432,105.673,103.539,103.873,102.983]})
I want to merge the two on the Village column, and I would like the output to look like this:

D3 = pd.DataFrame({'Village': ['Ampil','Ampil','Bachey','Bachey','Center','Center'],'Code':[123,324,453,321,786,456],'Lat':[11.563,13.278,12.637,11.356,12.736,13.456], 'Long':[102.234,103.432,105.673,103.539,103.873,102.983]})

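I have tried joining, merging, and concatenating, but none of them does what I need. I need code that will also work on larger data. Any help would be much appreciated.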

One approach is to first create a running cumcount per 'Village' in each of your two initial DataFrames, then merge on both 'Village' and 'count':

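# Number the rows within each Village (0, 1, 2, ...) so duplicates can be paired by position.
D1['count'] = D1.groupby('Village').cumcount()
D2['count'] = D2.groupby('Village').cumcount()

# Left-merge from D2 so only the rows present in D2 are kept, then drop the helper column.
print(D2.merge(D1, on=['Village', 'count'], how='left').drop('count', axis=1))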

Output:
  Village     Lat     Long  Code
0   Ampil  11.563  102.234   123
1   Ampil  13.278  103.432   324
2  Bachey  12.637  105.673   453
3  Bachey  11.356  103.539   321
4  Center  12.736  103.873   786
5  Center  13.456  102.983   456

You can compute count = groupby('Village').cumcount() on each DataFrame, then merge on 'Village' and 'count'.
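
For larger data, the same idea could be wrapped in a small helper. The sketch below is one possible way to do it, not the answerers' own code: the function name merge_by_position and the helper column name '_pos' are hypothetical, and it assumes that no real column is already called '_pos'. It uses assign so that the input DataFrames are not modified.

```python
import pandas as pd


def merge_by_position(left, right, key):
    """Pair the n-th row of each `key` group in `left` with the n-th row
    of the same group in `right` (left join)."""
    helper = '_pos'  # hypothetical helper column; assumed not to clash with real columns
    left_keyed = left.assign(**{helper: left.groupby(key).cumcount()})
    right_keyed = right.assign(**{helper: right.groupby(key).cumcount()})
    merged = left_keyed.merge(right_keyed, on=[key, helper], how='left')
    return merged.drop(columns=helper)


D1 = pd.DataFrame({'Village': ['Ampil', 'Ampil', 'Ampil', 'Bachey', 'Bachey',
                               'Center', 'Center', 'Center', 'Center'],
                   'Code': [123, 324, 190, 453, 321, 786, 456, 234, 987]})
D2 = pd.DataFrame({'Village': ['Ampil', 'Ampil', 'Bachey', 'Bachey', 'Center', 'Center'],
                   'Lat': [11.563, 13.278, 12.637, 11.356, 12.736, 13.456],
                   'Long': [102.234, 103.432, 105.673, 103.539, 103.873, 102.983]})

# Keep D2's rows and attach the matching Code from D1, then order the
# columns as in the desired D3 from the question.
result = merge_by_position(D2, D1, 'Village')[['Village', 'Code', 'Lat', 'Long']]
print(result)
```

Note that the pairing depends entirely on row order within each village: the first 'Ampil' row of one DataFrame is matched with the first 'Ampil' row of the other, and so on. If the two DataFrames are not sorted consistently, sort them before calling the helper.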