Python: merge another DataFrame into existing rows
I have two DataFrames, df and subs, as shown below:
import numpy as np
import pandas as pd
df = pd.DataFrame({"scode": [11, 22, 33, 44], "sname": ["aa", "bb", "cc", "dd"], "sub1": ["London", np.nan, "Delhi", np.nan], "sub2": [np.nan, np.nan, "Sydney", np.nan]})
scode sname sub1 sub2
0 11 aa London NaN
1 22 bb NaN NaN
2 33 cc Delhi Sydney
3 44 dd NaN NaN
subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})
0 1 2
0 22 Milford Sound Oslo
1 44 Queenstown NaN
How can I merge the two DataFrames so that I end up with the following result:
scode sname sub1 sub2
0 11 aa London NaN
1 22 bb Milford Sound Oslo
2 33 cc Delhi Sydney
3 44 dd Queenstown NaN
First, let's make the column names match:
newSub = subs.rename(columns={0:'scode', 1:'sub1', 2:'sub2'})
Next, the DataFrame update method does what you want, matching source and target rows on their common index. So let's set the index to scode on both frames:
indexedDF = df.set_index('scode')
indexedNewSub = newSub.set_index('scode')
Finally, call indexedDF's update method to update it in place:
indexedDF.update(indexedNewSub)
indexedDF
indexedDF should now have subs merged in as requested.
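Put together, the steps above can be sketched as one self-contained script. The snake_case names (new_sub, indexed_df, result) are illustrative renamings of the variables in this answer, and the final reset_index call is an added assumption to get scode back as a regular column, as in the desired output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "scode": [11, 22, 33, 44],
    "sname": ["aa", "bb", "cc", "dd"],
    "sub1": ["London", np.nan, "Delhi", np.nan],
    "sub2": [np.nan, np.nan, "Sydney", np.nan],
})
subs = pd.DataFrame({
    0: [22, 44],
    1: ["Milford Sound", "Queenstown"],
    2: ["Oslo", np.nan],
})

# Give subs the same column names as df.
new_sub = subs.rename(columns={0: "scode", 1: "sub1", 2: "sub2"})

# update() aligns rows on the index, so index both frames by scode.
indexed_df = df.set_index("scode")
indexed_df.update(new_sub.set_index("scode"))  # modifies indexed_df in place

# Assumption: restore scode as an ordinary column for the final result.
result = indexed_df.reset_index()
print(result)
```

Note that update only copies non-NA values from the other frame, so the NaN in subs for scode 44 leaves the existing sub2 value untouched.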
pandas aligns on index and columns automatically; just make sure the correct index is set, assuming scode is the key you want to merge on:
In [5]: df = pd.DataFrame({"scode": [11, 22, 33, 44], "sname": ["aa", "bb", "cc", "dd"], "sub1": ["London", np.nan, "Delhi", np.nan], "sub2": [np.nan, np.nan, "Sydney", np.nan]})
In [6]: df.set_index('scode',inplace=True)
In [7]: subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})
...:
In [8]: subs.set_index(0, inplace=True)
In [9]: subs.columns=['sub1','sub2']
This gives you something like:
In [10]: df
Out[10]:
sname sub1 sub2
scode
11 aa London NaN
22 bb NaN NaN
33 cc Delhi Sydney
44 dd NaN NaN
In [11]: subs
Out[11]:
sub1 sub2
0
22 Milford Sound Oslo
44 Queenstown NaN
Now just do a plain assignment, selecting the appropriate rows and columns:
In [12]: df.loc[subs.index.values,['sub1', 'sub2']] = subs
In [13]: df
Out[13]:
sname sub1 sub2
scode
11 aa London NaN
22 bb Milford Sound Oslo
33 cc Delhi Sydney
44 dd Queenstown NaN
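For anyone who wants to run this outside an IPython session, here is a minimal script version of the same label-aligned assignment. It avoids inplace=True and uses the name result for the final frame; both are stylistic choices made for this sketch, not part of the session above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "scode": [11, 22, 33, 44],
    "sname": ["aa", "bb", "cc", "dd"],
    "sub1": ["London", np.nan, "Delhi", np.nan],
    "sub2": [np.nan, np.nan, "Sydney", np.nan],
}).set_index("scode")

subs = pd.DataFrame({
    0: [22, 44],
    1: ["Milford Sound", "Queenstown"],
    2: ["Oslo", np.nan],
}).set_index(0)
subs.columns = ["sub1", "sub2"]

# Assigning a DataFrame through .loc aligns on row and column labels,
# so only the rows for scode 22 and 44 are overwritten.
df.loc[subs.index, ["sub1", "sub2"]] = subs

result = df.reset_index()
print(result)
```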
You can always reset the index you set earlier:
In [14]: df.reset_index(inplace=True)
In [15]: df
Out[15]:
scode sname sub1 sub2
0 11 aa London NaN
1 22 bb Milford Sound Oslo
2 33 cc Delhi Sydney
3 44 dd Queenstown NaN
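One difference between the update approach and the .loc assignment is worth keeping in mind: update skips NaN values in the incoming frame, while .loc assignment writes them. With the data in the question both give the same result, because the cells being overwritten are already NaN. The small frames below (base and patch, hypothetical data made up for this sketch) show where the two diverge.

```python
import numpy as np
import pandas as pd

# Hypothetical data: every cell in base already holds a value.
base = pd.DataFrame(
    {"sub1": ["London", "Delhi"], "sub2": ["Paris", "Sydney"]},
    index=[11, 33],
)
patch = pd.DataFrame(
    {"sub1": ["Oslo", "Lima"], "sub2": [np.nan, "Quito"]},
    index=[11, 33],
)

# update() only copies non-NA values, so "Paris" survives.
via_update = base.copy()
via_update.update(patch)

# .loc assignment copies everything, so "Paris" is replaced by NaN.
via_loc = base.copy()
via_loc.loc[patch.index, ["sub1", "sub2"]] = patch

print(via_update)
print(via_loc)
```

Pick update when a NaN in subs should mean "leave the existing value alone", and .loc assignment when subs should fully replace the matching rows.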