Python: merging another DataFrame into existing rows

I have two DataFrames, df and subs, as shown below:

df = pd.DataFrame({"scode": [11, 22, 33, 44], "sname": ["aa", "bb", "cc", "dd"], "sub1": [ "London", np.nan, "Delhi", np.nan], "sub2": [np.nan, np.nan, "Sydney", np.nan]})

   scode sname  sub1    sub2
0   11   aa     London  NaN
1   22   bb     NaN     NaN
2   33   cc     Delhi   Sydney
3   44   dd     NaN     NaN

subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})

    0   1               2
0   22  Milford Sound   Oslo
1   44  Queenstown      NaN
How can I merge the two DataFrames so that I end up with the following result:

    scode   sname   sub1            sub2
0   11      aa      London          NaN
1   22      bb      Milford Sound   Oslo
2   33      cc      Delhi           Sydney
3   44      dd      Queenstown      NaN

First, let's make your column names match:

newSub = subs.rename(columns={0:'scode', 1:'sub1', 2:'sub2'})
Next, the DataFrame update method does what you want, matching source and target rows on their shared index. So let's set the index to scode on both:

indexedDF     = df.set_index('scode')
indexedNewSub = newSub.set_index('scode')
Finally, call indexedDF's update method to update it in place:

indexedDF.update(indexedNewSub)

indexedDF should now have subs merged into it, as requested.
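Put together, the steps above can be sketched as one runnable script. This is a minimal sketch using the question's data; the snake_case names and the final reset_index call (to turn scode back into a column, as in the desired output) are additions for illustration, not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "scode": [11, 22, 33, 44],
    "sname": ["aa", "bb", "cc", "dd"],
    "sub1": ["London", np.nan, "Delhi", np.nan],
    "sub2": [np.nan, np.nan, "Sydney", np.nan],
})
subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})

# Give subs the same column names as df so update() can match them.
new_sub = subs.rename(columns={0: "scode", 1: "sub1", 2: "sub2"})

# update() aligns rows on the index, so index both frames by scode.
indexed_df = df.set_index("scode")
indexed_df.update(new_sub.set_index("scode"))  # modifies indexed_df in place

# Assumption: the caller wants scode back as an ordinary column.
result = indexed_df.reset_index()
print(result)
```

Note that update only copies non-NA values from the other frame, so the NaN in sub2 for scode 44 leaves the existing value untouched rather than overwriting it.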

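Pandas automatically aligns on index and columns, so you only need to make sure the right index is set. Assuming scode is the key you want to merge on: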

In [5]: df = pd.DataFrame({"scode": [11, 22, 33, 44], "sname": ["aa", "bb", "cc", "dd"], "sub1": [ "London", np.nan, "Delhi", np.nan], "sub2": [np.nan, np.nan, "Sydney", np.nan]})

In [6]: df.set_index('scode',inplace=True)

In [7]: subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})
    ...:

In [8]: subs.set_index(0, inplace=True)

In [9]: subs.columns=['sub1','sub2']
This gives you something like:

In [10]: df
Out[10]:
      sname    sub1    sub2
scode
11       aa  London     NaN
22       bb     NaN     NaN
33       cc   Delhi  Sydney
44       dd     NaN     NaN

In [11]: subs
Out[11]:
             sub1  sub2
0
22  Milford Sound  Oslo
44     Queenstown   NaN
Now just do an ordinary assignment, selecting the appropriate rows and columns:

In [12]: df.loc[subs.index.values,['sub1', 'sub2']] = subs

In [13]: df
Out[13]:
      sname           sub1    sub2
scode
11       aa         London     NaN
22       bb  Milford Sound    Oslo
33       cc          Delhi  Sydney
44       dd     Queenstown     NaN
You can always reset the index afterwards to get scode back as a column:

In [14]: df.reset_index(inplace=True)

In [15]: df
Out[15]:
   scode sname           sub1    sub2
0     11    aa         London     NaN
1     22    bb  Milford Sound    Oslo
2     33    cc          Delhi  Sydney
3     44    dd     Queenstown     NaN
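
The two approaches on this page give the same result for the question's data, but they are not equivalent in general: update skips NaN values in the incoming frame, while a plain .loc assignment writes them. The sketch below illustrates this with a hypothetical variation of the data in which row 44 already has a sub2 value ("Auckland" is made up for the example).

```python
import numpy as np
import pandas as pd

# Hypothetical variation: scode 44 already has a sub2 value.
base = pd.DataFrame({
    "scode": [11, 22, 33, 44],
    "sname": ["aa", "bb", "cc", "dd"],
    "sub1": ["London", np.nan, "Delhi", np.nan],
    "sub2": [np.nan, np.nan, "Sydney", "Auckland"],
}).set_index("scode")

subs = pd.DataFrame(
    {"sub1": ["Milford Sound", "Queenstown"], "sub2": ["Oslo", np.nan]},
    index=pd.Index([22, 44], name="scode"),
)

# Approach 1: update() only copies non-NA values from subs.
via_update = base.copy()
via_update.update(subs)

# Approach 2: plain assignment copies everything, including NaN.
via_loc = base.copy()
via_loc.loc[subs.index, ["sub1", "sub2"]] = subs

print(via_update.loc[44, "sub2"])  # existing value is kept
print(via_loc.loc[44, "sub2"])     # existing value is replaced by NaN
```

Which behaviour is preferable depends on whether a NaN in subs means "no new information" or "clear this cell".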