Python: updating table information based on the columns of another table

python pandas dataframe


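I'm new to Python and have two DataFrames: df1 holds information about all students, including their group and score, and df2 holds updated information for the few students whose group and score changed. How can I update the information in df1 based on the values in df2 (group and score)?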

df1

Result

df:3

My code for updating df1 from df2:

import numpy as np
import pandas as pd

# Left-merge the updates; overlapping columns coming from df2 get the '_new' suffix
dfupdated = df1.merge(df2, how='left', on=['student No'], suffixes=('', '_new'))
# Take the new value where one exists, otherwise keep the original
dfupdated['group'] = np.where(pd.notnull(dfupdated['group_new']), dfupdated['group_new'],
                              dfupdated['group'])
dfupdated['score'] = np.where(pd.notnull(dfupdated['score_new']), dfupdated['score_new'],
                              dfupdated['score'])
dfupdated.drop(['group_new', 'score_new'], axis=1, inplace=True)
dfupdated.reset_index(drop=True, inplace=True)

But I get the following error:

KeyError: "['group'] not in index"
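A KeyError of this kind means pandas could not find a column labelled exactly `group`. The post does not show the real column labels, so the following is only a minimal sketch of one way to investigate: the toy frame and its trailing-space label `'group '` are hypothetical, chosen to illustrate one possible cause (stray whitespace in a header), not a diagnosis of the asker's data.

```python
import pandas as pd

# Hypothetical frame: the 'group ' label carries a trailing space (an assumption for illustration)
df1 = pd.DataFrame({
    "student No": [1, 2, 3],
    "group ": ["A", "B", "A"],
    "score": [70, 80, 90],
})

# Printing the labels as a list makes stray whitespace or suffixes visible
print(df1.columns.tolist())

# Check for the exact label before indexing with it
found_before = "group" in df1.columns

# Normalize the labels by stripping surrounding whitespace, then check again
df1.columns = df1.columns.str.strip()
found_after = "group" in df1.columns
```

If the label is present with the exact spelling, the cause lies elsewhere, for example a different capitalization or a suffix added by an earlier merge.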

I'm not sure what went wrong: I ran the same lines and got the expected result. You can also solve it with a different approach.

Try:

dfupdated = df1.merge(df2, on='student No', how='left')
dfupdated['group'] = dfupdated['group_y'].fillna(dfupdated['group_x'])
dfupdated['score'] = dfupdated['score_y'].fillna(dfupdated['score_x'])
dfupdated.drop(['group_x', 'group_y', 'score_x', 'score_y'], axis=1, inplace=True)

This will give you the desired result.

To get the maximum score in each group:

dfupdated.groupby(['group'], sort=False)['score'].max()

Python has no native "data table", and you haven't shown any attempt, which makes your question unanswerable. What database are you using? What have you tried?

@martineau, I updated my question.

@UtpalDutt, Python and DataFrames.

If you print your df, is the group column there? Or does it have a suffix?

Thanks, it works. How can I return the maximum score of each group for dfupdated?

Updated answer, @omerali20.
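The merge-and-fillna approach from the answer, followed by the per-group maximum, can be sketched end to end on toy data. The frames below are hypothetical, since the original df1 and df2 are not shown in the post; only the column names `student No`, `group`, and `score` come from the question.

```python
import pandas as pd

# Hypothetical data: df1 lists all students, df2 lists only the changed ones
df1 = pd.DataFrame({
    "student No": [1, 2, 3, 4],
    "group": ["A", "A", "B", "B"],
    "score": [60, 75, 80, 55],
})
df2 = pd.DataFrame({
    "student No": [2, 4],
    "group": ["B", "A"],
    "score": [90, 65],
})

# Left merge keeps every row of df1; overlapping columns get the default _x/_y suffixes
merged = df1.merge(df2, on="student No", how="left")

# Prefer the value from df2 (_y); fall back to the df1 value (_x) where df2 has none
merged["group"] = merged["group_y"].fillna(merged["group_x"])
merged["score"] = merged["score_y"].fillna(merged["score_x"])

# Drop the helper columns, leaving the same layout as df1
dfupdated = merged.drop(columns=["group_x", "group_y", "score_x", "score_y"])

# Maximum score per group, in order of first appearance (sort=False)
max_per_group = dfupdated.groupby("group", sort=False)["score"].max()
print(dfupdated)
print(max_per_group)
```

Because the left merge introduces missing values for students that are absent from df2, the integer `score` column comes back as floating point; cast it with `astype(int)` if integers are required.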