
Python: How to properly join back to df after groupby


I have a df:

import pandas as pd

df = pd.DataFrame({'CaseNo':[1,1,1,1,2,2,2,2],
                    'Movement_Sequence_No':[1,2,3,4,1,2,3,4],
                    'Movement_Start_Date':['2020-02-09 22:17:00','2020-02-10 17:19:41','2020-02-17 08:04:19',
                                           '2020-02-18 11:22:52','2020-02-12 23:00:00','2020-02-24 10:26:35',
                                           '2020-03-03 17:50:00','2020-03-17 08:24:19'],
                    'Movement_End_Date':['2020-02-10 17:19:41','2020-02-17 08:04:19','2020-02-18 11:22:52',
                                         '2020-02-25 13:55:37','2020-02-24 10:26:35','2020-03-03 17:50:00',
                                         '2222-12-31 23:00:00','2020-03-18 18:50:00'],
                    'Category':['A','A','ICU','A','B','B','B','B'],
                    'RequestDate':['2020-02-10 16:00:00','2020-02-16 13:04:20','2020-02-18 07:11:11','2020-02-21 21:30:30',
                                   '2020-02-13 22:00:00','NA','2020-03-15 09:40:00','2020-03-18 15:10:10'],
                    'Test1':['180','189','190','188','328','NA','266','256'],
                    'Test2':['20','21','15','10','33','30','28','15'],
                    'Test3':['55','NA','65','70','58','64','68','58'],
                    'Age':['65','65','65','65','45','45','45','45']})

After some processing to fill in missing values, I get df2:

import numpy as np

# Format df appropriately: turn the 'NA' strings into real NaN, then cast the numeric columns
df = df.replace('NA', np.nan)
df[['Test1','Test2','Test3','Age']] = df[['Test1','Test2','Test3','Age']].astype(float)

# helper column: 0 for rows before the first ICU row of each case, >= 1 from the ICU row onwards
df["helper"] = df.groupby("CaseNo")["Category"].transform(lambda d: d.eq("ICU").cumsum())

# ffill fills each NaN with the latest previous value within the same case
df2 = df.loc[df["helper"].eq(0)].groupby("CaseNo", as_index=False).fillna(
    method='ffill').reset_index().drop('index', axis=1)

How do I properly merge df2 back into df so that the changes are updated in df? Expected result:


As far as I understand, you can try this after setting up two conditions:

out = df.replace('NA',np.nan)
cond = out['Category'].ne('ICU') & out['RequestDate'].isna()
out = out.groupby('CaseNo',as_index=False).fillna(method='ffill').where(cond,df)


#if you want Test3 in row 2 to be NaN and not 'NA'
#out = out.groupby('CaseNo',as_index=False).fillna(method='ffill').where(cond,out)

display(out)
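Another way to get the filled values back into df, without a separate merge step, is to rely on the row index: a group-wise forward fill returns a result aligned to the original index, so it can be written back directly. The following is only a minimal sketch under assumptions: the frame is a cut-down, hypothetical version of the one in the question, `groupby(...).ffill()` is used in place of `fillna(method='ffill')`, and the pre-ICU mask is the same `helper` idea as in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical, cut-down frame with the same kind of problem as the question.
df = pd.DataFrame({
    "CaseNo":   [1, 1, 1, 1, 2, 2, 2],
    "Category": ["A", "A", "ICU", "A", "B", "B", "B"],
    "Test1":    [180.0, np.nan, np.nan, np.nan, 328.0, np.nan, 266.0],
})

# Forward-fill within each case; the result keeps the original row index.
filled = df.groupby("CaseNo")[["Test1"]].ffill()

# helper == 0 marks the rows before the first ICU row of each case.
helper = df.groupby("CaseNo")["Category"].transform(lambda s: s.eq("ICU").cumsum())
pre_icu = helper.eq(0)

# Take the filled value on pre-ICU rows and keep the original value elsewhere.
out = df.copy()
out["Test1"] = filled["Test1"].where(pre_icu, df["Test1"])
print(out)
```

Because nothing resets the index, `out` has the same rows in the same order as `df`, and only the rows selected by `pre_icu` are changed.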


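Could you explain
cond = out['Category'].ne('ICU') & out['RequestDate'].isna()
? – We create two conditions here: Category is not equal to ICU, and RequestDate is null. After the groupby and ffill, where() takes the filled values only on rows that meet both conditions and keeps the original values everywhere else, so the ICU rows stay as they are. @spidermarn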
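To make the role of the mask concrete, here is a small illustration on a hypothetical three-row frame (the data is invented for this sketch and is not taken from the question). It applies `where` to a single column to show which rows end up with the forward-filled value.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame for a single case.
out = pd.DataFrame({
    "Category":    ["A", "A", "ICU"],
    "RequestDate": ["2020-02-10", np.nan, np.nan],
})

not_icu = out["Category"].ne("ICU")   # True, True, False
no_date = out["RequestDate"].isna()   # False, True, True
cond = not_icu & no_date              # True only for the non-ICU row with a missing date

filled = out["RequestDate"].ffill()

# where(cond, other): keep the filled value where cond is True, else use `other`.
result = filled.where(cond, out["RequestDate"])
print(result)
```

Only the middle row satisfies both conditions, so it is the only one that receives the forward-filled date; the ICU row keeps its missing value.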