Python: How to update a column's values in a DataFrame based on another row?

Suppose I have the following DataFrame:

id action  timestamp           time_difference opened
1  sent    2017-06-29 18:38:03 NaN             NaN
1  clicked 2017-06-29 18:40:03 NaN             NaN
2  sent    2017-06-29 18:38:03 NaN             NaN
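I want the final result to have one row per id, with the second ("clicked") row merged into the preceding "sent" row. The time_difference column should be computed, and opened should be set to 1 if a row with the same id and the "clicked" action is found, otherwise 0: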

id action  timestamp           time_difference opened
1  sent    2017-06-29 18:38:03 00:02:00        1
2  sent    2017-06-29 18:38:03 NaN             0

Create two datasets, one for the "sent" rows and one for the "clicked" rows, merge them on id, and then do the calculations:

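import pandas as pd

# Parse the timestamps so they can be subtracted
df['timestamp'] = pd.to_datetime(df['timestamp'])

# Split into "sent" and "clicked" rows; rename the click timestamp so the names do not clash on merge
df_sent = df.loc[df['action'] == 'sent', ['id', 'action', 'timestamp']]
df_clicked = df.loc[df['action'] == 'clicked', ['id', 'timestamp']].rename(columns={'timestamp': 'ts_clicked'})

# A left merge keeps every "sent" row; ids without a click get NaT in ts_clicked
dfm = df_sent.merge(df_clicked, on='id', how='left')
dfm['time_difference'] = dfm['ts_clicked'] - dfm['timestamp']
dfm['opened'] = dfm['ts_clicked'].notna().astype(int)
dfm = dfm[['id', 'action', 'timestamp', 'time_difference', 'opened']]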
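
To try the approach end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies the same split-and-merge steps. The variable names (sent, clicked, result) are only illustrative, and it assumes each id has at most one "clicked" row; with several clicks per id, the left merge would produce one output row per click.

```python
import pandas as pd

# Sample data copied from the question (time_difference and opened are derived below).
df = pd.DataFrame({
    "id": [1, 1, 2],
    "action": ["sent", "clicked", "sent"],
    "timestamp": [
        "2017-06-29 18:38:03",
        "2017-06-29 18:40:03",
        "2017-06-29 18:38:03",
    ],
})
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Split by action; the click timestamp is renamed so it survives the merge as its own column.
sent = df[df["action"] == "sent"]
clicked = (
    df.loc[df["action"] == "clicked", ["id", "timestamp"]]
    .rename(columns={"timestamp": "ts_clicked"})
)

# Left merge: every sent row is kept, ids without a click get NaT in ts_clicked.
result = sent.merge(clicked, on="id", how="left")
result["time_difference"] = result["ts_clicked"] - result["timestamp"]
result["opened"] = result["ts_clicked"].notna().astype(int)
result = result[["id", "action", "timestamp", "time_difference", "opened"]]

print(result)
```

For id 1 the difference is a two-minute Timedelta and opened is 1; for id 2 there is no click, so time_difference stays missing (pandas shows it as NaT) and opened is 0.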