Python: Reshape a DataFrame by time and turn the related activities into per-day activity counts
I have a DataFrame like the one below and want to convert it into per-day activity counts:
enroll_id time event source
1 2014-12-11 view server
1 2014-12-13 discuss server
1 2014-12-12 view browser
2 2014-12-11 access browser
1 2014-12-14 discuss server
2 2014-12-13 view browser
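For experimenting, the sample rows above can be rebuilt as a small DataFrame. This is only a sketch: the variable name df is chosen to match the code later in the thread, and keeping time as plain strings is an assumption, since the thread does not show the column dtypes.

```python
import pandas as pd

# Sample rows copied from the table above.
# Assumption: `time` is kept as ISO-formatted strings; the original dtype is not shown.
df = pd.DataFrame({
    "enroll_id": [1, 1, 1, 2, 1, 2],
    "time": ["2014-12-11", "2014-12-13", "2014-12-12",
             "2014-12-11", "2014-12-14", "2014-12-13"],
    "event": ["view", "discuss", "view", "access", "discuss", "view"],
    "source": ["server", "server", "browser", "browser", "server", "browser"],
})

print(df)
```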
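I started by grouping by enroll_id. I want to convert it into something like this:
enroll_id view_d1 access_d1 discuss_d1 browser_day1 server_day1 view_d2 access_d2 discuss_d2 browser_day2 server_day2 view_d3 access_d3 discuss_d3 browser_day3 server_day3
1 1 NaN NaN NaN 1 1 NaN NaN
2 2 1 2 NaN
If I understand correctly, the idea is to use DataFrame.melt + DataFrame.pivot_table, together with DataFrame.rename so that the column names are easy to build after pivoting: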
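import pandas as pd

# Rename so that the melted 'variable' values become the suffixes 'd' and 'day',
# and replace each date with a day number (1, 2, ...) in order of first appearance.
df2 = (df.rename(columns={'source': 'day', 'event': 'd'})
         .assign(time=pd.factorize(df['time'])[0] + 1)
         .melt(['enroll_id', 'time']))
# Count the rows for every (value, variable, day) combination per enroll_id.
new_df = (df2.pivot_table(index='enroll_id',
                          columns=['value', 'variable', 'time'],
                          aggfunc='size')
             .sort_index(level=[2, 1, 0],
                         ascending=[True, True, False],
                         axis=1))
# Flatten the three column levels into names such as 'view_d1' or 'server_day1'.
new_df = (new_df.set_axis([f'{x}_{y}{z}' for x, y, z in new_df.columns], axis=1)
                .reset_index())
print(new_df)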
   enroll_id  view_d1  access_d1  server_day1  browser_day1  view_d2  \
0          1      1.0        NaN          1.0           NaN      NaN
1          2      NaN        1.0          NaN           1.0      1.0

   discuss_d2  server_day2  browser_day2  view_d3  browser_day3  discuss_d4  \
0         1.0          1.0           NaN      1.0           1.0         1.0
1         NaN          NaN           1.0      NaN           NaN         NaN

   server_day4
0          1.0
1          NaN
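Note that in the output above the day numbers follow the order in which dates first appear in the data, so 2014-12-13 becomes day 2 and 2014-12-12 becomes day 3. If chronological numbering is what is wanted (an assumption; the question does not say), a minimal sketch is to pass sort=True to pd.factorize. The names day_no, long_df and counts are illustrative, and groupby(...).size().unstack(...) is swapped in here for pivot_table(aggfunc='size') as an equivalent way to count.

```python
import pandas as pd

# Sample rows from the question; `time` kept as ISO strings (assumption).
df = pd.DataFrame({
    "enroll_id": [1, 1, 1, 2, 1, 2],
    "time": ["2014-12-11", "2014-12-13", "2014-12-12",
             "2014-12-11", "2014-12-14", "2014-12-13"],
    "event": ["view", "discuss", "view", "access", "discuss", "view"],
    "source": ["server", "server", "browser", "browser", "server", "browser"],
})

# sort=True numbers the distinct dates in ascending order instead of
# order of first appearance (ISO date strings sort chronologically).
day_no = pd.factorize(df["time"], sort=True)[0] + 1

# Long format: one row per (enroll_id, day, variable, value).
long_df = (df.rename(columns={"event": "d", "source": "day"})
             .assign(time=day_no)
             .melt(["enroll_id", "time"]))

# Count occurrences and spread day/variable/value across the columns.
counts = (long_df.groupby(["enroll_id", "time", "variable", "value"])
                 .size()
                 .unstack(["time", "variable", "value"])
                 .sort_index(axis=1))

# Flatten the column levels into names such as 'view_d1' or 'server_day1'.
counts.columns = [f"{value}_{variable}{t}" for t, variable, value in counts.columns]
counts = counts.reset_index()

print(counts)
```

Combinations that never occur stay NaN, as in the output above; calling counts.fillna(0) would turn them into zero counts if that reads better.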