Python: computing a column based on other columns
This is a dataset, and I want to find which user viewed which item for less than 10 seconds (deep_view - view).

Can you provide an example of the final output?
@Mishi I have added the example… thank you…
@Lgiro I think this is different from that one… this involves calculations between rows…
Why does the user ID format change between the first set of data and the point where you say "this is partial output"?
Sorry, the first set of data is only part of the whole dataset… @pshep123
    user_id                               item_id  action_type  action_time
0   0365F7AE-5048-42B3-BB2C-8E637A380A3E  557082   view         1487423564
1   0365F7AE-5048-42B3-BB2C-8E637A380A3E  557166   view         1487424075
2   0365F7AE-5048-42B3-BB2C-8E637A380A3E  555824   view         1487424241
5   0365F7AE-5048-42B3-BB2C-8E637A380A3E  555824   deep_view    1487424345
3   0365F7AE-5048-42B3-BB2C-8E637A380A3E  554390   view         1487424395
4   0365F7AE-5048-42B3-BB2C-8E637A380A3E  557166   deep_view    1487424175
6   0365F7AE-5048-42B3-BB2C-8E637A380A3E  557082   deep_view    1487423680
7   0365F7AE-5048-42B3-BB2C-8E637A380A3E  554390   deep_view    1487424422
8   06068254-792D-4AFE-AC6C-DE43DB15D735  556134   view         1487417354
9   06068254-792D-4AFE-AC6C-DE43DB15D735  556134   deep_view    1487417411
10  06068254-792D-4AFE-AC6C-DE43DB15D735  550176   view         1487400366
11  077F63F3-3DF4-4041-B3C9-7BAB2BDCA795  519444   view         1487415176
12  077F63F3-3DF4-4041-B3C9-7BAB2BDCA795  555729   deep_view    1487412841
13  077F63F3-3DF4-4041-B3C9-7BAB2BDCA795  555171   deep_view    1487414707
14  077F63F3-3DF4-4041-B3C9-7BAB2BDCA795  555744   view         1487412883
15  077F63F3-3DF4-4041-B3C9-7BAB2BDCA795  555757   view         1487412616
16  077F63F3-3DF4-4041-B3C9-7BAB2BDCA795  555337   view         1487414331
17  077F63F3-3DF4-4041-B3C9-7BAB2BDCA795  555784   view         1487413081
18  077F63F3-3DF4-4041-B3C9-7BAB2BDCA795  555653   view         1487412036
19  077F63F3-3DF4-4041-B3C9-7BAB2BDCA795  555537   view         1487413842
def short_time(data):
    # Sort so that each user's actions appear in chronological order.
    data = data.sort_values(by=['user_id', 'action_time'])
    ids = []  # index labels of the qualifying 'view' rows (avoids shadowing the built-in id)
    for i in range(data.shape[0] - 1):
        cur, nxt = data.index[i], data.index[i + 1]
        # A 'view' immediately followed by a 'deep_view' of the same item by the same user
        if data['action_type'][cur] == 'view' and data['action_type'][nxt] == 'deep_view' and \
                data['user_id'][cur] == data['user_id'][nxt] \
                and data['item_id'][cur] == data['item_id'][nxt]:
            # Keep the view only if the deep_view came less than 10 seconds later.
            if data['action_time'][nxt] - data['action_time'][cur] < 10:
                ids.append(cur)
    return data.loc[ids, :]
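The row-by-row loop in short_time can probably be expressed without an explicit Python loop. The following is only a sketch of that idea, not a tested replacement: the name short_time_vectorized, the max_gap parameter, and the sample values are all made up for illustration. It compares each row with the next one via shift(-1) and keeps the same 'view' rows the loop would keep.

```python
import pandas as pd

def short_time_vectorized(data, max_gap=10):
    """Hypothetical loop-free variant of short_time (illustrative only)."""
    data = data.sort_values(by=["user_id", "action_time"])
    nxt = data.shift(-1)  # the following row, aligned with the current row
    mask = (
        (data["action_type"] == "view")
        & (nxt["action_type"] == "deep_view")
        & (data["user_id"] == nxt["user_id"])
        & (data["item_id"] == nxt["item_id"])
        & (nxt["action_time"] - data["action_time"] < max_gap)
    )
    return data[mask]

# Invented sample values, not taken from the dataset above.
sample = pd.DataFrame({
    "user_id": ["A", "A", "A", "A", "B", "B"],
    "item_id": [1, 1, 2, 2, 1, 1],
    "action_type": ["view", "deep_view", "view", "deep_view", "view", "deep_view"],
    "action_time": [100, 105, 200, 260, 300, 309],
})
result = short_time_vectorized(sample)
print(result)
```

Like the loop, this only catches a deep_view that directly follows its view once the rows are sorted; a view separated from its deep_view by other actions is missed.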
This is partial output of short_time on the whole dataset:

        user_id   item_id  action_type  action_time
301800  135973    558284   view         1487386449
457083  149124    544766   view         1487349898
203814  1258134   538039   view         1487382777
454537  1489322   550339   view         1487419315
131863  11703060  553010   view         1487424398
132345  11705467  546168   view         1487369955
137092  11761967  471721   view         1487425655
137236  11765536  539269   view         1487370412
137229  11765536  542229   view         1487370465
137238  11765536  462871   view         1487370491
137241  11765536  542217   view         1487370845
137276  11765536  550339   view         1487379656
137263  11765536  539302   view         1487379832
137275  11765536  541951   view         1487380143
137278  11765536  550737   view         1487381556
137208  11765536  541946   view         1487412335
138095  11776713  552341   view         1487413089
138898  11783870  542197   view         1487406728
138904  11783870  542235   view         1487406763
138903  11783870  541683   view         1487407348
138905  11783870  496537   view         1487407465
139175  11785631  550982   view         1487384606
Example of the final output:

    user_id                               item_id  action_type  action_time
2   0365F7AE-5048-42B3-BB2C-8E637A380A3E  555824   view         1487424241
5   0365F7AE-5048-42B3-BB2C-8E637A380A3E  555824   deep_view    1487424345
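If the goal is to get both the view row and the deep_view row for each matching pair, one possible approach is to pair the two action types per (user_id, item_id) with a merge and then filter on the time difference. The sketch below is an assumption about what is wanted, not a confirmed answer: quick_deep_views and max_gap are hypothetical names, and it assumes the DataFrame index is unnamed, so that reset_index() produces a column called "index". The demo rows are copied from the first table.

```python
import pandas as pd

def quick_deep_views(data, max_gap=10):
    """Return the view and deep_view rows of every (user, item) pair whose
    deep_view follows the view by less than max_gap seconds."""
    views = data[data["action_type"] == "view"].reset_index()
    deep = data[data["action_type"] == "deep_view"].reset_index()
    # Pair each view with each deep_view of the same user and item.
    pairs = views.merge(deep, on=["user_id", "item_id"], suffixes=("_view", "_deep"))
    gap = pairs["action_time_deep"] - pairs["action_time_view"]
    hits = pairs[(gap >= 0) & (gap < max_gap)]
    # Collect the original index labels of both rows in each matching pair.
    keep = pd.concat([hits["index_view"], hits["index_deep"]])
    return data[data.index.isin(keep)]

uid = "0365F7AE-5048-42B3-BB2C-8E637A380A3E"
sample = pd.DataFrame(
    {
        "user_id": [uid] * 4,
        "item_id": [555824, 555824, 554390, 554390],
        "action_type": ["view", "deep_view", "view", "deep_view"],
        "action_time": [1487424241, 1487424345, 1487424395, 1487424422],
    },
    index=[2, 5, 3, 7],
)
strict = quick_deep_views(sample)              # gaps are 104 s and 27 s, so nothing is under 10 s
loose = quick_deep_views(sample, max_gap=30)   # keeps only the 27 s pair
print(loose)
```

Unlike the adjacency check in short_time, the merge pairs a view with its deep_view even when other actions happened in between. If a user viewed the same item several times, every view is paired with every deep_view, which may or may not be the desired behavior.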