Python pandas: merging duplicate rows in a DataFrame


I have a DataFrame. Here is part of it:

   member_id           event_time                       event_path  event_duration
0    2333678  2016-12-27 04:17:16  youtube.com/watch?v=w5ZIb05NO58              12
1    2333678  2016-12-27 04:17:26  youtube.com/watch?v=w5ZIb05NO58              12
2    2333678  2016-12-27 04:17:36  youtube.com/watch?v=w5ZIb05NO58              10
3    2333678  2016-12-27 04:17:40  youtube.com/watch?v=w5ZIb05NO58              35
4    5611206  2016-12-30 17:16:01  youtube.com/watch?v=qZrQWA5IsKA              35
5    5611206  2016-12-30 17:16:10  youtube.com/watch?v=qZrQWA5IsKA              12
6    5611206  2016-12-30 17:16:27  youtube.com/watch?v=6YM5UhnElcE              10
7    5611206  2016-12-30 17:16:37  youtube.com/watch?v=6YM5UhnElcE              10
8    5611206  2016-12-30 17:16:47  youtube.com/watch?v=6YM5UhnElcE              10

Desired output:

   member_id           event_time                       event_path  event_duration
0    2333678  2016-12-27 04:17:16  youtube.com/watch?v=w5ZIb05NO58              69
4    5611206  2016-12-30 17:16:01  youtube.com/watch?v=qZrQWA5IsKA              47
6    5611206  2016-12-30 17:16:27  youtube.com/watch?v=6YM5UhnElcE              30

I use


but the result does not contain all of the rows.

If you want to keep the first `event_time` for each group, you can use the following (you can also use it for `event_path`). Here `min` picks the earliest `event_time` in each group, and `sum` adds up `event_duration`:

    df.groupby(['member_id','event_path']).agg({'event_time':'min','event_duration':'sum'}).reset_index()

Output:

  member_id                       event_path           event_time  \
0   2333678  youtube.com/watch?v=w5ZIb05NO58  2016-12-27 04:17:16
1   5611206  youtube.com/watch?v=6YM5UhnElcE  2016-12-30 17:16:27
2   5611206  youtube.com/watch?v=qZrQWA5IsKA  2016-12-30 17:16:01

   event_duration
0              69
1              30
2              47

The rows in each group have different `event_time` values. Do you want to keep the first one in each group?

>>> df.groupby([df.member_id, df.event_path]).agg({'event_duration':'sum', 'event_time': 'first'}).reset_index().reindex(columns=df.columns)

    member_id event_time                       event_path  event_duration
0  2016-12-27   04:17:16  youtube.com/watch?v=w5ZIb05NO58              69
1  2016-12-30   17:16:27  youtube.com/watch?v=6YM5UhnElcE              30
2  2016-12-30   17:16:01  youtube.com/watch?v=qZrQWA5IsKA              47
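
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same groupby-and-aggregate idea. It rebuilds the sample rows from the question by hand and uses named aggregation, which assumes pandas 0.25 or newer. The `sort=False` argument is my own addition, not part of the answers above: it keeps groups in order of first appearance, so the row order matches the desired output. The result gets a fresh 0..2 index rather than the original labels 0, 4, 6.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "member_id": [2333678] * 4 + [5611206] * 5,
    "event_time": pd.to_datetime([
        "2016-12-27 04:17:16", "2016-12-27 04:17:26",
        "2016-12-27 04:17:36", "2016-12-27 04:17:40",
        "2016-12-30 17:16:01", "2016-12-30 17:16:10",
        "2016-12-30 17:16:27", "2016-12-30 17:16:37",
        "2016-12-30 17:16:47",
    ]),
    "event_path": (
        ["youtube.com/watch?v=w5ZIb05NO58"] * 4
        + ["youtube.com/watch?v=qZrQWA5IsKA"] * 2
        + ["youtube.com/watch?v=6YM5UhnElcE"] * 3
    ),
    "event_duration": [12, 12, 10, 35, 35, 12, 10, 10, 10],
})

# One output row per (member_id, event_path) pair:
# keep the first event_time and sum event_duration.
# sort=False keeps groups in order of first appearance.
result = (
    df.groupby(["member_id", "event_path"], sort=False, as_index=False)
      .agg(
          event_time=("event_time", "first"),
          event_duration=("event_duration", "sum"),
      )
)

# Restore the original column order.
result = result[list(df.columns)]
print(result)
```

Note that this merges every row with the same `member_id` and `event_path`, not only consecutive ones. If the same video can show up again later for the same member and should count as a separate visit, a different grouping key would be needed.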