Python: How to filter a list by semi-unique values


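I have a dataset that I need to filter for "uniqueness". Basically, I want to remove every row in which the same user buys the same product more than once on the same day, regardless of the device variable. When there are multiple occurrences, I want to keep only the first row.

Data: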

datetime, device, product, user

  [
  ['2013-07-08 15:00:00', 'pc',       'X',        'A'],
  ['2013-07-09 17:00:00', 'pc',       'X',        'A'],
  ['2013-07-09 10:00:00', 'andr',     'Y',        'B'],
  ['2013-07-10 18:00:00', 'pc',       'Y',        'B'],
  ['2013-07-10 21:00:00', 'ipho',     'Y',        'B'],       <- second occurrence of B getting Y that day
  ['2013-07-10 22:00:00', 'andr',     'Y',        'B'],       <- third occurrence of B getting Y that day
  ['2013-07-10 02:00:00', 'ipho',     'Z',        'C'],
  ['2013-07-10 11:00:00', 'pc',       'Z',        'C']        <- second occurrence of C getting Z that day
  ]

How can I do this?

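Strip the time part from the datetime, then store each row in a dictionary if its key is not already there. As the dictionary key, use the tuple (date, product, user).

For example: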



  ['2013-07-08 15:00:00', 'pc',       'X',        'A'],
  ['2013-07-09 17:00:00', 'pc',       'X',        'A'],
  ['2013-07-09 10:00:00', 'andr',     'Y',        'B'],
  ['2013-07-10 18:00:00', 'pc',       'Y',        'B'],
  ['2013-07-10 02:00:00', 'ipho',     'Z',        'C'],
  ['2013-07-10 11:00:00', 'pc',       'Z',        'C']
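 # `table` is the list of rows shown in the question.
 d = {}
 for datetime, device, product, user in table:
     date = datetime[:10]  # keep only the 'YYYY-MM-DD' part
     if (date, product, user) not in d:
         # first row seen for this (date, product, user); later ones are skipped
         d[(date, product, user)] = [datetime, device, product, user]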
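
To try the approach end to end, here is a minimal self-contained sketch built on the same dictionary idea. The function name `first_per_day` and the variable names are made up for illustration, and the sketch assumes Python 3.7 or later, where a `dict` keeps its insertion order, so the filtered rows come back in their original order.

```python
# Rows from the question: [datetime, device, product, user]
table = [
    ['2013-07-08 15:00:00', 'pc',   'X', 'A'],
    ['2013-07-09 17:00:00', 'pc',   'X', 'A'],
    ['2013-07-09 10:00:00', 'andr', 'Y', 'B'],
    ['2013-07-10 18:00:00', 'pc',   'Y', 'B'],
    ['2013-07-10 21:00:00', 'ipho', 'Y', 'B'],
    ['2013-07-10 22:00:00', 'andr', 'Y', 'B'],
    ['2013-07-10 02:00:00', 'ipho', 'Z', 'C'],
    ['2013-07-10 11:00:00', 'pc',   'Z', 'C'],
]


def first_per_day(rows):
    """Keep the first row for each (date, product, user) combination.

    Hypothetical helper name; assumes timestamps are strings that start
    with a 10-character 'YYYY-MM-DD' date.
    """
    first_seen = {}
    for timestamp, device, product, user in rows:
        key = (timestamp[:10], product, user)  # device is deliberately ignored
        if key not in first_seen:
            first_seen[key] = [timestamp, device, product, user]
    # Python 3.7+ dicts preserve insertion order, so this keeps input order.
    return list(first_seen.values())


filtered = first_per_day(table)
for row in filtered:
    print(row)
```

With the data above this keeps five rows: the later purchases of Y by B and of Z by C on 2013-07-10 are dropped. Note that "first" here means first in list order; if the rows might not be in chronological order, sorting them first with `sorted(table)` should be enough, since timestamps in this format sort chronologically as plain strings.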