How to automatically find the start and end indices of elements in a list in Python


I want to find the start and end indices of every userId in a list. I don't want to specify each userId by hand, because the dataset is large.

[1, 1, 1, 1, 1, 1, 1, 1, 1, 1.......213,213,213,213]

The output I want is:

[{1: 0, 20},{2: 21, 40}.....{213: 29,703, 30,000}]

Is there a package or function in Python that does this automatically?

You can do the following:

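from collections import Counter

a = ...  # your list of userIds, sorted in ascending order

a_counter = Counter(a)
a_indices = []

running_count = 0  # index at which the next value's run starts

for x, x_count in sorted(a_counter.items()):
    # x occupies indices running_count .. running_count + x_count - 1 (inclusive)
    a_indices.append({x: (running_count, running_count + x_count - 1)})
    running_count += x_count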

For example, if a = [1, 1, 2, 2, 3, 3], then a_indices = [{1: (0, 1)}, {2: (2, 3)}, {3: (4, 5)}] (the closest thing to your output format that is still valid Python).

If you are willing to change the output format slightly, use:

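a_indices = {}  # maps each value to its (start, end) index pair

running_count = 0

# a_counter is the Counter built from a in the first snippet
for x, x_count in sorted(a_counter.items()):
    a_indices[x] = (running_count, running_count + x_count - 1)
    running_count += x_count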

Now a_indices, for the a above, will be {1: (0, 1), 2: (2, 3), 3: (4, 5)}, which is a nicer structure to work with.
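As a self-contained sketch of the dictionary version above, the loop can be wrapped in a function. The name index_ranges is hypothetical and not part of the original answer. The code assumes the list is already sorted in ascending order, because Counter only counts occurrences and knows nothing about positions.

```python
from collections import Counter


def index_ranges(values):
    """Map each distinct value to its (first, last) index, both inclusive.

    Assumes `values` is sorted in ascending order, so that equal items
    are adjacent and appear in the same order as sorted() yields them.
    """
    ranges = {}
    running_count = 0  # index where the next value's run begins
    for x, x_count in sorted(Counter(values).items()):
        ranges[x] = (running_count, running_count + x_count - 1)
        running_count += x_count
    return ranges


a = [1, 1, 2, 2, 3, 3]
print(index_ranges(a))  # {1: (0, 1), 2: (2, 3), 3: (4, 5)}
```

If the input is not sorted, the ranges will not line up with the real positions, so sort first or use a position-based approach instead.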

Both solutions make the end index of each x inclusive. If you want it to be exclusive, replace running_count + x_count - 1 with running_count + x_count.
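One reason to prefer the exclusive end index is that it matches Python's slice convention, so each pair can be passed straight to a slice. The following is a minimal sketch under the same assumption of an ascending sorted list; the variable name half_open is only illustrative.

```python
from collections import Counter

a = [1, 1, 2, 2, 3, 3]

half_open = {}  # value -> (start, stop), with stop exclusive
running_count = 0
for x, x_count in sorted(Counter(a).items()):
    half_open[x] = (running_count, running_count + x_count)
    running_count += x_count

# The pair can be used directly as slice bounds.
start, stop = half_open[2]
print(a[start:stop])  # [2, 2]
```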

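numpy.unique with return_index=True will work if the dataset is sorted.

If you are only using Python lists, iterating over the array yourself will probably be faster. It is just an O(n) operation with very little overhead.

Your desired output is not valid Python. Are the values supposed to be tuples?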
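The suggestion to iterate over the list yourself could look something like the sketch below. The function name runs is hypothetical. It makes a single pass and records the first and last position at which each value is seen, so it does not need Counter or a sort. For a list where equal IDs are adjacent, those two positions are exactly the start and end of the run.

```python
def runs(values):
    """Return {value: (first_index, last_index)}, both inclusive.

    Single O(n) pass. If equal values are not adjacent, the pair is
    simply the first and last occurrence rather than a contiguous run.
    """
    result = {}
    for i, v in enumerate(values):
        if v in result:
            # Seen before: keep the original start, extend the end.
            result[v] = (result[v][0], i)
        else:
            # First occurrence: the run starts and ends here for now.
            result[v] = (i, i)
    return result


print(runs([1, 1, 1, 2, 2, 213]))  # {1: (0, 2), 2: (3, 4), 213: (5, 5)}
```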