Python: grouping a list of integers into consecutive runs
I have a list of integers:
[1,2,3,4,5,8,9,10,11,200,201,202]
I want to group them into a list of lists, where each sublist covers a run of integers whose sequence is not broken, like this:
[[1,5],[8,11],[200,202]]
I have a rather clunky workaround:
lSequenceOfNum = [1,2,3,4,5,8,9,10,11,200,201,202]
lGrouped = []
start = 0
for x in range(0, len(lSequenceOfNum)):
    if x != len(lSequenceOfNum) - 1:
        # A gap larger than 1 closes the current run.
        if (lSequenceOfNum[x+1] - lSequenceOfNum[x]) > 1:
            lGrouped.append([lSequenceOfNum[start], lSequenceOfNum[x]])
            start = x + 1
    else:
        # The last element always closes the final run.
        lGrouped.append([lSequenceOfNum[start], lSequenceOfNum[x]])
print(lGrouped)
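This is the best I could do. Is there a more "Pythonic" way? Thanks.
Pseudocode (with off-by-one errors that need fixing):
I think that in this case it makes sense to use indices (as in C, before Java/Python/Perl/etc. improved on this) rather than the objects in the array. Here is an easy-to-read version: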
def close_range(el, it):
    # Advance until the next element no longer continues the run.
    while True:
        el1 = next(it, None)
        if el1 != el + 1:
            return el, el1
        el = el1

def compress_ranges(seq):
    iterator = iter(seq)
    left = next(iterator, None)
    while left is not None:
        right, left1 = close_range(left, iterator)
        yield (left, right)
        left = left1

list(compress_ranges([1, 2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202]))
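For comparison with the iterator version, the index-based idea mentioned in this answer could be sketched roughly as follows. The function name index_ranges is hypothetical, and the sketch assumes the input is a sorted list of integers without duplicates.

```python
def index_ranges(seq):
    """Return [first, last] pairs for each run of consecutive integers.

    Hypothetical helper; assumes seq is a sorted list of distinct integers.
    """
    if not seq:
        return []
    groups = []
    start = 0  # index where the current run begins
    for i in range(1, len(seq) + 1):
        # Close the run at the end of the list, or where the step is not exactly 1.
        if i == len(seq) or seq[i] != seq[i - 1] + 1:
            groups.append([seq[start], seq[i - 1]])
            start = i
    return groups


print(index_ranges([1, 2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202]))
```

Letting the loop run one index past the end means the last run is closed by the same branch as every other run, which is one way to avoid the off-by-one special case.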
Assuming the list is always in ascending order:
from itertools import groupby, count

numberlist = [1,2,3,4,5,8,9,10,11,200,201,202]

def as_range(g):
    l = list(g)
    return l[0], l[-1]

# n - next(c) stays constant within a run of consecutive numbers.
print([as_range(g) for _, g in groupby(numberlist, key=lambda n, c=count(): n-next(c))])
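To see why the groupby key works, it may help to look at the key values directly: subtracting each element's position from its value gives the same number for every member of a consecutive run. The sketch below uses enumerate instead of a shared counter; the names keyed and consecutive_ranges are illustrative only.

```python
from itertools import groupby

numbers = [1, 2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202]

# Pair each value with (value - index); the difference is constant inside a run.
keyed = [(n - i, n) for i, n in enumerate(numbers)]
# The first five pairs have key 1, the next four key 3, the last three key 191.


def consecutive_ranges(seq):
    """Yield (first, last) for each run, grouping on value minus index."""
    for _, group in groupby(enumerate(seq), key=lambda pair: pair[1] - pair[0]):
        run = [value for _, value in group]
        yield run[0], run[-1]


print(keyed)
print(list(consecutive_ranges(numbers)))
```

Using enumerate avoids the mutable default argument trick, at the cost of unpacking the index again inside the loop.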
I realized that counting manually is much simpler than using a slightly convoluted generator:
def ranges(seq):
    start, end = seq[0], seq[0]
    count = start
    for item in seq:
        # count holds the value expected next; a mismatch starts a new range.
        if not count == item:
            yield start, end
            start, end = item, item
            count = item
        end = item
        count += 1
    yield start, end

print(list(ranges([1,2,3,4,5,8,9,10,11,200,201,202])))
This produces:
[(1, 5), (8, 11), (200, 202)]
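This method is very fast, and it performs almost exactly the same as the older generator-based solution.
That is about 20 times faster than before, although unless speed matters this is not a real concern.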
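The timing figures themselves are not shown here, so the following is only a sketch of how one might measure such a difference with timeit. The names ranges_manual and ranges_groupby are hypothetical labels; ranges_manual follows the same logic as the manual-counting function above, and the groupby version is picked as one possible baseline, not necessarily the one the author timed. Results depend on the machine and Python version.

```python
import timeit
from itertools import count, groupby

DATA = [1, 2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202]


def ranges_manual(seq):
    """Manual counting: track the value expected next."""
    start = end = expected = seq[0]
    for item in seq:
        if item != expected:
            yield start, end
            start = expected = item
        end = item
        expected += 1
    yield start, end


def ranges_groupby(seq):
    """Baseline chosen for illustration: group on value minus position."""
    counter = count()
    for _, group in groupby(seq, key=lambda n: n - next(counter)):
        items = list(group)
        yield items[0], items[-1]


manual_result = list(ranges_manual(DATA))
groupby_result = list(ranges_groupby(DATA))

for fn in (ranges_manual, ranges_groupby):
    # list(...) forces the generator to run to completion on every call.
    seconds = timeit.timeit(lambda: list(fn(DATA)), number=10_000)
    print(f"{fn.__name__}: {seconds:.4f} s for 10,000 calls")
```

Wrapping the call in list() matters: timing only the generator's creation would measure almost nothing.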
The old solution using a generator:
import itertools

def resetable_counter(start):
    while True:
        for i in itertools.count(start):
            reset = yield i
            # Compare against None so that a reset to 0 is honoured too.
            if reset is not None:
                start = reset
                break

def ranges(seq):
    start, end = seq[0], seq[0]
    counter = resetable_counter(start)
    for count, item in zip(counter, seq): #In 2.x: itertools.izip(counter, seq)
        if not count == item:
            yield start, end
            start, end = item, item
            counter.send(item)  # restart the counter at the new range
        end = item
    yield start, end

print(list(ranges([1,2,3,4,5,8,9,10,11,200,201,202])))
This produces:
[(1, 5), (8, 11), (200, 202)]
You can do this efficiently in three steps.
Given list1 = [1,2,3,4,5,8,9,10,11,200,201,202]
Compute the discontinuities
list1[1:]      2   3   4   5   8   9  10  11 200 201 202
- list1[:-1]   1   2   3   4   5   8   9  10  11 200 201
--------------------------------------------------------
difference     1   1   1   1   3   1   1   1 189   1   1
(index)        1   2   3   4   5   6   7   8   9  10  11
                               *               *
rng = [i+1 for i,e in enumerate((x-y for x,y in zip(list1[1:],list1))) if e!=1]
>>> rng
[5, 9]
Add the boundaries
rng = [0] + rng + [len(list1)]
>>> rng
[0, 5, 9, 12]
Now compute the actual consecutive ranges
[(list1[i],list1[j-1]) for i,j in zip(rng,rng[1:])]
[(1, 5), (8, 11), (200, 202)]
LB (rng[:-1])        0      5      9
UB (rng[1:])         5      9      12
---------------------------------------
indexes (LB,UB-1)    (0,4)  (5,8)  (9,11)
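The three steps above can be combined into a single function. This is a minimal sketch under the same assumption of a sorted list; the function name consecutive_ranges and the variable names breaks and bounds are illustrative and not part of the original answer.

```python
def consecutive_ranges(values):
    """Return (first, last) tuples for each run of consecutive integers.

    Sketch only; assumes values is a sorted list of distinct integers.
    """
    if not values:
        return []
    # Step 1: indices at which a new run starts (difference to predecessor != 1).
    breaks = [i + 1
              for i, (prev, cur) in enumerate(zip(values, values[1:]))
              if cur - prev != 1]
    # Step 2: add the outer boundaries.
    bounds = [0] + breaks + [len(values)]
    # Step 3: each adjacent pair (lo, hi) delimits the slice values[lo:hi].
    return [(values[lo], values[hi - 1]) for lo, hi in zip(bounds, bounds[1:])]


print(consecutive_ranges([1, 2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202]))
```

The explicit empty-list check is needed because step 3 would otherwise index into an empty list.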
This question is quite old, but I thought I would share my solution anyway.
Assuming import numpy as np:
a = [1,2,3,4,5,8,9,10,11,200,201,202]
np.split(a, np.where(np.diff(a) > 1)[0] + 1)
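Think in terms of where the jumps are rather than where the ranges are. You can store the result in a simple array of integers, where each entry is the index of a jump in the original array. I think that is simpler... Most likely this will be reusable or library code, so you can wrap all of this in a class.
I'm pretty sure this is a duplicate, although I can't look it up right now.
@Abhijit Pretty sure; I tested it. Did you find a case where it fails?
Not sure, but the output is not what I expected. Can you take a look at this?
Just checked, @Abhijit. It seems to be a Python 2.x versus 3.x issue; it works fine under 3.x... I'll try to work out why.
Of course: zip() is not lazy in 2.x, so you need itertools.izip() there.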
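The suggestion in the comments to store jump positions and wrap the logic in a class might look something like the sketch below. The class name JumpIndex and its methods are hypothetical, made up for illustration; the input is assumed to be sorted.

```python
class JumpIndex:
    """Hypothetical helper that stores only the indices where a sorted
    sequence of integers stops being consecutive."""

    def __init__(self, values):
        self.values = list(values)
        # jumps[k] is the index of the first element of run k + 1.
        self.jumps = [i for i in range(1, len(self.values))
                      if self.values[i] - self.values[i - 1] != 1]

    def groups(self):
        """Return each consecutive run as its own list."""
        bounds = [0] + self.jumps + [len(self.values)]
        return [self.values[lo:hi]
                for lo, hi in zip(bounds, bounds[1:]) if lo < hi]

    def ranges(self):
        """Return (first, last) for each run."""
        return [(g[0], g[-1]) for g in self.groups()]


index = JumpIndex([1, 2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202])
print(index.jumps)
print(index.ranges())
```

Keeping only the jump indices means the full runs or the (first, last) pairs can both be derived later without rescanning the data.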
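numbers = [1, 2, 3, 4, 8, 10, 11, 12, 17]  # renamed from input to avoid shadowing the builtin
i, ii, result = iter(numbers), iter(numbers[1:]), [[numbers[0]]]
for x, y in zip(i, ii):
    if y - x != 1:
        result.append([y])      # gap: start a new group
    else:
        result[-1].append(y)    # consecutive: extend the current group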
>>> result
[[1, 2, 3, 4], [8], [10, 11, 12], [17]]
>>> print(", ".join("-".join(map(str,(g[0],g[-1])[:len(g)])) for g in result))
1-4, 8, 10-12, 17
>>> [(g[0],g[-1])[:len(g)] for g in result]
[(1, 4), (8,), (10, 12), (17,)]
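The grouping and formatting steps above could also be packaged as two small functions. This is a sketch only; the names group_runs and format_runs, and the sep parameter, are assumptions introduced for illustration.

```python
def group_runs(numbers):
    """Split a sorted list of integers into lists of consecutive values."""
    runs = []
    for n in numbers:
        if runs and n - runs[-1][-1] == 1:
            runs[-1].append(n)   # continues the current run
        else:
            runs.append([n])     # first element, or a gap: start a new run
    return runs


def format_runs(numbers, sep=", "):
    """Render runs as 'first-last', or as a single number for runs of one."""
    parts = []
    for run in group_runs(numbers):
        parts.append(str(run[0]) if len(run) == 1 else f"{run[0]}-{run[-1]}")
    return sep.join(parts)


print(format_runs([1, 2, 3, 4, 8, 10, 11, 12, 17]))
```

Comparing against the last element of the last run removes the need for two parallel iterators and also handles an empty input.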