The cleanest, most efficient way to generate and zip two lists in Python
Given these two lists:
zeros = [2,3,1,2]
ones = [3,4,5]
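(The condition len(zeros) == len(ones) + 1 always holds.)
I want to create a list that alternates runs of 0s and 1s, with the run lengths taken from the two lists. I can achieve this as follows: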
zeros_list = [[0]*n for n in zeros]
ones_list = [[1]*n for n in ones]
output = [z for x in zip(zeros_list, ones_list) for y in x for z in y]
output += [0]*zeros[-1]
print(output)
> [0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0]
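However, is this the most efficient and cleanest way? I measured 2.66 µs ± 78.8 ns, but I still think this could be done in a one-liner, and possibly more efficiently.
You can use itertools.chain, zip_longest, and itertools.repeat to create a one-liner that is not too confusing:
>>> from itertools import chain, repeat, zip_longest
>>> list(chain.from_iterable(chain.from_iterable(zip_longest((repeat(0, x) for x in zeros), (repeat(1, x) for x in ones), fillvalue=[]))))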
[0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0]
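On my machine, this takes 3.34 µs. More importantly, it is the list wrapper that takes this time. The iterator itself produces elements on demand, which helps if you don't actually need all the elements at once.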
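To illustrate the point about laziness, here is a minimal sketch that wraps the same expression in a function and consumes only part of it. The name `lazy_runs` is hypothetical and not part of the original answer.

```python
from itertools import chain, islice, repeat, zip_longest

zeros = [2, 3, 1, 2]
ones = [3, 4, 5]

def lazy_runs(zeros, ones):
    """Hypothetical helper: return an iterator over the alternating runs."""
    zero_groups = (repeat(0, n) for n in zeros)
    one_groups = (repeat(1, n) for n in ones)
    pairs = zip_longest(zero_groups, one_groups, fillvalue=[])
    # Nothing is materialized here; elements are produced as they are requested.
    return chain.from_iterable(chain.from_iterable(pairs))

# Take only the first seven elements; the later runs are never generated.
first_seven = list(islice(lazy_runs(zeros, ones), 7))
print(first_seven)  # [0, 0, 1, 1, 1, 0, 0]
```

If the consumer stops early, as here, the cost of building the full list is never paid.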
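- (repeat(0, x) for x in zeros) creates a sequence of repeat objects representing the runs of 0s; likewise, a second generator creates the groups of 1s.
- zip_longest zips them into a sequence of pairs, adding an empty list that does nothing, to balance the extra value in zeros.
- The inner chain.from_iterable flattens that sequence of pairs, from (a, b), (c, d) to (a, b, c, d).
- The outer chain.from_iterable then flattens the repeat objects into a single sequence, which list turns into a list.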
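The steps above can be sketched one stage at a time. This is only an illustration: the intermediate names (`zero_groups`, `pairs`, `groups`) are chosen here, and the stages are stored in lists so they can be inspected, which the one-liner does not do.

```python
from itertools import chain, repeat, zip_longest

zeros = [2, 3, 1, 2]
ones = [3, 4, 5]

# Stage 1: one repeat object per run (lists here so the stages can be inspected).
zero_groups = [repeat(0, n) for n in zeros]
one_groups = [repeat(1, n) for n in ones]

# Stage 2: pair the groups up; the missing fourth group of ones becomes [].
pairs = list(zip_longest(zero_groups, one_groups, fillvalue=[]))
print(len(pairs))    # 4
print(pairs[-1][1])  # []

# Stage 3: flatten the pairs into a single sequence of groups.
groups = list(chain.from_iterable(pairs))
print(len(groups))   # 8

# Stage 4: flatten the groups into individual elements.
output = list(chain.from_iterable(groups))
print(output)
```

The empty list from `fillvalue` contributes no elements in stage 4, which is why it is harmless padding.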
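You can also simplify the one-liner with the roundrobin recipe from the itertools documentation, which handles both merging the zero groups with the one groups and the first round of flattening.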
from itertools import cycle, islice, repeat, chain

zeros = [2,3,1,2]
ones = [3,4,5]

# From the itertools documentation
def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # Remove the iterator we just exhausted from the cycle.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))

zero_groups = (repeat(0, x) for x in zeros)
one_groups = (repeat(1, x) for x in ones)
print(list(chain.from_iterable(roundrobin(zero_groups, one_groups))))
zip with a list comprehension should do the trick:
zeros = [2,3,1,2]
ones = [3,4,5]
output = [B for N,P in zip(zeros,ones+[0]) for B in [0]*N+[1]*P]
print(output)
[0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0]
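Note that ones+[0] is there to make sure the zip operation does not drop the last value of the zeros list.
Two noticeably faster solutions, using the "trick" of special-casing the first zero rather than the last one and using it to initialize the output: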
def superb_rain(zeros, ones):
    zeros = iter(zeros)
    output = [0] * next(zeros)
    for o in ones:
        output += (1,) * o
        output += (0,) * next(zeros)
    return output
(As @schwobaseggl pointed out, using tuples makes it about 30% faster.)
Benchmark results:
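As a side note on why `output += (1,) * o` is valid even though `output` is a list: in-place `+=` on a list extends it with any iterable, whereas plain `+` requires another list. The sketch below only demonstrates that behavior; it does not measure or explain the reported speedup.

```python
output = [0] * 2

# In-place += on a list extends it with any iterable, so a tuple works.
output += (1,) * 3
print(output)  # [0, 0, 1, 1, 1]

# Plain + requires both operands to be lists, so this raises TypeError
# and leaves output unchanged.
try:
    output = output + (0,) * 2
except TypeError as exc:
    print("TypeError:", exc)
```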
0.14 us 0.13 us 0.13 us baseline
3.04 us 3.02 us 2.98 us original
3.27 us 3.19 us 3.29 us chepner_1
5.03 us 5.12 us 5.25 us chepner_2
4.66 us 4.74 us 4.68 us chepner_2__superb_rain
2.52 us 2.53 us 2.47 us Alain_T
3.35 us 3.27 us 3.42 us python_user
1.02 us 0.99 us 1.04 us superb_rain
1.07 us 1.11 us 1.09 us superb_rain2
Benchmark code:
import timeit
from itertools import zip_longest, cycle, islice, repeat, chain

def baseline(zeros, ones):
    pass

def original(zeros, ones):
    zeros_list = [[0]*n for n in zeros]
    ones_list = [[1]*n for n in ones]
    output = [z for x in zip(zeros_list, ones_list) for y in x for z in y]
    output += [0]*zeros[-1]
    return output

def chepner_1(zeros, ones):
    return list(chain.from_iterable(chain.from_iterable(zip_longest((repeat(0, x) for x in zeros), (repeat(1, x) for x in ones), fillvalue=[]))))

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # Remove the iterator we just exhausted from the cycle.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))

def chepner_2(zeros, ones):
    zero_groups = (repeat(0, x) for x in zeros)
    one_groups = (repeat(1, x) for x in ones)
    return list(chain.from_iterable(roundrobin(zero_groups, one_groups)))

def chepner_2__superb_rain(zeros, ones):
    return list(chain.from_iterable(map(repeat, cycle([0, 1]), roundrobin(zeros, ones))))

def Alain_T(zeros, ones):
    return [B for N,P in zip(zeros+[0],ones+[0]) for B in ([0]*N+[1]*P)]

def python_user(zeros, ones):
    res = [None] * (len(ones) + len(zeros))
    res[::2] = ([0]*n for n in zeros)
    res[1::2] = ([1]*n for n in ones)
    res = [y for x in res for y in x]
    return res

def superb_rain(zeros, ones):
    zeros = iter(zeros)
    output = [0] * next(zeros)
    for o in ones:
        output += (1,) * o
        output += (0,) * next(zeros)
    return output

def superb_rain2(zeros, ones):
    z = iter(zeros).__next__
    output = [0] * z()
    for o in ones:
        output += (1,) * o
        output += (0,) * z()
    return output

funcs = [
    baseline,
    original,
    chepner_1,
    chepner_2,
    chepner_2__superb_rain,
    Alain_T,
    python_user,
    superb_rain,
    superb_rain2,
]

zeros = [2,3,1,2]
ones = [3,4,5]
number = 10**5

# Correctness check against the original solution (baseline returns None).
expect = original(zeros, ones)
for func in funcs:
    print(func(zeros, ones) == expect, func.__name__)
print()

# Four rounds; each row prints the timings of all rounds except the first.
tss = [[] for _ in funcs]
for _ in range(4):
    for func, ts in zip(funcs, tss):
        t = min(timeit.repeat(lambda: func(zeros, ones), number=number)) / number
        ts.append(t)
        print(*('%.2f us ' % (1e6 * t) for t in ts[1:]), func.__name__)
    print()
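The benchmark above checks each function against a single input. A minimal sketch of a broader check, assuming randomly generated run lengths (including zero-length runs) are acceptable inputs, might look like this. `random_case` is a hypothetical helper introduced here; `original` and `superb_rain` are copied from the benchmark so the block runs on its own.

```python
import random

def original(zeros, ones):
    zeros_list = [[0] * n for n in zeros]
    ones_list = [[1] * n for n in ones]
    output = [z for x in zip(zeros_list, ones_list) for y in x for z in y]
    output += [0] * zeros[-1]
    return output

def superb_rain(zeros, ones):
    zeros = iter(zeros)
    output = [0] * next(zeros)
    for o in ones:
        output += (1,) * o
        output += (0,) * next(zeros)
    return output

def random_case(rng, max_runs=20, max_len=10):
    """Hypothetical helper: build inputs with len(zeros) == len(ones) + 1."""
    k = rng.randint(0, max_runs)
    ones = [rng.randint(0, max_len) for _ in range(k)]
    zeros = [rng.randint(0, max_len) for _ in range(k + 1)]
    return zeros, ones

rng = random.Random(0)  # fixed seed so the run is reproducible
for _ in range(1000):
    zeros, ones = random_case(rng)
    assert superb_rain(zeros, ones) == original(zeros, ones)
print("all cases agree")
```

This only compares two of the candidates; the same loop could be run over the full `funcs` list from the benchmark.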
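Is using numpy an option?
OK, replacing zip with zip_longest was simpler than I thought, but I think using roundrobin is interesting as well, if you don't count the definition of roundrobin as part of the one-liner.
With roundrobin: list(chain.from_iterable(map(repeat, cycle([0, 1]), roundrobin(zeros, ones))))
@superbrain Much nicer, though a bit slower (4.88 µs on my machine). I guess I never bothered timing my roundrobin solution, only the zip_longest one. However, I think the timing is irrelevant if you can take advantage of the laziness and only use the values as they are generated: for x in chain.from_iterable(...): # use x.
Why zeros+[0]? We are guaranteed that zeros is longer.
It also looks odd given that we start with zeros: if ones were longer, its last two streaks would effectively be a single streak. For example, [3] and [2, 4] would behave like [3] and [6].
You're right, I didn't think it through. So zeros+[0] is not useful in any case. I'll remove it...
Using tuples cuts the time by another 30%: output += (1,) * o
Nice to run it and watch the names appear. Cool, +1.
@schwobaseggl Thanks, confirmed and updated :)
Love the trick! Thumbs up!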