Python: find the longest run in a list


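Given a list of data, I am trying to create a new list in which the value at position `i` is the length of the longest run starting at position `i` in the original list. For example, given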

x_list = [1, 1, 2, 3, 3, 3]

should return:

run_list = [2, 1, 1, 3, 2, 1]

My solution:

freq_list = []
current = x_list[0]
count = 0
for num in x_list:
    if num == current:
        count += 1
    else:
        freq_list.append((current,count))
        current = num
        count = 1
freq_list.append((current,count))

run_list = []
for i in freq_list:
    z = i[1]
    while z > 0:
        run_list.append(z)
        z -= 1

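First, I create a list of tuples, `freq_list`, where the first element of each tuple is an element of `x_list` and the second element is the length of its run. In this case: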

freq_list = [(1, 2), (2, 1), (3, 3)]

With this, I create a new list and append the appropriate values.

However, I was wondering whether there is a shorter or different way to do this.

This can be done with `itertools`:

from itertools import groupby, chain

x_list = [1, 1, 2, 3, 3, 3]

gen = (range(len(list(j)), 0, -1) for _, j in groupby(x_list))
res = list(chain.from_iterable(gen))

Result

[2, 1, 1, 3, 2, 1]

Explanation

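1. First, use `itertools.groupby` to group identical consecutive items in the list.
2. For each group produced by `groupby`, create a `range` object that counts down from the number of consecutive items to 1.
3. Wrap all of this in a generator expression to avoid building a list of lists.
4. Use `itertools.chain` to chain the ranges from the generator.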
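
To make the steps above concrete, here is a small sketch that materializes the intermediate values for the example input. The names `groups` and `countdowns` are illustrative only and do not appear in the answer; the answer itself keeps everything lazy instead of building these lists.

```python
from itertools import chain, groupby

x_list = [1, 1, 2, 3, 3, 3]

# Step 1: groupby yields (key, group-iterator) pairs for consecutive equal items.
groups = [(key, list(group)) for key, group in groupby(x_list)]
print(groups)      # [(1, [1, 1]), (2, [2]), (3, [3, 3, 3])]

# Step 2: each group becomes a countdown from its length to 1.
countdowns = [list(range(len(items), 0, -1)) for _, items in groups]
print(countdowns)  # [[2, 1], [1], [3, 2, 1]]

# Step 3: chain.from_iterable flattens the countdowns into one list.
res = list(chain.from_iterable(countdowns))
print(res)         # [2, 1, 1, 3, 2, 1]
```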

Performance note

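Performance will be worse than with a plain `for` loop. Although `itertools.groupby` is O(n), it makes heavy use of expensive `__next__` calls, which do not scale as well as iteration in a simple `for` loop. For `groupby` pseudo-code, see the `itertools` documentation.

If performance is your main concern, stick with the `for` loop.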
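
If you want to measure this yourself, a rough sketch with the standard `timeit` module might look like the following. The function names `with_groupby` and `with_loop`, the test data, and the repeat count are all arbitrary choices for illustration, and the absolute timings will depend on your machine and Python version.

```python
import timeit
from itertools import chain, groupby

def with_groupby(values):
    gen = (range(len(list(g)), 0, -1) for _, g in groupby(values))
    return list(chain.from_iterable(gen))

def with_loop(values):
    result = []
    last = object()   # sentinel that is not equal to any ordinary list item
    counter = 0
    for value in reversed(values):
        # Extend the current run or start a new one.
        counter = counter + 1 if value == last else 1
        last = value
        result.append(counter)
    result.reverse()
    return result

data = [1, 1, 2, 3, 3, 3] * 1000

# Make sure both variants agree before timing them.
assert with_groupby(data) == with_loop(data)

for func in (with_groupby, with_loop):
    seconds = timeit.timeit(lambda: func(data), number=100)
    print(f"{func.__name__}: {seconds:.3f} s for 100 calls")
```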

Here is a simple solution that iterates over the list backwards and increments a counter each time a number repeats:

last_num = None
result = []
for num in reversed(x_list):
    if num != last_num:
        # if the number changed, reset the counter to 1
        counter = 1
        last_num = num
    else:
        # if the number is the same, increment the counter
        counter += 1

    result.append(counter)

# reverse the result
result = list(reversed(result))

Result:

[2, 1, 1, 3, 2, 1]

You are performing a reverse cumulative count on contiguous groups. We can use NumPy for this:

import numpy as np

def cumcount(a):
    a = np.asarray(a)
    b = np.append(False, a[:-1] != a[1:])
    c = b.cumsum()
    r = np.arange(len(a))
    return r - np.append(0, np.flatnonzero(b))[c] + 1

and then use it as follows:

a = np.array(x_list)

cumcount(a[::-1])[::-1]

array([2, 1, 1, 3, 2, 1])
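
The vectorized steps can be hard to follow at first. As a reading aid only, here is a plain-Python mirror of the same logic; `cumcount_py` is a hypothetical name, and the variables `b`, `c`, and `starts` correspond to `b`, `c`, and `np.append(0, np.flatnonzero(b))` in the NumPy version. It is not meant as a replacement for the NumPy code.

```python
from itertools import accumulate

def cumcount_py(a):
    """Plain-Python mirror of the NumPy cumcount steps (illustrative only)."""
    a = list(a)
    if not a:
        return []
    # b[i] is True where a new run starts (position 0 is False by construction).
    b = [False] + [prev != cur for prev, cur in zip(a, a[1:])]
    # c[i] is the index of the run that position i belongs to.
    c = list(accumulate(int(flag) for flag in b))
    # starts[k] is the position at which run k begins.
    starts = [0] + [i for i, flag in enumerate(b) if flag]
    # Position within the run, counted from 1.
    return [i - starts[c[i]] + 1 for i in range(len(a))]

x_list = [1, 1, 2, 3, 3, 3]
forward = cumcount_py(x_list)
run_list = cumcount_py(x_list[::-1])[::-1]
print(forward)   # [1, 2, 1, 1, 2, 3]
print(run_list)  # [2, 1, 1, 3, 2, 1]
```

Counting forward gives the position inside each run; reversing the input before counting and reversing the output afterwards turns that into the remaining run length.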

For this kind of task I would use a generator, because it avoids building the result list incrementally and can be consumed lazily if needed:

def gen(iterable):  # you have to think about a better name :-)
    iterable = iter(iterable)
    # Get the first element, in case that fails
    # we can stop right now.
    try:
        last_seen = next(iterable)
    except StopIteration:
        return
    count = 1

    # Go through the remaining items
    for item in iterable:
        if item == last_seen:
            count += 1
        else:
            # The consecutive run finished, return the
            # desired values for the run and then reset
            # counter and the new item for the next run.
            yield from range(count, 0, -1)
            count = 1
            last_seen = item
    # Return the result for the last run
    yield from range(count, 0, -1)

This also works when the input cannot be reversed (some generators and iterators cannot be).

It works for your input:

>>> x_list = [1, 1, 2, 3, 3, 3]
>>> list(gen(x_list))
[2, 1, 1, 3, 2, 1]
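
The point about non-reversible inputs can be sketched as follows. The names `run_lengths` and `numbers` are hypothetical; `run_lengths` is the same logic as the generator above, repeated only so the snippet runs on its own.

```python
def run_lengths(iterable):
    iterator = iter(iterable)
    try:
        last_seen = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield
    count = 1
    for item in iterator:
        if item == last_seen:
            count += 1
        else:
            yield from range(count, 0, -1)
            count = 1
            last_seen = item
    yield from range(count, 0, -1)

def numbers():
    # A generator function: its result has no __reversed__ and no __len__.
    yield from [1, 1, 2, 3, 3, 3]

try:
    reversed(numbers())
    reversible = True
except TypeError:
    reversible = False

lazy_result = list(run_lengths(numbers()))
empty_result = list(run_lengths([]))
print(reversible)    # False
print(lazy_result)   # [2, 1, 1, 3, 2, 1]
print(empty_result)  # []
```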

Using `itertools.groupby`, this can actually be simplified:

import itertools

def gen(iterable):
    for _, group in itertools.groupby(iterable):
        length = sum(1 for _ in group)  # or len(list(group))
        yield from range(length, 0, -1)

>>> x_list = [1, 1, 2, 3, 3, 3]
>>> list(gen(x_list))
[2, 1, 1, 3, 2, 1]
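
As a sketch of what consuming the generator lazily can look like, the snippet below feeds it an endless input and takes only the first few results. The name `run_lengths` and the use of `cycle` and `islice` are illustrative choices, not part of the answer.

```python
from itertools import cycle, groupby, islice

def run_lengths(iterable):  # hypothetical name for the groupby-based generator
    for _, group in groupby(iterable):
        length = sum(1 for _ in group)
        yield from range(length, 0, -1)

# cycle() never ends, so a solution that builds the full list would never return.
endless = cycle([1, 1, 2])
first_five = list(islice(run_lengths(endless), 5))
print(first_five)  # [2, 1, 1, 2, 1]
```

Note that a run's countdown can only be produced once the run has ended, so an input that repeats the same value forever would still block.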

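I also ran some benchmarks. According to them, Aran Fey's solution is the fastest, except for long lists, where piRSquared's solution wins.

If you want to confirm the results, here is my benchmark setup: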

from itertools import groupby, chain
import numpy as np

def gen1(iterable):
    iterable = iter(iterable)
    try:
        last_seen = next(iterable)
    except StopIteration:
        return
    count = 1
    for item in iterable:
        if item == last_seen:
            count += 1
        else:
            yield from range(count, 0, -1)
            count = 1
            last_seen = item
    yield from range(count, 0, -1)

def gen2(iterable):
    for _, group in groupby(iterable):
        length = sum(1 for _ in group)
        yield from range(length, 0, -1)

def mseifert1(iterable):
    return list(gen1(iterable))

def mseifert2(iterable):
    return list(gen2(iterable))

def aran(x_list):
    last_num = None
    result = []
    for num in reversed(x_list):
        if num != last_num:
            counter = 1
            last_num = num
        else:
            counter += 1
        result.append(counter)
    return list(reversed(result))

def jpp(x_list):
    gen = (range(len(list(j)), 0, -1) for _, j in groupby(x_list))
    res = list(chain.from_iterable(gen))
    return res

def cumcount(a):
    a = np.asarray(a)
    b = np.append(False, a[:-1] != a[1:])
    c = b.cumsum()
    r = np.arange(len(a))
    return r - np.append(0, np.flatnonzero(b))[c] + 1

def pirsquared(x_list):
    a = np.array(x_list)
    return cumcount(a[::-1])[::-1]

from simple_benchmark import benchmark
import random

funcs = [mseifert1, mseifert2, aran, jpp, pirsquared]
args = {2**i: [random.randint(0, 5) for _ in range(2**i)] for i in range(1, 20)}

bench = benchmark(funcs, args, "list size")

%matplotlib notebook
bench.plot()

Python 3.6.5, NumPy 1.14
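
Before comparing timings, it can be worth checking that a candidate function actually produces the right answer on random inputs. The following is a minimal sketch of such a check, not part of the benchmark above; `brute_force`, `groupby_runs`, and `agrees_with_reference` are hypothetical helper names, and the trial count and value range are arbitrary.

```python
import random
from itertools import groupby

def brute_force(values):
    """Reference: for each position, walk forward while the value repeats."""
    result = []
    for i, value in enumerate(values):
        j = i
        while j < len(values) and values[j] == value:
            j += 1
        result.append(j - i)
    return result

def groupby_runs(values):
    """Same idea as the groupby-based answers above."""
    out = []
    for _, group in groupby(values):
        out.extend(range(len(list(group)), 0, -1))
    return out

def agrees_with_reference(func, trials=200, seed=0):
    """Compare func against brute_force on small random lists."""
    rng = random.Random(seed)
    for _ in range(trials):
        values = [rng.randint(0, 5) for _ in range(rng.randint(0, 30))]
        if list(func(values)) != brute_force(values):
            return False
    return True

print(agrees_with_reference(groupby_runs))
```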

Here is a simple iterative approach to achieve this, which returns your `run_list` as:

[2, 1, 1, 3, 2, 1]

Alternatively, a one-liner using a list comprehension also works, but it is inefficient because of the repeated use of `list.index(..)`.

You can count consecutive equal items and then add a countdown from that count to 1 to the result:

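def runs(p):
    # An empty input has no runs (p[0] below would raise IndexError).
    if not p:
        return []
    old = p[0]  # value of the current run
    n = 0       # length of the current run so far
    q = []
    for x in p:
        if x == old:
            n += 1
        else:
            # The run ended: emit the countdown n, n-1, ..., 1.
            q.extend(range(n, 0, -1))
            n = 1
            old = x

    # Emit the countdown for the final run.
    q.extend(range(n, 0, -1))

    return q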

(A few minutes later) Oh, this is the same as an earlier answer, though not directly comparable. This version seems to ...

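Hint: try looking at `x_list` backwards. Do you notice any clear pattern? :)

Can you comment on how the complexity of this approach compares with @Aran Fey's answer? My tests with `timeit` show that for a list of length 5 this solution takes 5.93 s for 1 million evaluations, while @Aran Fey's answer takes 2.998 s (best of 3). For a list of length 10000, your answer takes about 10.54 s for 1000 evaluations, while the other answer takes 0.92 s. Is it the subsequent iteration over the `groupby` result that makes this approach so expensive?

I like this answer because it is a) native Python, b) O(n) runtime, c) easy to read and understand.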

>>> [x_list[i:].count(x) for i, x in enumerate(x_list)]
[2, 1, 1, 3, 2, 1]
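
One caveat about the slice-and-count one-liner, offered as an observation rather than something stated in the thread: `count` tallies every later occurrence of a value, not only the consecutive ones, so the result matches the run lengths only when equal values are contiguous. It also rescans the tail of the list for every position, which makes it quadratic. The wrapper name `count_based` below is hypothetical.

```python
def count_based(values):
    return [values[i:].count(x) for i, x in enumerate(values)]

# Equal values are contiguous here, so the result matches the run lengths.
print(count_based([1, 1, 2, 3, 3, 3]))  # [2, 1, 1, 3, 2, 1]

# The two 1s are not adjacent, so the run lengths would be [1, 1, 1].
print(count_based([1, 2, 1]))           # [2, 1, 1]
```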
