
Output order of heapq.nlargest changes when a key function is used (Python)

Can someone explain why the output order changes when nlargest is called with a key function that looks only at the first element of each tuple?

import heapq
heap_arr = [(1, 'a'), (2, 'b'), (2, 'b'), (3, 'c'), (3, 'd')]

heapq.nlargest(2, heap_arr)
# Perfectly fine - output is [(3, 'd'), (3, 'c')]
# This is similar to heapq.nlargest(2, heap_arr, key=lambda a: (a[0], a[1]))

heapq.nlargest(2, heap_arr, key=lambda a: a[0])
# Output is [(3, 'c'), (3, 'd')]... Why?

Why does (3, 'c') come before (3, 'd') in the second example? I am asking because the order of the tuples in the output list matters to me.

Short answer:

heapq.nlargest(2, heap_arr) returns [(3, 'd'), (3, 'c')] because

In [6]: (3, 'd') > (3, 'c')
Out[6]: True
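
heapq.nlargest(2, heap_arr, key=lambda a: a[0]) returns [(3, 'c'), (3, 'd')] because heapq.nlargest, like sorted, is stable. Since the keys match (both are 3), the items come back in the order in which they appear in heap_arr: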

In [8]: heapq.nlargest(2, [(3, 'c'), (3, 'd')], key=lambda a: a[0])
Out[8]: [(3, 'c'), (3, 'd')]

In [9]: heapq.nlargest(2, [(3, 'd'), (3, 'c')], key=lambda a: a[0])
Out[9]: [(3, 'd'), (3, 'c')]

Longer answer:

heapq.nlargest(n, iterable, key) is equivalent to

sorted(iterable, key=key, reverse=True)[:n]

(although heapq.nlargest computes its result in a different way). Nevertheless, we can use this equivalence to check that heapq.nlargest behaves as we expect:

import heapq
heap_arr = [(1, 'a'), (2, 'b'), (2, 'b'), (3, 'c'), (3, 'd')]

assert heapq.nlargest(2, heap_arr) == sorted(heap_arr, reverse=True)[:2]

assert heapq.nlargest(2, heap_arr, key=lambda a: a[0]) == sorted(heap_arr, key=lambda a: a[0], reverse=True)[:2]

So, if you accept this equivalence, you only need to confirm that

In [47]: sorted(heap_arr, reverse=True)
Out[47]: [(3, 'd'), (3, 'c'), (2, 'b'), (2, 'b'), (1, 'a')]

In [48]: sorted(heap_arr, key=lambda a: a[0], reverse=True)
Out[48]: [(3, 'c'), (3, 'd'), (2, 'b'), (2, 'b'), (1, 'a')]
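
When key=lambda a: a[0] is used, (3, 'c') and (3, 'd') are sorted according to the same key value, 3. Because the sort is stable, two items with equal keys (such as (3, 'c') and (3, 'd')) appear in the result in the same order in which they appear in heap_arr.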
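
The equivalence used above can also be checked on inputs with many tied keys. The following is only a sketch: nlargest_via_sorted is a hypothetical helper name, and the seed and data sizes are arbitrary choices, not part of the original answer.

```python
import heapq
import random
from operator import itemgetter


def nlargest_via_sorted(n, iterable, key=None):
    """Hypothetical reference: the sorted() expression from the equivalence."""
    return sorted(iterable, key=key, reverse=True)[:n]


first = itemgetter(0)  # same role as key=lambda a: a[0]
random.seed(0)         # arbitrary seed, only to make the run repeatable

for _ in range(200):
    # Few distinct first elements, so ties on the key are common.
    data = [(random.randint(0, 3), random.choice("abcd")) for _ in range(12)]
    for n in range(len(data) + 2):
        assert heapq.nlargest(n, data) == nlargest_via_sorted(n, data)
        assert (heapq.nlargest(n, data, key=first)
                == nlargest_via_sorted(n, data, key=first))

heap_arr = [(1, 'a'), (2, 'b'), (2, 'b'), (3, 'c'), (3, 'd')]
# Items tied on key 3, in the order the stable reverse sort returns them.
tied = [item for item in sorted(heap_arr, key=first, reverse=True)
        if item[0] == 3]
```

If the loop finishes without an AssertionError, nlargest agreed with the sorted() expression on every generated input, ties included.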


Even longer answer:

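To see what is actually going on, you can use a debugger, or simply copy the heapq code into a file and use print statements to study how the heap (i.e. the variable result) changes as the elements of the iterable are examined and possibly pushed onto the heap. Run this code: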

def heapreplace(heap, item):
    """Pop and return the current smallest value, and add the new item.

    This is more efficient than heappop() followed by heappush(), and can be
    more appropriate when using a fixed-size heap.  Note that the value
    returned may be larger than item!  That constrains reasonable uses of
    this routine unless written as part of a conditional replacement:

        if item > heap[0]:
            item = heapreplace(heap, item)
    """
    returnitem = heap[0]    # raises appropriate IndexError if heap is empty
    heap[0] = item
    _siftup(heap, 0)
    return returnitem

def heapify(x):
    """Transform list into a heap, in-place, in O(len(x)) time."""
    n = len(x)
    # Transform bottom-up.  The largest index there's any point to looking at
    # is the largest with a child index in-range, so must have 2*i + 1 < n,
    # or i < (n-1)/2.  If n is even = 2*j, this is (2*j-1)/2 = j-1/2 so
    # j-1 is the largest, which is n//2 - 1.  If n is odd = 2*j+1, this is
    # (2*j+1-1)/2 = j so j-1 is the largest, and that's again n//2-1.
    for i in reversed(range(n//2)):
        _siftup(x, i)

# 'heap' is a heap at all indices >= startpos, except possibly for pos.  pos
# is the index of a leaf with a possibly out-of-order value.  Restore the
# heap invariant.
def _siftdown(heap, startpos, pos):
    newitem = heap[pos]
    # Follow the path to the root, moving parents down until finding a place
    # newitem fits.
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if newitem < parent:
            heap[pos] = parent
            pos = parentpos
            continue
        break
    heap[pos] = newitem


def _siftup(heap, pos):
    endpos = len(heap)
    startpos = pos
    newitem = heap[pos]
    # Bubble up the smaller child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        # Move the smaller child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
    # to its final resting place (by sifting its parents down).
    heap[pos] = newitem
    _siftdown(heap, startpos, pos)


def nlargest(n, iterable, key=None):
    """Find the n largest elements in a dataset.

    Equivalent to:  sorted(iterable, key=key, reverse=True)[:n]
    """

    # Short-cut for n==1 is to use max()
    if n == 1:
        it = iter(iterable)
        sentinel = object()
        if key is None:
            result = max(it, default=sentinel)
        else:
            result = max(it, default=sentinel, key=key)
        return [] if result is sentinel else [result]

    # When n>=size, it's faster to use sorted()
    try:
        size = len(iterable)
    except (TypeError, AttributeError):
        pass
    else:
        if n >= size:
            return sorted(iterable, key=key, reverse=True)[:n]

    # When key is none, use simpler decoration
    if key is None:
        it = iter(iterable)
        result = [(elem, i) for i, elem in zip(range(0, -n, -1), it)]
        print('result: {}'.format(result))
        if not result:
            return result
        heapify(result)
        top = result[0][0]
        order = -n
        _heapreplace = heapreplace
        for elem in it:
            print('elem: {}'.format(elem))
            if top < elem:
                _heapreplace(result, (elem, order))
                print('result: {}'.format(result))
                top, _order = result[0]
                order -= 1
        result.sort(reverse=True)
        return [elem for (elem, order) in result]

    # General case, slowest method
    it = iter(iterable)
    result = [(key(elem), i, elem) for i, elem in zip(range(0, -n, -1), it)]
    print('result: {}'.format(result))
    if not result:
        return result
    heapify(result)
    top = result[0][0]
    order = -n
    _heapreplace = heapreplace
    for elem in it:
        print('elem: {}'.format(elem))
        k = key(elem)
        if top < k:
            _heapreplace(result, (k, order, elem))
            print('result: {}'.format(result))
            top, _order, _elem = result[0]
            order -= 1
    result.sort(reverse=True)
    return [elem for (k, order, elem) in result]


heap_arr = [(1, 'a'), (2, 'b'), (2, 'b'), (3, 'c'), (3, 'd')]

nlargest(2, heap_arr)
print('-'*10)
nlargest(2, heap_arr, key=lambda a: a[0]) 
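
The last result line printed by the first call, referred to here as (1), is [((3, 'c'), -3), ((3, 'd'), -4)]. The last one printed by the second call, (2), is [(3, -4, (3, 'd')), (3, -3, (3, 'c'))]. Remember that heap[0] is always the smallest item, so in the heap of case (1) (3, 'c') ends up before (3, 'd'), while in case (2) the reverse happens. The final result.sort(reverse=True) then flips each heap, which gives [(3, 'd'), (3, 'c')] in the first case and [(3, 'c'), (3, 'd')] in the second.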

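So the behavior you see stems from the fact that when key is None, the elements of the iterable are placed in the heap as tuples of the form (elem, order), where order decreases by 1 with each heapreplace. In contrast, when key is not None, the tuples have the form (k, order, elem), where k is key(elem). This difference in the form of the tuples leads to the difference in the results.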

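In the first case, elem ends up controlling the order. In the second case, since the k values are equal, order ends up controlling the order. The purpose of order is to break ties in a stable way. So in the end we reach the same conclusion as when we examined sorted(heap_arr, key=lambda a: a[0], reverse=True): when the keys are equal, (3, 'c') and (3, 'd') come out in the same order as they appear in heap_arr.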
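
The effect of the two tuple shapes can be reproduced without a heap. This is a simplified sketch, not the heapq implementation: the helper names top_n_plain and top_n_keyed are hypothetical, and the whole input is sorted instead of keeping a bounded heap.

```python
heap_arr = [(1, 'a'), (2, 'b'), (2, 'b'), (3, 'c'), (3, 'd')]


def top_n_plain(n, items):
    """Hypothetical helper mirroring the key=None case: (elem, order)."""
    # elem is compared first, so ties on a[0] are settled by a[1].
    decorated = [(elem, -index) for index, elem in enumerate(items)]
    decorated.sort(reverse=True)
    return [elem for (elem, _order) in decorated[:n]]


def top_n_keyed(n, items, key):
    """Hypothetical helper mirroring the key case: (k, order, elem)."""
    # order falls as the index rises, so among equal keys the earliest
    # element forms the largest tuple. order is unique, which means
    # elem itself is never compared.
    decorated = [(key(elem), -index, elem) for index, elem in enumerate(items)]
    decorated.sort(reverse=True)
    return [elem for (_k, _order, elem) in decorated[:n]]


plain = top_n_plain(2, heap_arr)
keyed = top_n_keyed(2, heap_arr, key=lambda a: a[0])
```

Here plain is [(3, 'd'), (3, 'c')] and keyed is [(3, 'c'), (3, 'd')], matching the two nlargest calls from the question.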

If you want ties in a[0] to be broken by a itself, use

In [53]: heapq.nlargest(2, heap_arr, key=lambda a: (a[0], a))
Out[53]: [(3, 'd'), (3, 'c')]
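
For reference, these are the tuple comparisons that determine which item sits at heap[0] in the key=None case and in the key case, respectively:

In [45]: ((3, 'c'), -3) < ((3, 'd'), -4)
Out[45]: True

In [46]: (3, -4, (3, 'd')) < (3, -3, (3, 'c'))
Out[46]: True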
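
The key=lambda a: (a[0], a) approach compares the elements themselves, which only works if they are orderable. As a possible alternative, ties can be broken by position instead. This is a sketch under that assumption; nlargest_latest_first is a hypothetical helper, not part of heapq.

```python
import heapq

heap_arr = [(1, 'a'), (2, 'b'), (2, 'b'), (3, 'c'), (3, 'd')]


def nlargest_latest_first(n, items, key):
    """Hypothetical helper: n largest by key; among equal keys, the item
    that occurs later in items comes first."""
    # Pair every element with its index and make the index part of the key,
    # so the elements themselves are never compared with each other.
    best = heapq.nlargest(
        n, enumerate(items), key=lambda pair: (key(pair[1]), pair[0])
    )
    return [elem for _index, elem in best]


latest_first = nlargest_latest_first(2, heap_arr, key=lambda a: a[0])
```

For heap_arr this gives [(3, 'd'), (3, 'c')], the same as the composite key, because (3, 'd') occurs after (3, 'c') in the list.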