Python heapq sorts a list incorrectly?

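I am trying to merge several lists into one sorted list containing the numbers and names of sections, subsections, and subsubsections. The program looks like this: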

import heapq

sections = ['1. Section', '2. Section', '3. Section', '4. Section', '5. Section', '6. Section', '7. Section', '8. Section', '9. Section', '10. Section', '11. Section', '12. Section']
subsections = ['1.1 Subsection', '1.2 Subsection', '1.3 Subsection', '1.4 Subsection', '2.1 Subsection', '4.1 My subsection', '7.1 Subsection', '8.1 Subsection', '12.1 Subsection']
subsubsections = ['1.2.1 Subsubsection', '1.2.2 Subsubsection', '1.4.1 Subsubsection', '2.1.1 Subsubsection', '7.1.1 Subsubsection', '8.1.1 Subsubsection', '12.1.1 Subsubsection']

sorted_list = list(heapq.merge(sections, subsections, subsubsections))

print(sorted_list)
What I get is:

['1. Section', '1.1 Subsection', '1.2 Subsection', '1.2.1 Subsubsection', '1.2.2 Subsubsection', '1.3 Subsection', '1.4 Subsection', '1.4.1 Subsubsection', '2. Section', '2.1 Subsection', '2.1.1 Subsubsection', '3. Section', '4. Section', '4.1 My subsection', '5. Section', '6. Section', '7. Section', '7.1 Subsection', '7.1.1 Subsubsection', '8. Section', '8.1 Subsection', '12.1 Subsection', '8.1.1 Subsubsection', '12.1.1 Subsubsection', '9. Section', '10. Section', '11. Section', '12. Section']
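My 12.1 subsection and 12.1.1 subsubsection end up under section 8 instead of section 12.

Why does this happen? The original lists are sorted, and everything apparently goes fine up to number 10.

I don't know why this is happening. Is there a better way to sort these into a "tree" based on the numbers in the list? I am building a table of contents, and this is what it will return (once I filter the list).

Note the 12.1 subsection after the 8.1 subsection, and the 12.1.1 subsubsection after the 8.1.1 subsubsection.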

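Your lists may look sorted to the human eye. To Python, however, your inputs are not fully sorted, because it compares strings lexicographically. That means '12' comes before '8' in sorted order: strings are compared character by character, and the first characters, '1' and '8', already decide the result.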
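To see the comparison rule in isolation, here is a small sketch. The variable names are illustrative and not part of the question's code.

```python
# Strings are compared character by character, so '1' < '8' decides the
# result before the second digit of '12' is ever looked at.
print('12' < '8')                                  # True
print('12.1 Subsection' < '8.1.1 Subsubsection')   # True

# Integers compare by value, which is what section numbers need.
print(12 < 8)                                      # False

# Plain sorted() on strings follows the same lexicographic order.
numbers_as_text = ['8', '9', '10', '12']
lexicographic = sorted(numbers_as_text)
print(lexicographic)                               # ['10', '12', '8', '9']
```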

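The merge is therefore entirely correct: once the '8.1' string has been seen, the string starting with '12.1' comes up next, and the string starting with '8.1.1' sorts after it.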
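heapq.merge() assumes that each input is already sorted under the ordering it uses. The sketch below checks that assumption for a shortened version of the sections list, once with plain string comparison and once with a numeric key. The helpers is_sorted and section_key are hypothetical names introduced only for this illustration.

```python
sections = ['8. Section', '9. Section', '10. Section', '12. Section']

def section_key(s):
    # Hypothetical helper: '12.1 Subsection' -> [12, 1]
    return [int(d) for d in s.partition(' ')[0].split('.') if d]

def is_sorted(items, key=None):
    # True when every element is <= its successor under the given key.
    keys = [key(x) for x in items] if key else list(items)
    return all(a <= b for a, b in zip(keys, keys[1:]))

sorted_as_strings = is_sorted(sections)
sorted_as_numbers = is_sorted(sections, key=section_key)
print(sorted_as_strings)   # False: '9. Section' > '10. Section' as strings
print(sorted_as_numbers)   # True:  [9] < [10] as integer lists
```

Because the input is not sorted as strings, a merge without a key cannot be expected to produce sorted output.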

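To sort correctly, you have to use a key function that extracts the section number from each string as a sequence of integers: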

section = lambda s: [int(d) for d in s.partition(' ')[0].split('.') if d]
heapq.merge(sections, subsections, subsubsections, key=section)
Note that the key argument is only available in Python 3.5 and later; in earlier versions you have to do a manual decorate-merge-undecorate dance.

Demo (using Python 3.6):
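A minimal sketch of such a run, assuming Python 3.5 or later and a shortened version of the question's lists. The key is written here as a named function rather than a lambda, which is a stylistic choice and not a requirement.

```python
import heapq

# Shortened versions of the lists from the question.
sections = ['8. Section', '9. Section', '12. Section']
subsections = ['8.1 Subsection', '12.1 Subsection']
subsubsections = ['8.1.1 Subsubsection', '12.1.1 Subsubsection']

def section(s):
    # '12.1.1 Subsubsection' -> [12, 1, 1]; the trailing dot in '8.' is ignored.
    return [int(d) for d in s.partition(' ')[0].split('.') if d]

# heapq.merge() accepts key= from Python 3.5 onwards.
merged = list(heapq.merge(sections, subsections, subsubsections, key=section))
for entry in merged:
    print(entry)
```

With the numeric key, the 12.x entries follow '12. Section' instead of being interleaved with the 8.x entries.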

The keyed merge is easy to backport to Python 3.3 and 3.4:

import heapq

def _heappop_max(heap):
    lastelt = heap.pop()
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        heapq._siftup_max(heap, 0)
        return returnitem
    return lastelt

def _heapreplace_max(heap, item):
    returnitem = heap[0]
    heap[0] = item
    heapq._siftup_max(heap, 0)
    return returnitem

def merge(*iterables, key=None, reverse=False):    
    h = []
    h_append = h.append

    if reverse:
        _heapify = heapq._heapify_max
        _heappop = _heappop_max
        _heapreplace = _heapreplace_max
        direction = -1
    else:
        _heapify = heapq.heapify
        _heappop = heapq.heappop
        _heapreplace = heapq.heapreplace
        direction = 1

    if key is None:
        for order, it in enumerate(map(iter, iterables)):
            try:
                next = it.__next__
                h_append([next(), order * direction, next])
            except StopIteration:
                pass
        _heapify(h)
        while len(h) > 1:
            try:
                while True:
                    value, order, next = s = h[0]
                    yield value
                    s[0] = next()           # raises StopIteration when exhausted
                    _heapreplace(h, s)      # restore heap condition
            except StopIteration:
                _heappop(h)                 # remove empty iterator
        if h:
            # fast case when only a single iterator remains
            value, order, next = h[0]
            yield value
            yield from next.__self__
        return

    for order, it in enumerate(map(iter, iterables)):
        try:
            next = it.__next__
            value = next()
            h_append([key(value), order * direction, value, next])
        except StopIteration:
            pass
    _heapify(h)
    while len(h) > 1:
        try:
            while True:
                key_value, order, value, next = s = h[0]
                yield value
                value = next()
                s[0] = key(value)
                s[2] = value
                _heapreplace(h, s)
        except StopIteration:
            _heappop(h)
    if h:
        key_value, order, value, next = h[0]
        yield value
        yield from next.__self__
A decorate-merge-undecorate approach is just as simple:

def decorate(iterable, key):
    for elem in iterable:
        yield key(elem), elem

# Named merged rather than sorted, so the built-in sorted() is not shadowed.
merged = [v for k, v in heapq.merge(
    decorate(sections, section), decorate(subsections, section),
    decorate(subsubsections, section))]
Because your inputs are already sorted, using a merge is more efficient. As a last resort, however, you could just use sorted():

from itertools import chain
result = sorted(chain(sections, subsections, subsubsections), key=section)
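As a quick sanity check, the sketch below runs both approaches on shortened lists and compares the results. It avoids the key argument of heapq.merge(), so it should also work on Python versions before 3.5. The names via_merge and via_sorted are made up for this example.

```python
import heapq
from itertools import chain

def section(s):
    # '10.1 Subsection' -> [10, 1]
    return [int(d) for d in s.partition(' ')[0].split('.') if d]

def decorate(iterable, key):
    # Pair each element with its sort key so tuples compare by key first.
    for elem in iterable:
        yield key(elem), elem

sections = ['1. Section', '2. Section', '10. Section']
subsections = ['1.1 Subsection', '10.1 Subsection']

# Decorate, merge lazily, then strip the keys again.
via_merge = [v for k, v in heapq.merge(decorate(sections, section),
                                       decorate(subsections, section))]

# Concatenate everything and do a full sort with the same key.
via_sorted = sorted(chain(sections, subsections), key=section)

print(via_merge)
print(via_merge == via_sorted)   # True
```

One design difference to keep in mind: when two entries have equal keys, the decorated merge falls back to comparing the original strings, while sorted() keeps their input order.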

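As explained in the other answer, you have to specify a sort key; otherwise Python sorts the strings lexicographically. If you are using Python 3.5+, you can use the key argument of the merge function; on versions before 3.5 you can use itertools.chain with sorted. As a general approach, you can use a regex to find the numbers and convert them to int: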

In [18]: from itertools import chain
In [19]: import re
In [23]: sorted(chain.from_iterable((sections, subsections, subsubsections)),
                key = lambda x: [int(i) for i in re.findall(r'\d+', x)])
Out[23]: 
['1. Section',
 '1.1 Subsection',
 '1.2 Subsection',
 '1.2.1 Subsubsection',
 '1.2.2 Subsubsection',
 '1.3 Subsection',
 '1.4 Subsection',
 '1.4.1 Subsubsection',
 '2. Section',
 '2.1 Subsection',
 '2.1.1 Subsubsection',
 '3. Section',
 '4. Section',
 '4.1 My subsection',
 '5. Section',
 '6. Section',
 '7. Section',
 '7.1 Subsection',
 '7.1.1 Subsubsection',
 '8. Section',
 '8.1 Subsection',
 '8.1.1 Subsubsection',
 '9. Section',
 '10. Section',
 '11. Section',
 '12. Section',
 '12.1 Subsection',
 '12.1.1 Subsubsection']
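One caveat with a pattern like r'\d+' is that it collects every number in the string, including digits that appear in the title itself. The sketch below contrasts it with a key that only reads the leading numbering. The helper names leading_numbers and all_numbers and the sample titles are invented for this illustration.

```python
import re

def all_numbers(title):
    # Every run of digits anywhere in the string.
    return [int(n) for n in re.findall(r'\d+', title)]

def leading_numbers(title):
    # Only the dotted numbering at the very start, e.g. '2.1' -> [2, 1].
    match = re.match(r'\d+(?:\.\d+)*', title)
    return [int(n) for n in match.group().split('.')] if match else []

titles = ['2.1 Overview', '2. Python 3 notes', '10. Appendix']

by_all = sorted(titles, key=all_numbers)
by_leading = sorted(titles, key=leading_numbers)

print(by_all)      # '2. Python 3 notes' gets key [2, 3] and lands after 2.1
print(by_leading)  # '2. Python 3 notes' gets key [2] and comes first
```

For titles without digits, as in the question, both keys give the same order.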

Because it operates on the lexicographic order of the strings, not on your versions as "numbers"...
So is there another sorting algorithm that could do this instead of the heap? I am making a plugin for Sublime Text 3, so it uses Python 3, but I am not sure which version.
Right, I tried it, and it is definitely a version < 3.5, because I get TypeError: merge() got an unexpected keyword argument 'key'
@dingo_d Sublime is Python 3.3, so you have to feed in tuples; the first element is the output of the section function, the second is the original string. Then you can merge, and then extract the strings again. You could also use the sorted() function instead of merging.
Thanks for the help. Kasramvd's answer worked, so I accepted his answer; +1 for the explanation :)