Python: summing and grouping through a list


python sorting


I have a large list of numbers, like this:

a = [133000, 126000, 123000, 108000, 96700, 96500, 93800, 
 93200, 92100, 90000, 88600, 87000, 84300, 82400, 80700,
 79900, 79000, 78800, 76100, 75000, 15300, 15200, 15100,
 8660, 8640, 8620, 8530, 2590, 2590, 2580, 2550, 2540, 2540, 
 2510, 2510, 1290, 1280, 1280, 1280, 1280, 951, 948, 948,
 947, 946, 945, 609, 602, 600, 599, 592, 592, 592, 591, 583]

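What I want to do is loop through this list one item at a time and check whether a value is above a certain threshold (say 40000). If it is above the threshold, we put that value in a new list on its own and forget about it. Otherwise, we wait until the sum of the values exceeds the threshold, then put those values in a list and carry on looping. Finally, if the last values do not sum to more than the threshold, we just add them to the last list.

In case I'm not being clear, consider a simple example with a threshold of 15: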

[20, 10, 9, 8, 8, 7, 6, 2, 1]

The final list should look like this:

[[20], [10, 9], [8, 8], [7, 6, 2, 1]]

I'm really bad at maths and Python, and I'm at a loss. I came up with some basic code, but it doesn't actually work:

def sortthislist(list):
    list = a
    newlist = []
    for i in range(len(list)):
        while sum(list[i]) >= 40000:
            newlist.append(list[i])
    return newlist

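Any help would be much appreciated. Sorry for the long post.

The function below takes your input list and a limit to check against, then returns the grouped list: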

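a = [20, 10, 9, 8, 8, 7, 6, 2, 1]

def func(a, lim):
    out = []
    temp = []
    for i in a:
        if i > lim:
            # A value above the limit forms a group on its own.
            out.append([i])
        else:
            temp.append(i)
            if sum(temp) > lim:
                # The pending values now exceed the limit: emit them.
                out.append(temp)
                temp = []
    if temp:
        # Leftover values that never passed the limit join the last group,
        # as the question asks.
        if out:
            out[-1].extend(temp)
        else:
            out.append(temp)
    return out

print(func(a, 15))
# [[20], [10, 9], [8, 8], [7, 6, 2, 1]]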

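In Python you can iterate over the list itself rather than over its indices, which is why I use for i in a instead of for i in range(len(a)).

In the function, out is the list that is returned at the end. temp is a temporary list that collects values until the sum of temp exceeds your lim value; at that point temp is appended to out and replaced with an empty list.

Hope this helps :)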
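
As a rough sketch of the same idea applied to the list from the question, the version below keeps a running total instead of calling sum() on the temporary list at every step, and folds any trailing remainder into the last group. The name group_by_threshold and its variable names are made up for illustration; the threshold of 40000 is the one mentioned in the question.

```python
def group_by_threshold(values, threshold):
    """Group values so that each emitted group sums to more than threshold.

    Hypothetical helper: values above the threshold become singleton groups,
    smaller values are collected until their running total passes it.
    """
    groups = []
    pending = []        # values collected since the last emitted group
    pending_total = 0   # running sum of pending, avoids repeated sum() calls
    for value in values:
        if value > threshold:
            groups.append([value])
            continue
        pending.append(value)
        pending_total += value
        if pending_total > threshold:
            groups.append(pending)
            pending = []
            pending_total = 0
    if pending:
        # The remainder never passed the threshold: merge it into the last group.
        if groups:
            groups[-1].extend(pending)
        else:
            groups.append(pending)
    return groups


a = [133000, 126000, 123000, 108000, 96700, 96500, 93800,
     93200, 92100, 90000, 88600, 87000, 84300, 82400, 80700,
     79900, 79000, 78800, 76100, 75000, 15300, 15200, 15100,
     8660, 8640, 8620, 8530, 2590, 2590, 2580, 2550, 2540, 2540,
     2510, 2510, 1290, 1280, 1280, 1280, 1280, 951, 948, 948,
     947, 946, 945, 609, 602, 600, 599, 592, 592, 592, 591, 583]

groups = group_by_threshold(a, 40000)
for g in groups:
    print(sum(g), g)
```

With this data the first twenty values each exceed 40000 on their own, so they come out as singleton groups; the smaller values after them are batched together.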
def group(L, threshold):
    answer = []
    if not L:
        return answer
    start = 0
    sofar = L[0]
    for i, num in enumerate(L[1:], 1):
        if sofar >= threshold:
            # The current run has reached the threshold: cut it here.
            answer.append(L[start:i])
            sofar = num
            start = i
        else:
            sofar += num
    # Whatever is left after the last cut becomes the final group.
    answer.append(L[start:])
    return answer

The result is:
[[20], [10, 3, 9], [7, 6, 5], [4]]

Here is another approach:
def group_by_sum(a, lim):
    out = []
    group = None
    for i in a:
        if group is None:
            group = []
            out.append(group)

        group.append(i)

        if sum(group) > lim:
            group = None
    return out

print(group_by_sum(a, 15))

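We already have plenty of working answers, but here are two more approaches.

We can use itertools.groupby to collect groups like this, given a stateful accumulator that knows about the contents of the current group. We end up with a sequence of (key, group) pairs, so a little extra filtering keeps only the groups. Also, since itertools provides iterators, we convert them to lists for printing.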
from itertools import groupby

class Thresholder:
  def __init__(self, threshold):
    self.threshold=threshold
    self.sum=0
    self.group=0
  def __call__(self, value):
    # Start a new group once the running sum has passed the threshold.
    if self.sum>self.threshold:
      self.sum=value
      self.group+=1
    else:
      self.sum+=value
    return self.group
print([list(g) for k,g in groupby([20, 10, 9, 8, 8, 7, 6, 2, 1], Thresholder(15))])
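
To make the (key, group) pairs more concrete, here is a small sketch that prints the group number the stateful key assigns to each value, and then the pairs that groupby produces from them. The class name RunningSumKey is invented for this illustration; it follows the same logic as the key class above.

```python
from itertools import groupby


class RunningSumKey:
    """Stateful key function (hypothetical name): maps each value to a group number."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.total = 0
        self.group = 0

    def __call__(self, value):
        if self.total > self.threshold:
            # The previous group already passed the threshold: start a new one.
            self.total = value
            self.group += 1
        else:
            self.total += value
        return self.group


data = [20, 10, 9, 8, 8, 7, 6, 2, 1]

# Keys seen by groupby, one per value, in order.
key = RunningSumKey(15)
keys = [key(v) for v in data]
print(keys)
# [0, 1, 1, 2, 2, 3, 3, 3, 3]

# groupby merges consecutive values that share a key; a fresh key object is
# needed because the first one has already consumed the data.
pairs = [(k, list(g)) for k, g in groupby(data, RunningSumKey(15))]
print(pairs)
# [(0, [20]), (1, [10, 9]), (2, [8, 8]), (3, [7, 6, 2, 1])]
```

Because the key object carries state, it should be created anew for every pass over the data.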

The operation can also be done as a single reduce call:
# reduce is a builtin in Python 2; in Python 3 it lives in functools.
from functools import reduce

def accumulator(result, value):
  last=result[-1]
  if sum(last)>threshold:
    result.append([value])
  else:
    last.append(value)
  return result
threshold=15
print(reduce(accumulator, [20, 10, 9, 8, 8, 7, 6, 2, 1], [[]]))

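Because of the repeated calls to sum(), this version does not scale to many values, and the global variable for the threshold is rather clumsy. Also, calling it on an empty list still leaves one empty group.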
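
One possible way around those drawbacks, sketched here under my own naming (make_accumulator and group_with_reduce are not from any library), is to close over the threshold and carry the running total along with the groups in the reduce state:

```python
from functools import reduce


def make_accumulator(threshold):
    """Build a reduce step that closes over threshold instead of using a global."""
    def accumulator(state, value):
        groups, running = state
        if groups and running <= threshold:
            # The current group has not passed the threshold yet: extend it.
            groups[-1].append(value)
            return groups, running + value
        # No group yet, or the current one is complete: start a new group.
        groups.append([value])
        return groups, value
    return accumulator


def group_with_reduce(values, threshold):
    # The state is (groups, running total of the last group); starting from
    # an empty list of groups means empty input yields no groups at all.
    groups, _ = reduce(make_accumulator(threshold), values, ([], 0))
    return groups


print(group_with_reduce([20, 10, 9, 8, 8, 7, 6, 2, 1], 15))
# [[20], [10, 9], [8, 8], [7, 6, 2, 1]]
print(group_with_reduce([], 15))
# []
```

Each step now does a constant amount of work, at the cost of a slightly more awkward state tuple.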
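Edit: the logic in the question requires that values above the threshold go into groups of their own (not shared with the smaller values collected so far). I did not think of that when writing these versions, but the accepted answer by Ffisegydd handles it. If the input data is sorted in descending order there is no effective difference, and all of the sample data appears to be sorted that way.

Which version of Python are you using?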