Python: finding the most efficient grouping of pairs


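I have a team of people, and I want each of them to have a 1:1 meeting with every other person on the team. A given person can only meet with one other person at a time, so I want to do the following:

1. Find every possible pairing.
2. Group the pairs into "rounds" of meetings, where each person can appear in a round only once, and where each round contains as many pairs as possible so that all possible pairings are covered in the smallest number of rounds.

To demonstrate the problem in terms of desired input/output, say I have the following list: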

>>> people = ['Dave', 'Mary', 'Susan', 'John']

I would like to generate the following output:

>>> for round in make_rounds(people):
...     print(round)
[('Dave', 'Mary'), ('Susan', 'John')]
[('Dave', 'Susan'), ('Mary', 'John')]
[('Dave', 'John'), ('Mary', 'Susan')]

If I had an odd number of people, I would expect a result like this:

>>> people = ['Dave', 'Mary', 'Susan']
>>> for round in make_rounds(people):
...     print(round)
[('Dave', 'Mary')]
[('Dave', 'Susan')]
[('Mary', 'Susan')]
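
The requirements above (nobody is in two pairs within the same round, and every pairing is covered exactly once) can be checked mechanically. The following is only a minimal sketch: `check_rounds` is a hypothetical helper name that does not appear in the original question, and it assumes the names in `people` are unique.

```python
from itertools import combinations

def check_rounds(people, rounds):
    """Return True if `rounds` is a valid schedule for `people`.

    Valid means: nobody appears twice in a round, and every
    unordered pair of people meets exactly once overall.
    """
    expected = {frozenset(pair) for pair in combinations(people, 2)}
    seen = set()
    for rnd in rounds:
        busy = set()  # people already scheduled in this round
        for a, b in rnd:
            if a in busy or b in busy:
                return False  # someone is double-booked in this round
            busy.update((a, b))
            meeting = frozenset((a, b))
            if meeting in seen or meeting not in expected:
                return False  # repeated or unknown pairing
            seen.add(meeting)
    return seen == expected
```

A check like this makes it easier to compare alternative implementations of `make_rounds`, since it accepts any ordering of pairs and rounds.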

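The crux of this problem is that I need my solution to be performant (within reason). I have written code that works, but as the size of people grows it becomes exponentially slower. I don't know enough about writing performant algorithms to tell whether my code is inefficient or whether I'm simply bound by the parameters of the problem.

What I've tried

Step 1 is easy: I can get all possible pairings using itertools.combinations: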

>>> from itertools import combinations
>>> people_pairs = set(combinations(people, 2))
>>> print(people_pairs)
{('Dave', 'Mary'), ('Dave', 'Susan'), ('Dave', 'John'), ('Mary', 'Susan'), ('Mary', 'John'), ('Susan', 'John')}

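To work out the rounds themselves, I'm building each round with a loop like this:

1. Create an empty round list.
2. Iterate over a copy of the people_pairs set computed with combinations above.
3. For each person in the pair, check whether the current round already contains a pairing that includes that person.
4. If a pairing already includes either person, skip this pair for the current round. Otherwise, add the pair to the round and remove it from people_pairs.
5. Once all the pairs have been iterated over, append the round to a master rounds list.
6. Start again, since people_pairs now contains only the pairs that did not make it into the first round.

Eventually this produces the desired result and whittles people_pairs down until none are left and all the rounds have been calculated. I can already see that this takes a huge number of iterations, but I don't know a better way.

Here is my code:
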
from itertools import combinations

# test if person already exists in any pairing inside a round of pairs
def person_in_round(person, round):
    is_in_round = any(person in pair for pair in round)
    return is_in_round

def make_rounds(people):
    people_pairs = set(combinations(people, 2))
    # we will remove pairings from people_pairs whilst we build rounds, so loop as long as people_pairs is not empty
    while people_pairs:
        round = []
        # make a copy of the current state of people_pairs to iterate over safely
        for pair in set(people_pairs):
            if not person_in_round(pair[0], round) and not person_in_round(pair[1], round):
                round.append(pair)
                people_pairs.remove(pair)
        yield round

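Plotting the performance of this method for lists of 100-300 people suggests that computing the rounds for a list of 1000 people would take around 100 minutes. Is there a more efficient way to do this?

Note: I'm not actually trying to organise a meeting of 1000 people :) This is just a simple example that represents the matching/combinatorics problem I'm trying to solve.
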
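When you need fast lookups, a hash/dict is the way to go. Keep track of who has already been placed in each round in a dict rather than a list and it will be much faster.
Since you're getting into algorithms, studying big O notation will help you, and knowing which data structures are good at which operations is also key. See a guide to the time complexity of Python's built-ins. You'll see that checking whether an item is in a list is O(n), meaning it scales linearly with the size of the input. Because that check sits inside a loop, you end up with O(n^2) or worse. For dicts, lookups are generally O(1), meaning the cost does not depend on the size of the input.
Also, don't override built-ins. I've changed round to round_.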

from itertools import combinations

# test if person already exists in any pairing inside a round of pairs
def person_in_round(person, people_dict):
    return people_dict.get(person, False)

def make_rounds(people):
    people_pairs = set(combinations(people, 2))
    # we will remove pairings from people_pairs whilst we build rounds, so loop as long as people_pairs is not empty
    while people_pairs:
        round_ = []
        people_dict = {}
        # make a copy of the current state of people_pairs to iterate over safely
        for pair in set(people_pairs):
            if not person_in_round(pair[0], people_dict) and not person_in_round(pair[1], people_dict):
                round_.append(pair)
                people_dict[pair[0]] = True
                people_dict[pair[1]] = True
                people_pairs.remove(pair)
        yield round_
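
To make the list-versus-hash lookup point concrete, here is a small timing sketch using the standard `timeit` module. The container size and repeat count are arbitrary choices for illustration, and the absolute numbers will vary by machine; only the gap between the two matters.

```python
import timeit

n = 10_000
as_list = list(range(n))
as_set = set(as_list)
missing = -1  # worst case for the list: every element gets compared

# Membership test on a list scans the elements one by one: O(n).
list_time = timeit.timeit(lambda: missing in as_list, number=200)
# Membership test on a set (or dict) is a hash lookup: O(1) on average.
set_time = timeit.timeit(lambda: missing in as_set, number=200)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

A set is used here instead of a dict because only membership matters; the lookup cost is the same kind of hash lookup in both cases.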

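
Maybe I'm missing something (not entirely uncommon), but this sounds like a plain old round-robin tournament, where each team plays every other team exactly once.
There are O(n^2) methods for handling this "by hand" that work "by machine" just fine. A good description can be found under round-robin tournament scheduling.
About that O(n^2): there will be n-1 or n rounds, each requiring O(n) steps to rotate all but one of the table entries and O(n) steps to enumerate the n//2 matches in the round. You could make the rotation O(1) with a doubly linked list, but enumerating the matches is still O(n). So O(n)*O(n) = O(n^2).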
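
As a rough illustration of the rotation idea described above, here is a sketch of the circle method using `collections.deque`. The name `circle_rounds` is made up for this example, and it assumes `None` is never a real person, since `None` is used as the "bye" placeholder when the group size is odd.

```python
from collections import deque

def circle_rounds(people):
    """Yield rounds of pairs using the circle (rotation) method."""
    players = list(people)
    if len(players) < 2:
        return  # nobody to pair up
    if len(players) % 2:
        players.append(None)  # placeholder "bye" for odd-sized groups
    fixed, ring = players[0], deque(players[1:])
    half = len(players) // 2
    for _ in range(len(players) - 1):
        table = [fixed, *ring]
        # seat i meets the seat mirrored at the other end of the table
        pairs = [(table[i], table[-1 - i]) for i in range(half)]
        # drop the pair that contains the bye, if there is one
        yield [p for p in pairs if None not in p]
        ring.rotate(1)  # everyone except the fixed seat moves one place

for rnd in circle_rounds(['Dave', 'Mary', 'Susan', 'John']):
    print(rnd)
```

The rotation itself is O(1) on a deque, but building each round's pairs is still O(n), which matches the O(n^2) total described above.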

This takes about 45 seconds on my computer:
from itertools import combinations

def make_rnds(people):
    people_pairs = set(combinations(people, 2))
    # we will remove pairings from people_pairs whilst we build rnds, so loop as long as people_pairs is not empty
    while people_pairs:
        rnd = []
        rnd_set = set()
        peeps = set(people)
        # make a copy of the current state of people_pairs to iterate over safely
        for pair in set(people_pairs):
            if pair[0] not in rnd_set and pair[1] not in rnd_set:
                rnd_set.update(pair)
                rnd.append(pair)

                peeps.remove(pair[0])
                peeps.remove(pair[1])

                people_pairs.remove(pair)
                if not peeps:
                    break
        yield rnd

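To cut down on the time lost to function calls, I removed the person_in_round helper function and added two variables, rnd_set and peeps. rnd_set is the set of everyone placed in the round so far and is used to check each pair for conflicts. peeps is a copy of the set of people; each time we add a pair to rnd, we remove those two people from peeps. This lets us stop iterating over the combinations as soon as peeps is empty, that is, once everyone has been placed in the round.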
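
Two things you can do right away:
1. Don't make a copy of the set on every pass through the loop. That is a big waste of time and memory. Instead, modify the set once after each iteration.
2. Keep a separate set of the people in each round. Looking up a person in a set is an order of magnitude faster than scanning the whole round.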
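
The two suggestions above could look roughly like the following. This is a sketch under assumptions rather than the answerer's own code: the name `make_rounds_sets` is hypothetical, and the approach is still greedy, so it may use more rounds than the minimum.

```python
from itertools import combinations

def make_rounds_sets(people):
    """Greedy rounds; same idea as the question, with set bookkeeping."""
    remaining = set(combinations(people, 2))
    while remaining:
        round_pairs = []
        busy = set()  # people already scheduled in this round
        for pair in remaining:  # no copy needed: nothing is mutated here
            if pair[0] not in busy and pair[1] not in busy:
                round_pairs.append(pair)
                busy.update(pair)
        # remove the scheduled pairs in one step, after the pass is done
        remaining.difference_update(round_pairs)
        yield round_pairs
```

Deferring the removal until the pass is finished is what makes it safe to iterate over the set directly instead of over a copy.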
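
I only generate indices (because I have trouble coming up with 1000 names =), but for 1000 numbers the runtime is about 4 seconds.
The main problem with all the other approaches is that they work with pairs: there are a lot of pairs, so the runtime gets longer and longer. My approach differs in that it works with people rather than pairs. I have a dict() that maps each person to the list of other people he or she still has to meet, and each of these lists has at most N items (not N^2, as with pairs). Hence the time savings.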
#!/usr/bin/env python

from itertools import combinations
from collections import defaultdict

pairs = combinations( range(6), 2 )

pdict = defaultdict(list)
for p in pairs :
    pdict[p[0]].append( p[1] )

while len(pdict) :
    busy = set()
    print('-----')
    for p0 in pdict :
        if p0 in busy : continue

        for p1 in pdict[p0] :
            if p1 in busy : continue

            pdict[p0].remove( p1 )
            busy.add(p0)
            busy.add(p1)
            print (p0, p1)

            break

    # remove empty entries
    pdict = { k : v for k,v in pdict.items() if len(v) > 0 }

'''
output:
-----
(0, 1)
(2, 3)
(4, 5)
-----
(0, 2)
(1, 3)
-----
(0, 3)
(1, 2)
-----
(0, 4)
(1, 5)
-----
(0, 5)
(1, 4)
-----
(2, 4)
(3, 5)
-----
(2, 5)
(3, 4)
'''
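
For reuse, the per-person bookkeeping above can be wrapped in a function that returns the rounds instead of printing them. This is a Python 3 sketch with a made-up name, `rounds_by_person`, and it keeps the same greedy strategy. The listing above shows 7 rounds for 6 people, whereas a round-robin schedule for 6 people needs only 5, so the greedy result is valid but not always minimal.

```python
from collections import defaultdict
from itertools import combinations

def rounds_by_person(n):
    """Greedy rounds for people numbered 0..n-1, tracked per person."""
    todo = defaultdict(list)
    for a, b in combinations(range(n), 2):
        todo[a].append(b)  # a still has to meet b (a < b)
    rounds = []
    while todo:
        busy = set()
        current = []
        for a in todo:
            if a in busy:
                continue
            for b in todo[a]:
                if b not in busy:
                    todo[a].remove(b)
                    busy.update((a, b))
                    current.append((a, b))
                    break  # a is now booked for this round
        rounds.append(current)
        # drop people who have no meetings left
        todo = {a: rest for a, rest in todo.items() if rest}
    return rounds

print(len(rounds_by_person(6)))  # 7 rounds, matching the listing above
```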

This took about 4 minutes for 1000 people, which had me worried for a moment.
Thanks for calling me out on my use of round. I didn't even realise!
No problem. This question should serve as a guide on how to ask a question on this site. Awesome question, @Zev.

This is an implementation of the algorithm described in the Wikipedia article:

from itertools import cycle , islice, chain

def round_robin(iterable):
    items = list(iterable)
    if len(items) % 2 != 0:
        items.append(None)
    fixed = items[:1]
    cyclers = cycle(items[1:])
    rounds = len(items) - 1
    npairs = len(items) // 2
    return [
        list(zip(
            chain(fixed, islice(cyclers, npairs-1)),
            reversed(list(islice(cyclers, npairs)))
        ))
        for _ in range(rounds)
        for _ in [next(cyclers)]
    ]