Python function for the position of a string in the alphabetical list of permutations of the string's characters


python python-3.x algorithm


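I wrote code to solve the following problem, but it fails on the last two test cases. The logic I used seems sound, and even after having a coworker review it, neither of us can work out why it works for the first 8 test cases but not for the last two (which were randomly generated).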

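Problem: Given a string, return the position of the input string in the alphabetically sorted list of the permutations of that string's characters. For example, the permutations of ABAB are [AABB, ABAB, ABBA, BAAB, BABA, BBAA], and the position of ABAB in that list is 2.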
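For short inputs, the definition can be checked directly by brute force. The following is a minimal sketch, not part of the original post: the name `brute_force_position` is hypothetical, and because it enumerates every permutation it is only practical for strings of roughly ten characters or fewer.

```python
from itertools import permutations

def brute_force_position(word):
    """Return the 1-based position of word among its sorted distinct permutations."""
    # set() drops duplicates caused by repeated letters; sorted() gives alphabetical order
    distinct = sorted(set(permutations(word)))
    return distinct.index(tuple(word)) + 1

print(brute_force_position("ABAB"))  # 2
print(brute_force_position("BAAA"))  # 4
```

A reference like this is mainly useful for cross-checking a faster implementation on small test cases.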

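The logic I used to solve the problem: for larger inputs it is impossible (or at least inefficient) to generate the list of permutations, so the key is to find the position without generating the alphabetical list. This can be done from the frequencies of the characters. In the example above, the first character of ABAB is A, so before = 0, after = .5 and between = 6; maxx is therefore decreased by .5 * 6 = 3, giving minn = 1 and maxx = 3, which leaves only [AABB, ABAB, ABBA], the permutations with A as the first character. The remaining characters are BAB, with minn = 1, maxx = 3 and between = 3. For B, before is .33 and after is 0, so minn is increased by 3 * .33, giving minn = 2 and maxx = 3, which corresponds to [ABAB, ABBA], the permutations with AB as the first two characters. Doing this for every character finds the input's position in the list.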

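My code:
## Imports
import operator
from collections import Counter
from math import factorial
from functools import reduce

## Main function, returns list position
def listPosition(word):
    # turns string into list of numbers, A being 1, B being 2, and so on
    val = [ord(char)-96 for char in word.lower()]
    # the result has to be between 1 and the number of permutations
    minn = 1
    maxx = npermutations(word)
    # so we just increase the min and decrease the max based on the sum of freq
    # of the characters less than and greater than each character
    for indx in range(len(word)):
        between = (maxx+1-minn)
        before,after = sumfreq(val[indx:],val[indx])
        minn = minn + int(round((between * before),0))
        maxx = maxx - int(between * after)
    return maxx # or minn, doesn't matter. they're equal at this point

## returns the number of permutations for the string (this works)
def npermutations(word):
    num = factorial(len(word))
    mults = Counter(word).values()
    den = reduce(operator.mul, (factorial(v) for v in mults), 1)
    return int(num / den)

## returns frequency as a percent for the character in the list of chars
def frequency(val,value):
    f = [val.count(i)/len(val) for i in val]
    indx = val.index(value)
    return f[indx]

# returns the sum of all frequencies for all chars < (before) and > (after) the said char
def sumfreq(val,value):
    before = [frequency(val,i) for i in set(val) if i<value]
    after = [frequency(val,i) for i in set(val) if i>value]
    return sum(before),sum(after)

tests = ['A', 'ABAB', 'AAAB', 'BAAA', 'QUESTION', 'BOOKKEEPER', 'ABCABC',
         'IMMUNOELECTROPHORETICALLY', 'ERATXOVFEXRCVW', 'GIZVEMHQWRLTBGESTZAHMHFBL']
print(listPosition(tests[0]),"should equal 1")
print(listPosition(tests[1]),"should equal 2")
print(listPosition(tests[2]),"should equal 1")
print(listPosition(tests[3]),"should equal 4")
print(listPosition(tests[4]),"should equal 24572")
print(listPosition(tests[5]),"should equal 10743")
print(listPosition(tests[6]),"should equal 13")
print(listPosition(tests[7]),"should equal 718393983731145698173")
print(listPosition(tests[8]),"should equal 1083087583") # off by one?
print(listPosition(tests[9]),"should equal 5587060423395426613071") # off by a lot?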

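As @rici pointed out, this is a floating-point error. Fortunately, Python has the fractions module.
A couple of judicious uses of fractions.Fraction fix the issue without changing the body of the code, e.g.: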
from fractions import Fraction
...
## returns the number of permutations for the string (this works)
def npermutations(word):
    num = factorial(len(word))
    mults = Counter(word).values()
    den = reduce(operator.mul, (factorial(v) for v in mults), 1)
    return int(Fraction(num, den))
## returns frequency as a percent for the character in the list of chars
def frequency(val,value):
    f = [Fraction(val.count(i),len(val)) for i in val]
    indx = val.index(value)
    return f[indx]
...

In []:
print(listPosition(tests[0]),"should equal 1")
print(listPosition(tests[1]),"should equal 2")
print(listPosition(tests[2]),"should equal 1")
print(listPosition(tests[3]),"should equal 4")
print(listPosition(tests[4]),"should equal 24572")
print(listPosition(tests[5]),"should equal 10743")
print(listPosition(tests[6]),"should equal 13")
print(listPosition(tests[7]),"should equal 718393983731145698173")
print(listPosition(tests[8]),"should equal 1083087583")
print(listPosition(tests[9]),"should equal 5587060423395426613071")

Out[]:
1 should equal 1
2 should equal 2
1 should equal 1
4 should equal 4
24572 should equal 24572
10743 should equal 10743
13 should equal 13
718393983731145698173 should equal 718393983731145698173
1083087583 should equal 1083087583
5587060423395426613071 should equal 5587060423395426613071
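
The rounding issue can be reproduced in isolation. The following is a minimal sketch rather than code from the original post; it uses 25! (the numerator for a 25-letter word such as IMMUNOELECTROPHORETICALLY) purely as an illustrative value. True division converts the result to a 64-bit float, which carries only 53 significant bits, whereas floor division and Fraction stay exact.

```python
from fractions import Fraction
from math import factorial

n = factorial(25)  # too large to be represented exactly as a 64-bit float

via_float = int(n / 2)              # true division goes through a float and rounds
via_int = n // 2                    # floor division on ints is exact
via_fraction = int(Fraction(n, 2))  # Fraction keeps an exact numerator and denominator

print(via_float == via_int)     # False
print(via_fraction == via_int)  # True
```

The same loss of precision affects `int(num / den)` and the float frequencies in the question's code once the permutation counts grow large enough.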


Update
Based on @m69's excellent explanation, here is a much simpler implementation:
from math import factorial
from collections import Counter
from functools import reduce
from operator import mul

def position(word):
    charset = Counter(word)
    pos = 1    # Per OP 1 index
    for letter in word:
        chars = sorted(charset)
        for char in chars[:chars.index(letter)]:
            ns = Counter(charset) - Counter([char])
            pos += factorial(sum(ns.values())) // reduce(mul, map(factorial, ns.values()))
        charset -= Counter([letter])
    return pos

This gives the same results as above:
In []:
tests = ['A', 'ABAB', 'AAAB', 'BAAA', 'QUESTION', 'BOOKKEEPER', 'ABCABC',
         'IMMUNOELECTROPHORETICALLY', 'ERATXOVFEXRCVW', 'GIZVEMHQWRLTBGESTZAHMHFBL']
print(position(tests[0]),"should equal 1")
print(position(tests[1]),"should equal 2")
print(position(tests[2]),"should equal 1")
print(position(tests[3]),"should equal 4")
print(position(tests[4]),"should equal 24572")
print(position(tests[5]),"should equal 10743")
print(position(tests[6]),"should equal 13")
print(position(tests[7]),"should equal 718393983731145698173")
print(position(tests[8]),"should equal 1083087583")
print(position(tests[9]),"should equal 5587060423395426613071")

Out[]:
1 should equal 1
2 should equal 2
1 should equal 1
4 should equal 4
24572 should equal 24572
10743 should equal 10743
13 should equal 13
718393983731145698173 should equal 718393983731145698173
1083087583 should equal 1083087583
5587060423395426613071 should equal 5587060423395426613071

You could use logic that requires only integer arithmetic. First, create the lexicographically first permutation:
BOOKKEEPER  ->  BEEEKKOOPR

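Then, for each letter, you can calculate how many unique permutations it takes to move it into place. Since the first letter B is already in place, we can ignore it and look at the remaining letters: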
B EEEKKOOPR  (first)
B OOKKEEPER  (target)

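To find out how many permutations it takes to bring the O to the front, we count the unique permutations that have E in front, and then those that have K in front: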
E+EEKKOOPR -> 8! / (2! * 2! * 2!) = 40320 /  8 = 5040
K+EEEKOOPR -> 8! / (3! * 2!)      = 40320 / 12 = 3360

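Here 8 is the number of letters left to permute, and 2 and 3 are the multiplicities of the repeated letters. So after 8400 permutations we are at: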
BO EEEKKOPR
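
The count used in each step above is the number of distinct arrangements of a multiset: the factorial of the number of letters divided by the factorial of each letter's multiplicity. A minimal integer-only sketch follows; the helper name `unique_permutations` is an assumption for illustration and does not appear in the original answer.

```python
from collections import Counter
from math import factorial

def unique_permutations(letters):
    """Count distinct orderings of letters using integer arithmetic only."""
    total = factorial(len(letters))
    for multiplicity in Counter(letters).values():
        total //= factorial(multiplicity)  # each intermediate division is exact
    return total

print(unique_permutations("EEKKOOPR"))  # 5040
print(unique_permutations("EEEKOOPR"))  # 3360
```

The two printed values are the counts for E in front and K in front, which sum to the 8400 permutations skipped before reaching BO.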

Now, again, we calculate how many permutations it takes to bring the second O to the front:
E+EEKKOPR -> 7! / (2! * 2!) = 5040 / 4 = 1260
K+EEEKOPR -> 7! / (3!)      = 5040 / 6 =  840

So after 10500 permutations we are at:
BOO EEEKKPR

Then we calculate how many permutations it takes to bring the K to the front:
E+EEKKPR -> 6! / (2! * 2!) = 720 / 4 = 180

So after 10680 permutations we are at:
BOOK EEEKPR

Then we calculate how many permutations it takes to bring the second K to the front:
E+EEKPR -> 5! / 2! = 120 / 2 = 60

So after 10740 permutations we are at:
BOOKK EEEPR

The next two letters are already in place, so we can skip to:
BOOKKEE EPR

Then we calculate how many permutations it takes to bring the P to the front:
E+PR -> 2! = 2

So after 10742 permutations we are at:
BOOKKEEP ER

The last two letters are also already in place, so 10742 permutations precede BOOKKEEPER and its 1-based position is 10743.
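
The walk-through above can be traced step by step in code. This is a minimal sketch under the same integer-only approach; the generator name `trace_position` is hypothetical and not from the original answer. It reports the running number of skipped permutations after each letter is fixed.

```python
from collections import Counter
from math import factorial

def trace_position(word):
    """Yield (prefix, permutations skipped so far) after each letter is fixed."""
    remaining = Counter(word)
    skipped = 0
    for i, letter in enumerate(word):
        for smaller in sorted(remaining):
            if smaller >= letter:
                break
            # Count arrangements of the rest once `smaller` is placed in front
            rest = remaining - Counter(smaller)
            count = factorial(sum(rest.values()))
            for multiplicity in rest.values():
                count //= factorial(multiplicity)
            skipped += count
        remaining -= Counter(letter)  # the current letter is now fixed in place
        yield word[:i + 1], skipped

for prefix, skipped in trace_position("BOOKKEEPER"):
    print(prefix, skipped)
```

For BOOKKEEPER the running totals pass through 8400, 10500, 10680, 10740 and 10742; adding 1 to the final total gives the 1-based position.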