How to detect similar unordered sequences in Python?

python regex string list

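I want to find similar intersections in a road network. My approach is to find the most similar sequence of street names. I have created several lists of names: one is the reference and the other two are the candidates. I want to find the candidate that has the same street names with the same number of occurrences.

Note that the order of the names does not matter; only the number of times each name occurs matters.

For example:

The reference sequence of names:

[u'Barytongatan', u'Tunnlandsgatan', u'Barytongatan']

The neighbors' corresponding name sequences: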

{91: [u'Tunnlandsgatan', u'Tunnlandsgatan', u'Barytongatan'], 142: [u'Tunnlandsgatan', u'Tunnlandsgatan', u' ']}

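First, I need to know whether a solution to this problem already exists. Second, is a list a good choice of container for the sequences? Finally, if so, how do I solve it?

I thought about regular expressions, but they do not seem useful because the order is not fixed.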

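If you build a map of how many times each name occurs in a candidate list, and then subtract one from the matching count for every name in the reference list, a candidate matches exactly when all counts end at zero. This gives the right answer even when the lists in the map are not in the same order as the reference.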

reference = [u'Barytongatan', u'Tunnlandsgatan', u'Barytongatan']
sequence = {91: [u'Barytongatan', u'Tunnlandsgatan', u'Barytongatan'], 142: [u'Tunnlandsgatan', u'Tunnlandsgatan', u' ']}

def getMatching(reference, sequence):
    for value in sequence.values():
        # Count how often each name occurs in this candidate list
        tempMap = {}
        for v in value:
            tempMap[v] = tempMap.get(v, 0) + 1

        # Subtract one for every name in the reference list. A name that the
        # candidate lacks ends up at a negative count instead of being ignored,
        # so a shorter or empty candidate cannot match by accident.
        for v in reference:
            tempMap[v] = tempMap.get(v, 0) - 1

        # The candidate matches only if every count is back at 0
        if all(count == 0 for count in tempMap.values()):
            return value
    return None

print(getMatching(reference, sequence))  # Prints the list stored under key 91

Now, with the following input it still works:

reference = [u'Barytongatan', u'Tunnlandsgatan', u'Barytongatan']
sequence = {142: [u'Tunnlandsgatan', u'Tunnlandsgatan', u' '], 91: [u'Barytongatan', u'Barytongatan', u'Tunnlandsgatan']}
print(getMatching(reference, sequence))  # Prints the list stored under key 91, even though its names are not in the same order as in reference
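
The counting approach described above can also be sketched with `collections.Counter` from the standard library, which compares two lists as multisets. This is only a minimal sketch and not part of the original answer: the names `find_matching` and `candidates` are made up for illustration, and unlike `getMatching` it returns every matching key rather than only the first matching list.

```python
from collections import Counter

def find_matching(reference, candidates):
    """Return {key: names} for every candidate with the same name counts as reference."""
    target = Counter(reference)
    # Two Counters are equal when they hold the same names with the same counts,
    # regardless of the order in which the names were listed.
    return {key: names for key, names in candidates.items()
            if Counter(names) == target}

reference = [u'Barytongatan', u'Tunnlandsgatan', u'Barytongatan']
candidates = {
    91: [u'Barytongatan', u'Barytongatan', u'Tunnlandsgatan'],
    142: [u'Tunnlandsgatan', u'Tunnlandsgatan', u' '],
}

matches = find_matching(reference, candidates)
print(matches)  # only key 91 is expected to remain
```

Returning a dictionary keeps the intersection id (here 91) together with its names, which may be more useful than the bare list when the goal is to identify the matching intersection.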

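One point here is that the order within the lists may differ, so simply checking whether a value in the dictionary is equal to the reference list is not the answer. Also, I wonder what this quick downvote was for?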
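
To illustrate the point about ordering, here is a small sketch (the variable names are illustrative only) showing that plain list equality fails for a reordered candidate, while comparing sorted copies succeeds. Sorting is an alternative to counting, assuming the names are mutually comparable, as strings are.

```python
reference = [u'Barytongatan', u'Tunnlandsgatan', u'Barytongatan']
candidate = [u'Barytongatan', u'Barytongatan', u'Tunnlandsgatan']

# List equality is position-sensitive, so a reordered candidate does not match.
same_order = candidate == reference

# Sorting both lists first removes the dependence on order while still
# keeping track of how many times each name occurs.
same_multiset = sorted(candidate) == sorted(reference)

print(same_order, same_multiset)
```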