Python: comparing two strings


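I would like to know whether there is a library that can tell me how similar two strings are.

I am not looking for anything specific, but in a case like this:

a = 'alex is a buff dude'
b = 'a;exx is a buff dud'

we could say that b and a are about 90% similar.

Is there a library that can do this?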

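Look up an algorithm for comparing strings, such as Levenshtein edit distance. A quick Google search turns up many implementations.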

There are some libraries for this as well, but note that the computation is expensive, especially for longer strings.
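
As one concrete instance of such a string-comparison algorithm, here is a minimal sketch of Levenshtein edit distance, which a comment on this answer names. The function names `levenshtein` and `similarity` are illustrative and do not come from any library, and normalising by the longer string's length is just one possible choice. The nested loops show where the cost comes from: the work grows with len(s) * len(t).

```python
def levenshtein(s, t):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn s into t."""
    if len(s) < len(t):
        s, t = t, s  # keep the DP row as short as possible
    # previous[j] = distance between the prefix of s seen so far and t[:j]
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(s, t):
    """Edit distance normalised to a 0..1 score (assumed normalisation)."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(s, t) / longest


a = 'alex is a buff dude'
b = 'a;exx is a buff dud'
print(levenshtein(a, b), similarity(a, b))
```

Only two rows of the table are kept in memory at a time, so the memory use stays proportional to the shorter string even though the running time does not.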


You may also want to look at Python's difflib module.
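
For the strings from the question, a short sketch with the standard library's `difflib.SequenceMatcher` could look like the following. Passing `None` as the first argument means no characters are treated as junk; printing the matching blocks is only there to show what the ratio is based on.

```python
import difflib

a = 'alex is a buff dude'
b = 'a;exx is a buff dud'

# None = no "junk" heuristic; compare the strings character by character
matcher = difflib.SequenceMatcher(None, a, b)

# ratio() returns 2 * matches / (len(a) + len(b)), a float between 0 and 1
ratio = matcher.ratio()
print('similarity: %.1f%%' % (ratio * 100))

# The matching blocks show which pieces of a were found in b
for block in matcher.get_matching_blocks():
    if block.size:
        print(repr(a[block.a:block.a + block.size]))
```

The ratio uses the same 2 * matches / total length formula as the LCS-based functions in this thread, so the results are directly comparable.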

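Another approach is to use the longest common subsequence (LCS). There is an implementation on Daniweb, and below is my own LCS implementation (difflib defines this as well).

Here is a simple length-only version that uses a list as its data structure: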

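def longest_common_sequence(a, b):
    """Similarity of a and b in percent, based on the length of their
    longest common subsequence (LCS)."""
    n1 = len(a)
    n2 = len(b)
    if n1 == 0 or n2 == 0:
        # Guard added here: with an empty b the final pop() would fail
        return 0.0

    # Rolling row of the DP table: LCS lengths for the previous character of a
    previous = [0] * n2

    for ch1 in a:
        left = corner = 0
        for ch2 in b:
            # Cell directly above the current one
            # (list.pop(0) is O(n); acceptable for short strings)
            over = previous.pop(0)
            if ch1 == ch2:
                this = corner + 1
            else:
                this = max(over, left)
            previous.append(this)
            left, corner = this, over
    # 2 * LCS length / (len(a) + len(b)), scaled to 0..100
    return 200.0 * previous.pop() / (n1 + n2)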

Expensive? Compared with a half-decent Levenshtein implementation, difflib is a monster.

I did not mean to say that difflib is cheaper; it just does a similar thing, though a bit differently.

Here is my second version, which uses a deque as its data structure (sample data and a usage example included):

from collections import deque

a = 'alex is a buff dude'
b = 'a;exx is a buff dud'

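def lcs_tuple(a, b):
    """Return (similarity in percent, one longest common subsequence)."""
    n1 = len(a)
    n2 = len(b)
    if n1 == 0 or n2 == 0:
        # Guard added here: with an empty input the loops never assign `this`
        return 0.0, ''

    # Each cell is (LCS length, LCS string); a deque gives O(1) popleft
    previous = deque((0, '') for _ in range(n2))

    for i in range(n1):
        left = corner = (0, '')
        for j in range(n2):
            over = previous.popleft()
            if a[i] == b[j]:
                this = corner[0] + 1, corner[1] + a[i]
            else:
                # Tuples compare by length first, then by string
                this = max(over, left)
            previous.append(this)
            left, corner = this, over
    return 200.0 * this[0] / (n1 + n2), this[1]

print(lcs_tuple(a, b))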

""" Output:
(89.47368421052632, 'aex is a buff dud')
"""