
Python: How to fit a string with spaces so as to minimize the edit distance?


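I am looking for an algorithm that fits two strings to each other, padding them with spaces where necessary, so as to minimize the edit distance between them: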

fit('algorithm', 'lgrthm') == ' lg r thm'

There must be a ready-made algorithm for this. Any ideas?

Here is a naive, simple approach:

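def fit(word1, word2):
    A, B = list(word1), list(word2)

    if len(B) < len(A):
        # Pad B with '1' placeholders so that it is as long as A.
        B += (len(A) - len(B)) * ['1']
    else:
        # word2 is at least as long as word1: keep each letter of word1
        # that occurs anywhere in word2, and blank out the rest.
        return ''.join(x if x in B else ' ' for x in A)

    # Insert a space into B wherever the two words disagree.
    for i in range(len(B)):
        if A[i] != B[i]:
            B.insert(i, ' ')
    # Drop the '1' placeholders before joining.
    return ''.join(x for x in B if x != '1')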

You could do something like this:

def fit(target, source):
    i, j = 0, 0
    result = []
    while i < len(source) and j < len(target):
        if source[i] == target[j]:
            result.append(source[i])
            i += 1
        else:
            result.append(' ')
        j += 1

    return ''.join(result)


test = [('algorithm', 'lgrthm'), ('pineapple', 'pine'), ('pineapple', 'apple'), ('pineapple', 'eale'),
        ('foo', 'fo'), ('stack', 'sak'), ('over', 'or'), ('flow', 'lw')]

for t, s in test:
    print(t)
    print(fit(t, s))
    print('---')
Perhaps a better version is the following:

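from collections import deque


def peek(q, default=None):
    """Safely peek at the front of the queue; return default if it is empty."""
    # The default is None rather than ' ' so that an empty queue never
    # "matches" a space in target (which would make popleft() fail).
    return q[0] if q else default


def fit(target, source):
    ds = deque(source)
    # Consume the next source character when it matches, otherwise emit a space.
    return ''.join(ds.popleft() if peek(ds) == e else ' ' for e in target)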

The deque version is better because it does not need to keep track of the state variables i, j as the previous approach does.
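Both versions above match greedily from left to right, so they stall on the first character of source that has no match in the rest of target (for example, fitting 'xabc' to 'abc' produces only spaces). A minimal sketch of one way around that, assuming it is acceptable to drop unmatched source characters: align the two strings along a longest common subsequence. The name fit_lcs is hypothetical and not part of the original answer.

```python
def fit_lcs(target, source):
    """Lay out source under target along a longest common subsequence."""
    n, m = len(target), len(source)
    # lcs[i][j] = length of the longest common subsequence
    # of target[i:] and source[j:].
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if target[i] == source[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the start, keeping matched characters.
    result = [' '] * n
    i = j = 0
    while i < n and j < m:
        if target[i] == source[j]:
            result[i] = source[j]
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            i += 1  # leave a space under target[i]
        else:
            j += 1  # drop source[j]; it cannot be matched profitably
    return ''.join(result)


print(fit_lcs('algorithm', 'lgrthm'))  # ' lg r thm'
print(fit_lcs('abc', 'xabc'))          # 'abc'
```

The result always has the length of target, and the table costs O(len(target) * len(source)) time and memory, which is fine for word-sized inputs.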

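Have you tried other inputs? What happens when you call fit('pine', 'pineapple')?
You can easily remove the length case and get what you want (e.g. fit('pine', 'pineapple') == 'pine').
Added the second scenario; I removed the word length.
What about when you do something like 'iapple' and 'pineapple'?
I would also like to point out the potential use of difflib SequenceMatcher.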
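As a rough illustration of the difflib suggestion, here is a minimal sketch that copies the matching blocks reported by SequenceMatcher into a space-filled result. The function name fit_with_difflib is hypothetical. Note that SequenceMatcher is documented as not producing minimal edit sequences, so this gives a plausible alignment and not a guaranteed optimum.

```python
from difflib import SequenceMatcher


def fit_with_difflib(target, source):
    """Place the characters of source under their matches in target."""
    # autojunk=False disables the heuristic that ignores very frequent
    # elements in long sequences.
    matcher = SequenceMatcher(None, target, source, autojunk=False)
    result = [' '] * len(target)
    for block in matcher.get_matching_blocks():
        # block.a indexes target, block.b indexes source,
        # block.size is the length of the matching run.
        for k in range(block.size):
            result[block.a + k] = source[block.b + k]
    return ''.join(result)


print(fit_with_difflib('algorithm', 'lgrthm'))  # ' lg r thm'
```

Characters of source that fall outside every matching block are simply dropped, so the result always has the same length as target.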