Python: difference between two strings (sentences)

python string

I am trying to compute the difference between two sentences, as follows:

import difflib

text1_lines = "I understand how customers do their choice. Difference"
text2_lines = "I understand how customers do their choice."
diff = difflib.ndiff(text1_lines, text2_lines)

I would like to get "Difference", but that is not what I get. What am I doing wrong? Thanks for letting me know.

Split the longer string on the shorter one, and what remains is the difference:

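def print_diff(a, b):
    # If one string is empty, the other one is the whole difference
    if len(a) == 0:
        print(b)
        return
    if len(b) == 0:
        print(a)
        return
    if len(a) > len(b):
        res = ''.join(a.split(b))  # cut the shorter string out of the longer one
    else:
        res = ''.join(b.split(a))  # cut the shorter string out of the longer one
    print(res.strip())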
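A caveat worth illustrating: this approach only removes the shorter string when it occurs verbatim inside the longer one. The sketch below is a returning variant of the same idea. The name `split_diff` and the second pair of sample sentences are made up for illustration and are not part of the answer.

```python
def split_diff(a, b):
    """Hypothetical helper: return what is left of the longer string
    after cutting out every occurrence of the shorter one."""
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    if not shorter:
        # str.split('') raises ValueError, so handle the empty case first
        return longer
    return ''.join(longer.split(shorter)).strip()

text1 = "I understand how customers do their choice. Difference"
text2 = "I understand how customers do their choice."
contained = split_diff(text1, text2)

# When the shorter string is not a contiguous part of the longer one,
# split() finds nothing to cut and the longer string comes back whole.
not_contained = split_diff("I understand how customers choose.", "I see how")

print(repr(contained))
print(repr(not_contained))
```

So for sentences that differ in the middle rather than at one end, a difflib-based comparison is probably the better fit.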

Use a simple list comprehension:

diff = [x for x in difflib.ndiff(text1_lines, text2_lines) if x[0] != ' ']

It will show you the deletions and the additions.

Output:

*** 
--- 
***************
*** 41,54 ****
c  e  .-  - D- i- f- f- e- r- e- n- c- e--- 41,43 ----

['-  ', '- D', '- i', '- f', '- f', '- e', '- r', '- e', '- n', '- c', '- e']

(Every character that follows a minus sign was deleted.)

Conversely, swapping text1_lines and text2_lines produces the following result:

['+  ', '+ D', '+ i', '+ f', '+ f', '+ e', '+ r', '+ e', '+ n', '+ c', '+ e']

To get rid of the signs, you can transform the list above:

diff_nl = [x[2] for x in diff]

To turn it into a plain string, just use .join():

diff_nl = ''.join([x[2] for x in diff])
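Putting the steps above together, here is a minimal end-to-end sketch. Two things in it are my own assumptions rather than part of the answer: the filter keeps only items starting with '+' or '-' (so any '?' hint lines that ndiff may emit are skipped too), and the slice x[2:] is used instead of x[2] so that the same code also works when whole words are compared.

```python
import difflib

text1 = "I understand how customers do their choice. Difference"
text2 = "I understand how customers do their choice."

# Character level: every ndiff item looks like "- D", the payload starts at index 2.
char_changes = [x for x in difflib.ndiff(text1, text2) if x[0] in '+-']
char_result = ''.join(x[2:] for x in char_changes)

# Word level: pass lists of words instead of the raw strings.
word_changes = [x for x in difflib.ndiff(text1.split(), text2.split()) if x[0] in '+-']
word_result = ' '.join(x[2:] for x in word_changes)

print(repr(char_result))
print(repr(word_result))
```

The character-level result keeps the leading space that was removed along with the word; the word-level one does not, because split() already discarded the whitespace.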

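Using difflib itself, this is how you can do it. The catch is that you get back a generator, which is a bit like a packed-up for loop, and the only way to unpack it is to iterate over it.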

import difflib
text1_lines = "I understand how customers do their choice. Difference"
text2_lines = "I understand how customers do their choice."
diff = difflib.unified_diff(text1_lines, text2_lines)

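unified_diff differs from ndiff in that it shows only the differences plus a little surrounding context (three items by default), whereas ndiff shows everything, both what is the same and what is different.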
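To make that contrast concrete, here is a small sketch that materialises both generators as lists. The n=0 call is my addition: n is the documented context-size parameter of unified_diff, and setting it to zero drops the context items altogether.

```python
import difflib

text1 = "I understand how customers do their choice. Difference"
text2 = "I understand how customers do their choice."

ndiff_items = list(difflib.ndiff(text1, text2))
unified_items = list(difflib.unified_diff(text1, text2))            # default: n=3 context items
unified_no_context = list(difflib.unified_diff(text1, text2, n=0))  # no context at all

print(len(ndiff_items))        # one item per character of the longer string
print(unified_items[:3])       # the three header lines
print(unified_items[3:6])      # the context items kept by default
print(unified_no_context[3:])  # only the removed characters
```

In both unified variants the first three items are header lines, so they have to be skipped before the characters are collected.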

diff is now a generator object, and all that is left to do is unpack it:

n = 0
result = ''
for difference in diff:
    n += 1
    if n < 7:  # the first 6 items (3 header lines, 3 context characters) are not needed here
        continue
    result += difference[1]  # each item is " x", "-x" or "+x"; index 1 is the character itself

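>>> result
' Difference'

Do you get an error? What is your current output?

Why not use a set difference between splitA = set(a.split(" ")) and splitB = set(b.split(" ")) to compare the strings, @henry? This may be a duplicate of a question about uncommon words.

Is that what you meant? Edit: never mind, I changed the answer.

Very nice!! Thank you very much. You could add that as a base condition and update the answer; as it stands, it prints nothing if both strings are empty.

Thanks for the answer. How do I get simply the difference, without all the "+" signs and so on?

diff_nl = [x[2] for x in diff]

If you want to ignore the signs, maybe you can use a set difference, @henry?

diff_nl = ''.join([x[2] for x in diff]) gives a plain string without the list.

Thanks! Thank you for your answer. How can I get just "Difference", without all the extra stuff, signs and so on?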
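The set-difference idea raised in the comments can be sketched roughly as follows. The variable names are mine, and note the trade-off: sets throw away word order and repeated words, so this answers "which words appear in only one sentence", not "what changed where".

```python
a = "I understand how customers do their choice. Difference"
b = "I understand how customers do their choice."

words_a = set(a.split(" "))
words_b = set(b.split(" "))

only_in_a = words_a - words_b   # words that appear in a but not in b
only_in_b = words_b - words_a   # words that appear in b but not in a
in_either = words_a ^ words_b   # symmetric difference: words unique to one side

print(only_in_a)
print(only_in_b)
print(in_either)
```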