Python: joining the two parts of a split word in a list using an iterator

Tags: python, string, list, iterator, concatenation

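I need to join certain words that appear split in a list of words, such as "computer" (below). The words are split in the list because of line breaks, and I want to fix this.

lst = ['love', 'friend', 'apple', 'com', 'puter']

The expected result is:

lst = ['love', 'friend', 'apple', 'computer']

My code does not work. Can anyone help me?

The code I am trying is: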

from collections import defaultdict
import enchant
import string

words = ['love', 'friend', 'car', 'apple', 'com', 'puter', 'vi']
myit = iter(words)
dic = enchant.Dict('en_UK')
lst = []
errors = []

for i in words:
    if dic.check(i) is True:
        lst.append(i)
    if dic.check(i) is False:
        a = i + next(myit)
    if dic.check(a) is True:
        lst.append(a)
    else:
        continue

print(lst)

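The main problem with your code is that you iterate over words in the for loop on the one hand and through the iterator myit on the other. These two iterations are independent, so you cannot use next(myit) inside the loop to get the word that follows the current one (also, if the current word is the last one, there is no next word). Your problem may be further complicated by the fact that the parts of a split word can themselves be in the dictionary (for example, printable is a word, but print and able are words too).

Assuming a simple scenario in which the parts of a split word are never in the dictionary, I think this algorithm would work better for you: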

import enchant

words = ['love', 'friend', 'car', 'apple', 'com', 'puter', 'vi']
dic = enchant.Dict('en_UK')
lst = []
# The word that you are currently considering
current = ''
for i in words:
    # Add the next word
    current += i
    # If the current word is in the dictionary
    if dic.check(current):
        # Add it to the list
        lst.append(current)
        # Clear the current word
        current = ''
    # If the word is not in the dictionary we keep adding words to current

print(lst)
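
To see in isolation why the loop in the question could not pick up the following word, here is a minimal sketch (the short word list is made up for illustration). A for loop over a list uses its own internal iterator, so a separate iterator created with iter() advances independently of it:

```python
words = ['love', 'com', 'puter']

# Independent iterator, as in the question: the comprehension's loop does not use it.
myit = iter(words)
pairs = [(i, next(myit)) for i in words]
# Both iterations start at the first element and advance in step,
# so each word is paired with itself rather than with its successor.
print(pairs)

# Looping over the iterator itself makes next() return the word that follows.
shared = iter(words)
followers = []
for i in shared:
    followers.append((i, next(shared, None)))  # None when nothing is left
print(followers)
```

In the second loop, the word fetched by next() is consumed, so the for loop never sees it again. That is the behaviour needed to glue two fragments together.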

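Although this approach is not very robust (for example, you would miss "ham burger"), the main error is that you did not loop over the iterator, but over the list itself. Here is a corrected version.

Note that I renamed the variables to give them more expressive names, and replaced the dictionary check with a simple membership test, word in dic, against a sample vocabulary: the module you import is not part of the standard library, which makes your code hard to run for those of us who do not have it.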

dic = {'love', 'friend', 'car', 'apple',
       'computer', 'banana'}

words = ['love', 'friend', 'car', 'apple', 'com', 'puter', 'vi']
words_it = iter(words)

valid_words = []

for word in words_it:
    if word in dic:
        valid_words.append(word)
    else:
        try:
            # Consume the following word and try the joined form
            concatenated = word + next(words_it)
            if concatenated in dic:
                valid_words.append(concatenated)
        except StopIteration:
            # The last word was not in the vocabulary and nothing follows it
            pass

print(valid_words)
# ['love', 'friend', 'car', 'apple', 'computer']

You need the try ... except part in case the last word of the list is not in the dictionary, because in that case next() will raise a StopIteration.
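
As an alternative sketch to try ... except, the built-in next() accepts a default value as its second argument and returns it instead of raising StopIteration. The helper name join_split_words and the sample vocabulary below are assumptions made for illustration, not part of the original answer:

```python
def join_split_words(words, vocabulary):
    """Join an unknown word with the word after it when the result is in vocabulary."""
    words_it = iter(words)
    valid_words = []
    for word in words_it:
        if word in vocabulary:
            valid_words.append(word)
            continue
        # With a default, next() returns None at the end instead of raising StopIteration
        following = next(words_it, None)
        if following is not None and word + following in vocabulary:
            valid_words.append(word + following)
    return valid_words


vocabulary = {'love', 'friend', 'car', 'apple', 'computer', 'banana'}
result = join_split_words(
    ['love', 'friend', 'car', 'apple', 'com', 'puter', 'vi'], vocabulary)
print(result)
```

Like the version above, this sketch consumes the following word even when the join fails, so a valid word that comes right after an unknown fragment is dropped: join_split_words(['vi', 'love'], vocabulary) returns an empty list.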

How do you know which words to join?

"The words are split in the list because of line breaks, and I want to fix this": it would probably be easier, and certainly cleaner, to go back a step and fix the code that produces the wrong line breaks.

"My code does not work": can you be more specific? If you do not get the expected output, what do you get? If you get an error, please include the full error traceback in the question.

@Chris_Rands, I cannot fix that, because the words get split when I export from PDF to a .txt file. Words that were split with a hyphen in the PDF are now split without a hyphen in the .txt file.

@ThierryLathuille. I think something goes wrong when I try to join the two parts of the word with 'i + next(myit)'.

OK. But the code does not join the split words in the list, which is my objective.

@NadiaSantos This code should put the joined words in lst (you keep adding parts of words to current until you get a complete word that is in the dictionary). But if it does not work for you, could you show the result you get? (Unfortunately, I cannot install pyenchant on my machine.) Is it possible that enchant considers com a valid word? (see)
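
One way to explore the hypothesis in the last comment without installing pyenchant is to run the accumulate-until-valid loop against a plain set and see what happens when a fragment such as com is itself accepted as a word. This is only a sketch: accumulate and the two vocabularies are hypothetical names and data chosen for illustration, and a real spell-checker dictionary may behave differently.

```python
def accumulate(words, is_word):
    """Glue consecutive items together until is_word accepts the result."""
    result, current = [], ''
    for part in words:
        current += part
        if is_word(current):
            result.append(current)
            current = ''
    # current holds any trailing text that never formed an accepted word
    return result, current


words = ['apple', 'com', 'puter']

# Vocabulary in which the fragments are not words on their own
strict = {'apple', 'computer'}
joined, leftover = accumulate(words, strict.__contains__)
print(joined, repr(leftover))

# Vocabulary that also accepts 'com' by itself
lenient = {'apple', 'computer', 'com'}
joined_lenient, leftover_lenient = accumulate(words, lenient.__contains__)
print(joined_lenient, repr(leftover_lenient))
```

With the strict vocabulary the fragments are joined into computer. With the lenient one, com is emitted as a word and puter is left over, which matches the symptom of the split words not being joined.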