Get a sublist containing list[i+1] where list[i] is a specific value

parsing python-3.x

This is a bit hard to phrase properly.

Suppose I have the list

['There,', 'calls', 'the', 'mariner', 'there', 'comes', 'a', 'ship', 'over',
'the', 'line', 'But', 'how', 'can', 'she', 'sail', 'with', 'no', 'wind', 'in',
'her', 'sails', 'and', 'no', 'tide.', 'See...', 'onward', 'she', 'comes', 'Onwards',
'she', 'nears,', 'out', 'of', 'the', 'sun', 'See...', 'she', 'has', 'no', 'crew',]

How can I extract from it the list

['sail', 'comes', 'nears', 'has']

That is, every element that comes immediately after 'she'? Can this be done with a list comprehension?

Because a list comprehension has a few edge cases here, such as:

[word for i, word in enumerate(lst[1:], 1) if lst[i-1]=="she"]
# misses the first match if lst[0] == 'she'

[lst[i+1] for i,word in enumerate(lst) if word=='she']
# IndexError if lst[-1] == 'she'
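
The trailing-'she' case can be reproduced on a tiny input. The sketch below uses a made-up three-word list (not the list from the question) and hypothetical helper names. It also shows one possible workaround: pairing each word with its successor via zip, which stops at the shorter input and so never indexes past the end.

```python
# Hypothetical short list whose last element is 'she' (not from the question).
lst = ['she', 'sails', 'she']

def naive_following(words, target='she'):
    # Same shape as the second comprehension above: indexes words[i + 1] directly.
    return [words[i + 1] for i, word in enumerate(words) if word == target]

def zipped_following(words, target='she'):
    # zip stops at the shorter input, so a trailing target has no partner
    # and is skipped instead of raising.
    return [nxt for cur, nxt in zip(words, words[1:]) if cur == target]

try:
    naive_result = naive_following(lst)
except IndexError:
    naive_result = None  # the trailing 'she' has no successor

safe_result = zipped_following(lst)
print(naive_result, safe_result)
```

Note that words[1:] builds a copy of the list, which is fine for short inputs.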

I'd suggest using a regular expression instead:

import re
words_string = ' '.join(lst)
pat = re.compile(r"""
         \bshe\s      # literal 'she '
         (\w+)\b      # match next word up to the word break""",
                 flags=re.X)
target = pat.findall(words_string)
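
For reference, here is a self-contained sketch of the same idea run on the list from the question. The helper name following_words is illustrative. One behavioral difference is worth noting: because \w+ stops at punctuation, the regex yields 'nears', whereas index-based approaches return the list element 'nears,' unchanged.

```python
import re

lst = ['There,', 'calls', 'the', 'mariner', 'there', 'comes', 'a', 'ship', 'over',
       'the', 'line', 'But', 'how', 'can', 'she', 'sail', 'with', 'no', 'wind', 'in',
       'her', 'sails', 'and', 'no', 'tide.', 'See...', 'onward', 'she', 'comes', 'Onwards',
       'she', 'nears,', 'out', 'of', 'the', 'sun', 'See...', 'she', 'has', 'no', 'crew']

# \b anchors 'she' at a word boundary, so it also matches at the start of the string.
pat = re.compile(r'\bshe\s(\w+)\b')

def following_words(words):
    # Join with single spaces, then capture the word after each standalone 'she'.
    return pat.findall(' '.join(words))

print(following_words(lst))  # ['sail', 'comes', 'nears', 'has']
```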

This works in all cases:

[li[i+1] for i in range(len(li)-1) if li[i]=='she']

where li is your list.
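
A quick way to check the boundary behavior is to wrap the comprehension in a function and try the awkward inputs. This is only a sketch; the function name after_she and the sample lists are made up for illustration.

```python
def after_she(li, target='she'):
    # range(len(li) - 1) never reaches the last index, so li[i + 1] is always valid.
    return [li[i + 1] for i in range(len(li) - 1) if li[i] == target]

print(after_she(['she', 'sails']))       # target at the start -> ['sails']
print(after_she(['onward', 'she']))      # target at the end, no successor -> []
print(after_she([]))                     # empty list -> []
print(after_she(['she', 'she', 'has']))  # consecutive targets -> ['she', 'has']
```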

For larger lists, you can use itertools, or the following option:

def pairs(li):
    # Python 2 -- use izip instead of zip
    from itertools import islice
    for this_item, next_item in zip(li, islice(li, 1, None)):
        yield this_item, next_item

Then your result is:

list(that for this, that in pairs(li) if this=='she')

This has the advantage of not building an intermediate list.
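
On Python 3.10 and later, the standard library ships itertools.pairwise, which yields the same consecutive pairs lazily. A minimal sketch, assuming that version is available (with a fallback otherwise); the function name following and the sample list are illustrative.

```python
try:
    from itertools import pairwise  # Python 3.10+
except ImportError:
    from itertools import tee

    def pairwise(iterable):
        # Fallback for older Pythons, based on the classic itertools recipe.
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

def following(li, target='she'):
    # pairwise yields (li[0], li[1]), (li[1], li[2]), ... without copying the input.
    return [nxt for cur, nxt in pairwise(li) if cur == target]

li = ['how', 'can', 'she', 'sail', 'onward', 'she', 'comes']
print(following(li))  # ['sail', 'comes']
```

Because pairwise accepts any iterable, this form also works on generators, not just lists.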

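The regex has the same kind of problem: the string produced by ' '.join(list) does not start with a space, so the pattern does not match a first 'she' at the start of the string. I suggest a pattern like r'\bshe\s(\w+)\b' with re.findall.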

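@dawg Thanks! My head is still spinning, because \b and \s look like they should be the same! And every time I use \s or \w I have to check my quickref, because I forget whether they stand for "space" and "word" or "string" and "whitespace".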

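Well, \s is a literal whitespace character, and every character has width. The string 'she word' does not start with a space, so it will not match the regex \sshe. The regex metacharacter \b is shorthand for a zero-width assertion: at that position, the character on one side must be \w and the character on the other side must be \W (or the start or end of the string).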
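
The difference between \s and \b can be checked directly. The sketch below reuses the 'she word' string from the comment; the variable names and the extra 'ashes' input are illustrative.

```python
import re

text = 'she word'

# \s must consume an actual whitespace character, and there is none before 'she'.
with_space = re.findall(r'\sshe', text)

# \b is zero-width: it matches the position between the start of the string and 's'.
with_boundary = re.findall(r'\bshe', text)

# \b does not match inside a word, so 'ashes' contains no standalone 'she'.
inside_word = re.findall(r'\bshe\b', 'ashes')

print(with_space, with_boundary, inside_word)  # [] ['she'] []
```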

Any explanation for the downvote/close vote?