Find all occurrences of a word in a string in Python 3


python regex python-3.x


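I am trying to find all words in a sentence that contain "hell". The string below has 3 such occurrences, but re.search returns only the first two. I tried both findall and search. Can someone tell me what is wrong here?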

>>> s = 'heller pond hell hellyi'
>>> m = re.findall('(hell)\S*', s)
>>> m.group(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'group'
>>> m = re.search('(hell)\S*', s)
>>> m.group(0)
'heller'
>>> m.group(1)
'hell'
>>> m.group(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: no such group
>>>


You can use str.split and check whether the substring is in each word:

s = 'heller pond hell hellyi'

print([w for w in s.split() if "hell" in w])

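Your regex does find hell: \S* matches zero or more non-space characters, so (hell)\S* matches all three words. The catch is that, when the pattern contains a capture group, re.findall returns only the text captured by the group, and re.search stops at the first match. If all you need is the literal hell, search for it directly, nothing fancy: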

In [3]: re.findall('hell', 'heller pond hell hellyi')
Out[3]: ['hell', 'hell', 'hell']
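
One detail worth seeing in isolation is how a capture group changes what re.findall returns. The following is a minimal sketch that reuses the string from the question; the variable names are only illustrative.

```python
import re

s = 'heller pond hell hellyi'

# With one capture group, findall returns what the group captured,
# not the whole match.
with_group = re.findall(r'(hell)\S*', s)

# With no group, findall returns the whole match.
without_group = re.findall(r'hell\S*', s)

# A non-capturing group (?:...) keeps the grouping without changing
# what findall returns.
non_capturing = re.findall(r'(?:hell)\S*', s)

print(with_group)     # ['hell', 'hell', 'hell']
print(without_group)  # ['heller', 'hell', 'hellyi']
print(non_capturing)  # ['heller', 'hell', 'hellyi']
```

Note that hell\S* only extends to the right of hell, so a word such as shell would come back as hell; matching the whole word needs something on the left side as well.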

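Edit

Based on your comment, you want the whole word returned whenever hell appears anywhere inside it. In that case, use the zero-or-more quantifier (*) on both sides: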

In [4]: re.findall(r"\S*hell\S*", 'heller pond hell hellyi')
Out[4]: ['heller', 'hell', 'hellyi']

In other words:

import re

pattern = re.compile(r"""
    \S*          # zero or more non-space characters
    hell         # followed by a literal hell
    \S*          # followed by zero or more non-space characters
    """, re.X)

Note that Padraic's answer is definitely the best way to solve this:

[word for word in "heller pond hell hellyi".split() if 'hell' in word]

You can use re.findall and search for hell with zero or more word characters on either side:

>>> import re
>>> s = 'heller pond hell hellyi'
>>> re.findall(r'\w*hell\w*', s)
['heller', 'hell', 'hellyi']
>>>
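
The \w and \S versions give the same result on the question's string, but they can differ once punctuation is involved. A small sketch, assuming a made-up test string with punctuation (not from the question):

```python
import re

# Hypothetical input: the question's words with punctuation attached.
text = 'heller, pond (hell) hellyi!'

# \S matches any non-whitespace character, so punctuation is kept.
non_space = re.findall(r'\S*hell\S*', text)

# \w matches only letters, digits and underscore, so punctuation is dropped.
word_chars = re.findall(r'\w*hell\w*', text)

print(non_space)   # ['heller,', '(hell)', 'hellyi!']
print(word_chars)  # ['heller', 'hell', 'hellyi']
```

Which one is appropriate depends on whether attached punctuation should count as part of the word.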

Maybe it's just me, but I rarely use regular expressions. Python 3 has plenty of text functions; what's wrong with using a built-in one?

'heller pond hell hellyi'.count('hell')

The only downside I see is that this way I never really learn to use regular expressions. :-)
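
One caveat with str.count is that it counts non-overlapping occurrences of the substring, not the words that contain it, and it does not return the words themselves. A short sketch, assuming a made-up string in which one word contains hell twice:

```python
# Hypothetical input: 'hellhell' contains the substring twice.
text = 'hellhell pond hell'

# Counts every non-overlapping occurrence of the substring.
substring_count = text.count('hell')

# Collects the whitespace-separated words that contain the substring.
matching_words = [w for w in text.split() if 'hell' in w]

print(substring_count)      # 3
print(matching_words)       # ['hellhell', 'hell']
print(len(matching_words))  # 2
```

For the question's string the two numbers happen to agree, since each word contains hell exactly once.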

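Thanks! Is there a way to do this with a regex? I am still learning and would really like to try it with re.

@Vin, yes, see Icodez's answer; simply using findall will do. Use a regex if you are not concerned about efficiency.

But I want it to return "heller", "hell", "hellyi", so I have to add \S or some other escape sequence.

The "in" operator will give you those.

This one works perfectly.

Do you know why this is wrong? >>> m = re.search('(hell)\S*', s). It returns only the first 2 occurrences; "hellyi" is not returned.

No, it is not returning two occurrences. re.search only gets the first match. You are getting hell because that is the value matched by the capture group, but it is still part of heller.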
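
To make the last point concrete: group(0) and group(1) are two views of the same single match, not two separate occurrences. A minimal sketch using re.finditer to walk over every match instead; the variable names are illustrative only.

```python
import re

s = 'heller pond hell hellyi'

# re.search stops at the first match. group(0) is the whole match and
# group(1) is the capture group inside that same match.
first = re.search(r'(hell)\S*', s)
print(first.group(0))  # heller
print(first.group(1))  # hell

# re.finditer yields one match object per occurrence.
all_matches = [(m.group(0), m.group(1), m.start())
               for m in re.finditer(r'(hell)\S*', s)]
print(all_matches)
# [('heller', 'hell', 0), ('hell', 'hell', 12), ('hellyi', 'hell', 17)]
```

The pattern has only one group, which is why m.group(2) raises IndexError in the question's transcript.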