Python regex: finding words that start or end with a specific letter

python regex

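Write a function named getWords that takes a sentence and a single letter and returns a list of the words that start or end with that letter, but not both, regardless of letter case.

For example: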

>>> s = "The TART program runs on Tuesdays and Thursdays, but it does not start until next week."
>>> getWords(s, "t")
['The', 'Tuesdays', 'Thursdays', 'but', 'it', 'not', 'start', 'next']

My attempt:

regex = (r'[\w]*'+letter+r'[\w]*')
return (re.findall(regex,sentence,re.I))

My output:

['The', 'TART', 'Tuesdays', 'Thursdays', 'but', 'it', 'not', 'start', 'until', 'next']

This is easy to do with the startswith and endswith methods.
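The answer's own code is not shown here. A minimal sketch of the startswith/endswith idea, assuming the sentence is split on whitespace (so punctuation stays attached to words) and that "but not both" is expressed by comparing the two checks with `!=`, could look like this:

```python
def getWords(sentence, letter):
    """Return words that start or end with `letter`, but not both (case-insensitive)."""
    letter = letter.lower()
    result = []
    for word in sentence.split():
        lowered = word.lower()
        # `!=` on two booleans behaves like exclusive or:
        # true only when exactly one of the two checks succeeds.
        if lowered.startswith(letter) != lowered.endswith(letter):
            result.append(word)
    return result


s = "The TART program runs on Tuesdays and Thursdays, but it does not start until next week."
print(getWords(s, "t"))
```

Because the split is on whitespace only, a word followed by a comma or a period keeps that punctuation in the result.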

Output:

['The', 'Tuesdays', 'Thursdays,', 'but', 'it', 'not', 'start', 'next']

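Update: using regular expressions

Explanation

The regexes \b[t]\w+ and \w+[t]\b find words that start with the letter t or end with it, respectively, while \b[t]\w+[t]\b finds words that both start and end with t.

After generating the two word lists, drop from the first list every word that also appears in the second.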
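The regex code for this update is not shown either. Here is a sketch of the two-list approach described in the explanation, assuming the letter is interpolated into the three patterns quoted there and that the final step removes the "starts and ends" words from the "starts or ends" list. The variable names are illustrative only:

```python
import re


def getWords(sentence, letter):
    ch = re.escape(letter)  # guard against regex metacharacters in `letter`
    # Words that start with the letter, or end with it (possibly both).
    starts_or_ends = re.findall(rf'\b[{ch}]\w+|\w+[{ch}]\b', sentence, re.I)
    # Words that both start and end with the letter.
    starts_and_ends = re.findall(rf'\b[{ch}]\w+[{ch}]\b', sentence, re.I)
    # Keep only the words of the first list that are absent from the second.
    return [w for w in starts_or_ends if w not in starts_and_ends]


s = "The TART program runs on Tuesdays and Thursdays, but it does not start until next week."
print(getWords(s, "t"))
```

Since every pattern uses \w+ rather than \w*, one-letter words are never matched, and a two-letter word such as "tt" is a corner case that these exact patterns do not filter out.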

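Why use a regex at all? Just check the first and last characters.

def getWords(s, letter):
    letter = letter.lower()
    # Keep a word when exactly one of its first/last characters is the letter.
    # Indexing word[0] and word[-1] also works for one-letter words.
    return [word for word in s.split()
            if (word[0].lower() == letter) != (word[-1].lower() == letter)]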

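You can try the built-in startswith and endswith methods. Note that with a plain `or`, words that both start and end with the letter (such as TART) are still included: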

>>> string = "The TART program runs on Tuesdays and Thursdays, but it does not start until next week."
>>> [i for i in string.split() if i.lower().startswith('t') or i.lower().endswith('t')]
['The', 'TART', 'Tuesdays', 'Thursdays,', 'but', 'it', 'not', 'start', 'next']

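\b detects word boundaries. Verbose mode allows multi-line regexes with comments. Note that [^\W] is the same as \w, but to match \w except for a certain letter, you need [^\W{letter}].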
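The code this answer refers to is missing here. A sketch of what a verbose-mode pattern built with .format might look like, assuming a `{letter}` placeholder and flags passed to `re.findall` rather than written inline:

```python
import re


def getWords(sentence, letter):
    pattern = r'''
        \b{letter}        # word starts with the letter
        \w*               # zero or more word characters
        [^{letter}\W]     # last character: a word character other than the letter
        \b                # end of word
        |                 # OR
        \b[^{letter}\W]   # first character: a word character other than the letter
        \w*               # zero or more word characters
        {letter}\b        # word ends with the letter
    '''.format(letter=re.escape(letter))
    # VERBOSE ignores the whitespace and comments inside the pattern.
    return re.findall(pattern, sentence, re.IGNORECASE | re.VERBOSE)


s = "The TART program runs on Tuesdays and Thursdays, but it does not start until next week."
print(getWords(s, "t"))
```

Both branches need at least two characters, so a lone "t" (which starts and ends with the letter) is never returned.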

Output:

['The', 'Tuesdays', 'Thursdays', 'but', 'it', 'not', 'start', 'next']

If you need a regular expression for this, use:

regex = r'\b(#\w*[^#\W]|[^#\W]\w*#)\b'.replace('#', letter)

The replace call is there to avoid repeating the verbose +letter+ concatenation.

So the code looks like this:

import re

def getWords(sentence, letter):
    regex = r'\b(#\w*[^#\W]|[^#\W]\w*#)\b'.replace('#', letter)
    return re.findall(regex, sentence, re.I)

s = "The TART program runs on Tuesdays and Thursdays, but it does not start until next week."
result = getWords(s, "t")
print(result)

Output:

['The', 'Tuesdays', 'Thursdays', 'but', 'it', 'not', 'start', 'next']

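Explanation: I used # as a placeholder for the actual letter; it is replaced in the regex before the regex is used.

\b: word break
\w*: zero or more word characters (letters, digits or underscores)
[^#\W]: a word character that is not the given letter
|: logical OR. The left side matches words that start with the letter but do not end with it; the right side matches the opposite case.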
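To see how the two sides of the alternation behave, the same pattern can be tried on single words. This is only an illustrative sketch: the helper name `matches` and the sample words are assumptions, not part of the answer.

```python
import re


def matches(word, letter):
    # Same pattern as in the answer, with '#' replaced by the letter.
    regex = r'\b(#\w*[^#\W]|[^#\W]\w*#)\b'.replace('#', letter)
    # fullmatch succeeds only if the whole word fits one of the two branches.
    return re.fullmatch(regex, word, re.I) is not None


for word in ["The", "TART", "start", "until", "t"]:
    print(word, matches(word, "t"))
```

"The" fits the left branch and "start" the right one, while "TART", "until" and the one-letter "t" fit neither.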

If you have already received answers, please do not modify your question in a way that invalidates them.

This changes the case of the results.

I have updated my answer. I ask that the downvote be reconsidered!

I was asked to use regex only.

Thank you, Mark, it did solve my problem. I learned something new with .format. But I got lost in the expression: what does the i stand for, and why do you use \W in the OR block?

@AsheemChhetri Added comments.

Thank you very much, I understand it now. One of the answers above uses .format and yours uses .replace. My question is about [^#\W]: does it apply to a single character of the string or to a whole word?

It applies to a single character, because it is not followed by + or *. It says: match a character that is neither my letter nor a non-word character, which boils down to matching a word character other than my letter. The OR operator | reaches further: it applies to the part between the parentheses (or to the whole pattern when there are none), so it says the match must be either #\w*[^#\W] or [^#\W]\w*#.