Python Regex: selecting an expression between single quotes

python regex

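My goal is to select strings like hello_kitty.dat from Lorem 'hello_kitty.dat' ipsum.

The code I wrote works to some extent for shorter strings: from teststring, it selects one or more (+) word characters (\w) before a dot (\.), then three word characters (\w{3}), and replaces the selection with x.

But how can I modify the code to select everything between the single quotes, even when what follows \w{3} does not fit my pattern exactly?

teststring could be "Lorem 'hello_kitty.cmd?command91' ipsum hello_kitty.cmd?command92", but in that case I do not want to select hello_kitty.cmd?command92, because it sits outside the single quotes.
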
You can use:
import re
teststring = "Lorem 'hello_kitty.cmd?command91' ipsum hello_kitty.cmd?command92"
print(re.sub(r"'\w+\.\w{3}[^']*'", "'x'", teststring))
# => Lorem 'x' ipsum hello_kitty.cmd?command92

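The pattern now matches:

' - a single quote
\w+ - 1 or more word characters
\. - a dot
\w{3} - 3 word characters
[^']* - a negated character class matching 0 or more characters other than a single quote
' - a single quote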
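As a quick check of the breakdown above, here is a minimal sketch that reuses the same pattern to extract the quoted text instead of replacing it. The capturing group and the names `pattern` and `matches` are additions for illustration and are not part of the original answer.

```python
import re

teststring = "Lorem 'hello_kitty.cmd?command91' ipsum hello_kitty.cmd?command92"

# Same pattern as in the answer, with parentheses (an illustrative addition)
# around the part between the quotes so findall returns it without the quotes.
pattern = re.compile(r"'(\w+\.\w{3}[^']*)'")

matches = pattern.findall(teststring)
print(matches)
# ['hello_kitty.cmd?command91']
```

The unquoted hello_kitty.cmd?command92 is skipped because the pattern requires an opening and a closing single quote.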
Just use a non-greedy regex:
import re
teststring = "Lorem 'hello_kitty.cmd?command91' ipsum hello_kitty.cmd?command92"
print(re.sub(r"'.*?'", "'x'", teststring))

This returns Lorem 'x' ipsum hello_kitty.cmd?command92

The regex '.*?' matches everything between single quotes, but takes the shortest possible string.
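To see why the non-greedy quantifier matters, the sketch below compares '.*' with '.*?' on a string that contains two quoted parts. The input string and the names `greedy` and `lazy` are hypothetical and chosen only for this illustration.

```python
import re

# Hypothetical test string with two quoted parts (not from the question).
text = "Lorem 'first.dat' ipsum 'second.cmd?x=1' dolor"

# Greedy: .* runs from the first quote to the last quote in the string.
greedy = re.sub(r"'.*'", "'x'", text)

# Non-greedy: .*? stops at the nearest closing quote.
lazy = re.sub(r"'.*?'", "'x'", text)

print(greedy)  # Lorem 'x' dolor
print(lazy)    # Lorem 'x' ipsum 'x' dolor
```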
To throw in my two cents, you could use:
'[^']+' # quotes with a negated character class in between


In Python:
import re

string = """
"Lorem 'hello_kitty.dat' ipsum."
"Lorem 'hello_kitty.cmd?command91' ipsum hello_kitty.cmd?command92"
"""

rx = re.compile(r"'[^']+'")
string = rx.sub("x", string)
print(string)

# "Lorem x ipsum."
# "Lorem x ipsum hello_kitty.cmd?command92"

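Thanks for your contribution! This is a really interesting regex snippet, and it does work for this example, but it selects everything between single quotes, so it would also pick up 'dontselect__me', which is not what I want in my project. I want to select only quoted text that fits the specific shape I described in my post: everything between single quotes that contains a dot, with some word characters (\w) before the dot and at least 3 word characters after it.

This code works well, but I don't understand how this part works: [^']*. Is it there to catch the question mark, or everything after \w{3} that is not a quote?

@Clone: I added an explanation of [^']* to the answer. [^'] is a negated character class that matches any character other than a single quote, and the * quantifier makes it match zero or more of them.

Thanks for your comment! So it basically matches anything after \w{3} that is not a single quote? Am I right?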
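The difference discussed in these comments can be sketched as follows. The input string is hypothetical (it reuses the 'dontselect__me' name from the comment), and the variable names are illustrative only.

```python
import re

# Hypothetical input: one quoted file-like name and one quoted string
# that does not follow the name.ext shape.
text = "Lorem 'hello_kitty.cmd?command91' ipsum 'dontselect__me' dolor"

# Matches any non-empty run of non-quote characters between single quotes.
anything_quoted = re.compile(r"'[^']+'")

# Requires word characters, a dot, and 3 word characters right after the
# opening quote; [^']* then consumes the rest up to the closing quote.
name_dot_ext = re.compile(r"'\w+\.\w{3}[^']*'")

print(anything_quoted.sub("'x'", text))
# Lorem 'x' ipsum 'x' dolor
print(name_dot_ext.sub("'x'", text))
# Lorem 'x' ipsum 'dontselect__me' dolor
```

So yes: after \w{3}, [^']* consumes whatever follows, including the question mark, up to but not including the next single quote.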