Python regex: match "_" only if it is not in a username

python python-3.x regex

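Context and explanation

I am making a Telegram bot, and I want to add the escape character "\" before every "_" character that is not inside a username (a word that starts with "@", such as "@username_"). This is to prevent some markup errors, since in Telegram the "_" character is used to make a string italic.

For example, with the following string:

"hello i like this char _ write me lol_ @myusername_"

I want to match only the first two "_" characters, and not the third.

Question

What is the correct way to do this with a regex pattern?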

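Expected conditions and matches

| Condition | Match |
|---|---|
| "_" alone: ("_") | yes |
| "_" in a word without "@": ("lol_") | yes |
| "_" in a word that starts with "@": ("@username_") | no |
| "_" in a word containing "@", after the "@": ("lol@username_") | no |
| "_" in a word containing "@", before the "@": ("lol_@username") | yes |
| "_" in a word like this: ("lol_@username_") | first: yes, second: no |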
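The expected conditions above can be written down as test data, which makes it easier to compare candidate patterns. The following is only a sketch: the names `EXPECTED`, `failing_cases` and `naive_escape` are hypothetical helpers and do not come from the question. The naive baseline escapes every underscore, so it fails exactly on the cases where an underscore follows "@".

```python
# Expected results from the conditions listed above: input -> escaped output.
EXPECTED = {
    "_": r"\_",
    "lol_": r"lol\_",
    "@username_": "@username_",
    "lol@username_": "lol@username_",
    "lol_@username": r"lol\_@username",
    "lol_@username_": r"lol\_@username_",
}


def failing_cases(escape):
    """Return the inputs for which escape() differs from the expected output."""
    return [text for text, want in EXPECTED.items() if escape(text) != want]


def naive_escape(text):
    # Hypothetical baseline: escapes every underscore, including those in usernames.
    return text.replace("_", r"\_")


print(failing_cases(naive_escape))
# ['@username_', 'lol@username_', 'lol_@username_']
```

Any escaping function that takes a string and returns a string can be passed to `failing_cases`; an empty list means all six conditions are met.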

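I think you only care about "@" at the start of a word. You can use re.sub with a lambda that calls replace, together with the pattern (?:\s|^)[^@]\S*\b, to match the words that meet your specification: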

import re

s = "hello i like this char _ write me lol_ @myusername_ asd@_a @_asdf"
s = re.sub(r"(?:\s|^)[^@]\S*\b", lambda x: x.group().replace("_", r"\_"), s)
print(s) # => hello i like this char \_ write me lol\_ @myusername_ asd@\_a @_asdf

If you care about "@" appearing anywhere in the word, try (?:\s|^)[^@\s]+\b:

s = "he_llo i like this char _ write me lol_ @myusername_ asd@_a @_asdf"
s = re.sub(r"(?:\s|^)[^@\s]+\b", lambda x: x.group().replace("_", r"\_"), s)
print(s) # => he\_llo i like this char \_ write me lol\_ @myusername_ asd@_a @_asdf

Based on the OP's comment, it sounds like the latest specification is to escape "_" anywhere except after "@":

>>> s = "he_llo i lol_@username_ _ write me lol_ @myusername_ asd@_a @_asdf"
>>> re.sub(r"(?:\s|^)[^@]+@", lambda x: x.group().replace("_", r"\_"), s)
'he\\_llo i lol\\_@username_ \\_ write me lol\\_ @myusername_ asd@_a @_asdf'

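Using the PyPi regex library to extract:

import regex
string = "hello i like this char _ write me lol_ @myusername_"
print(regex.findall(r'(?...

You can match all non-whitespace characters after matching @, and capture _ in a group using an alternation. In the re.sub callback, check whether group 1 exists. If it does, return an escaped underscore, or the escaped group 1 value (which is also an underscore); otherwise return the match to leave it unchanged.

@\S+|(_)

Output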
\_
lol\_
@username_
lol@username_
lol\_@username
lol\_@username_
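
The callback approach described in this answer can be sketched as follows. This is a minimal version that uses only the standard re module; the function name `escape_underscores` and the sample list are illustrative assumptions, not part of the original answer.

```python
import re

# Either a username ("@" followed by non-whitespace), or a lone underscore
# captured in group 1. Usernames are matched first and so consume their "_".
PATTERN = re.compile(r"@\S+|(_)")


def escape_underscores(text):
    """Escape every "_" that is not part of an "@..." run."""
    def repl(match):
        if match.group(1):
            # Group 1 matched: this is an underscore outside a username.
            return "\\" + match.group(1)
        # Otherwise the username alternative matched: leave it unchanged.
        return match.group(0)

    return PATTERN.sub(repl, text)


samples = ["_", "lol_", "@username_", "lol@username_", "lol_@username", "lol_@username_"]
for sample in samples:
    print(escape_underscores(sample))
```

If usernames are assumed to contain only word characters, as one of the comments suggests, `@\w+` could be used in place of `@\S+`.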

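Based on @OlvinRoght's comment, with a small edit, this should do it:
Regex
((?:^|\s)(?:[^@\s]*?))(_)((?:[^@\s]*?)(?=@|\s|$))

Code example
import re
text = '_hi hello i like this char _ write me lol_ _word something_ @myusername_ something_@username_'
regex = r"((?:^|\s)(?:[^@\s]*?))(_)((?:[^@\s]*?)(?=@|\s|$))"
# Keep the first and last capture groups unchanged and replace the underscore with "\_"
subst = "\\1\\\\_\\3"
print(re.sub(regex, subst, text))

Expected output:
\_hi hello i like this char \_ write me lol\_ \_word something\_ @myusername_ something\_@username_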
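Note:
Although this works, @TheFourthBird's answer is faster (and, I think, more elegant).

@OlvinRoght Thank you, but it does not work when the "@" is removed.

@OlvinRoght Just the last part would be enough: (?:^|\s)((?:[^\s]*?)_(?:[^\s]*))(?=\s|$)

Are you extracting or replacing? For "hello i like this char _ write me lol_ @myusername_", what is the expected output?

@LeonardoScotti If you change the replacement string from "-" to r"\_", then the third answer will do what the OP asks, assuming Telegram usernames contain only word characters (\w).

@LeonardoScotti Why are you testing a PyPi regex pattern on pythex? Did you import regex (after pip install regex)? You should test it with the PCRE option enabled.

Sorry, I noticed and deleted the comment. The third one works well, but in "lol_@username_" I would like the first "_" to be matched and the second not.

@LeonardoScotti Changing the question now is generally not a good idea, because it invalidates the current answers. You can remove the (?...) and use @\w+(*SKIP)(*F)|_ instead.

It works, but in "lol_@username_" I would like the first "_" to be matched and the second not.

Updated, although you have some other answers now. This does not quite seem to match your title, though: "(a word that starts with @)". It is best to add all requirements up front to avoid guessing.

Thank you very much, it works.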
\_hi hello i like this char \_ write me lol\_ \_word something\_ @myusername_ something\_@username_