Replacing Unicode with regular expressions and Python

python regex unicode

I have a string like this:

str1 = "heylisten\uff08there is something\uff09to say \uffa9"

I need to add a space on either side of each Unicode value that my regular expression detects.

Desired output string:

out = "heylisten \uff08 there is something \uff09 to say  \uffa9 "

I used re.findall to get all the matches and then replaced them one by one. It looks like this:

p1 = re.findall(r'\uff[0-9a-e][0-9]', str1, flags = re.U)  
out = str1
for item in p1:
    print item
    print out
    out= re.sub(item, r" " + item + r" ", out)

This outputs:

'heylisten\\ uff08 there is something\\ uff09 to say \\ uffa9 '

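What is wrong with the above that makes it print an extra "\" and separate it from uff? I even tried re.search, but it only seems to separate \uff08. Is there a better way?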
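One possible explanation for the stray backslash, sketched in Python 3. Two assumptions are made here and are not stated in the question: that str1 holds literal backslash-u text (six characters per sequence, as a Python 2 byte string would), and that Python 2's re reads the unknown escape \u in the pattern as a plain "u". Under those assumptions findall captures only "uff08" and friends, so the spaces land between the backslash and "uff", and repr shows the lone backslash doubled.

```python
import re

# Raw string: each "\uff08" below is six literal characters, mirroring a
# Python 2 byte string, where \u is not an escape sequence.
str1 = r"heylisten\uff08there is something\uff09to say \uffa9"

# Assumption: the question's pattern effectively matches without the
# backslash, so it behaves like this one.
matches = re.findall(r"uff[0-9a-e][0-9]", str1)

out = str1
for item in matches:
    # Each match is padded, but the backslash before it stays where it was.
    out = re.sub(item, " " + item + " ", out)

print(matches)
print(repr(out))
```

The repr of out printed here has the same shape as the output reported in the question.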

You can use re.sub directly:

print re.sub(r"(\\uff[0-9a-e][0-9])", r" \1 ", x)

Output:

heylisten \uff08 there is something \uff09 to say  \uffa9 
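For readers on Python 3, the same substitution might look like the sketch below. It assumes the text contains literal backslash-u sequences rather than real Unicode characters, which is why x is written as a raw string; the name spaced is only illustrative.

```python
import re

# Assumption: x contains the six literal characters "\uff08", not the
# single character U+FF08, hence the raw string.
x = r"heylisten\uff08there is something\uff09to say \uffa9"

# "\\" in the pattern matches one literal backslash. The whole sequence is
# captured in group 1 and written back with a space on each side.
spaced = re.sub(r"(\\uff[0-9a-e][0-9])", r" \1 ", x)

print(spaced)
```

A single re.sub call with a backreference avoids the findall loop, and keeping the backslash inside the group keeps the sequence in one piece.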

"I need to replace the unicode values"

You don't have any Unicode values. You have a bytestring. Use a unicode literal (note the u prefix):

str1 = u"heylisten\uff08there is something\uff09to say \uffa9"
 ...
p1 = re.sub(ur'([\uff00-\uffe9])', r' \1 ', str1)
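The snippet above uses Python 2 syntax (the ur prefix does not exist in Python 3). A rough Python 3 equivalent, assuming the string really contains the characters U+FF08, U+FF09 and U+FFA9, could look like this; the name spaced is only illustrative.

```python
import re

# In Python 3 every str is Unicode, so "\uff08" here is one character.
str1 = "heylisten\uff08there is something\uff09to say \uffa9"

# Python 3.3+ understands \uXXXX escapes inside patterns, so a raw string
# can express the character range U+FF00 to U+FFE9 directly.
spaced = re.sub(r"([\uff00-\uffe9])", r" \1 ", str1)

print(spaced)
```

Here the regex matches single characters in a range instead of spelling out the escape text, so no backslash handling is needed.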

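But you don't seem to be replacing anything!!

I don't understand what you mean. I want a space on both sides of each match, but the \ seems to get separated. Your code outputs the following: u'heylisten\uff08there is something to say\uff09'

@Swordy Just use print re.sub(r"(\\uff[0-9a-e][0-9])", r" \1 ", x) directly, where x is your string.

It doesn't work. It outputs 'heylisten\\uff08there is something to say\\uff09\\uffa9'

Yes, I read that. I think I set up my example wrong. It works with yours.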