How to search for a string like \x60\xe2\x4b (indicating an emoticon) using regular expressions in Python; findall() returns an empty list :(


python regex


import re

string="b'@DerkGently @seanferg85 @Umbertobaggio @EL4JC and he already had Popular support.. most people know this already. A\xe2\x80\xa6 '"

print(re.findall(r"\x[0-9a-z]{2}",string))

Not a regex per se, but this may help you anyway:

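def emojis(s):
    # The Unicode Emoticons block is U+1F600..U+1F64F; range() excludes its
    # end value, so the upper bound is 0x1F650 to include U+1F64F.
    return [c for c in s if ord(c) in range(0x1F600, 0x1F650)]

print(emojis("hello world"))

You need to re.compile(ur'A\xe2\x80\xa6', re.UNICODE) (Python 2 syntax).
Compile a Unicode regex and use that pattern for your find, findall, sub, etc.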
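
On Python 3, where str patterns are already Unicode, the compile-once-and-reuse idea might look like the sketch below. It assumes the text has already been decoded to str, and it borrows the Emoticons block range (U+1F600 to U+1F64F) from the emojis function above. The name EMOTICON_RE and the sample text are made up for illustration.

```python
import re

# Hypothetical pattern name; the range is the Unicode Emoticons block.
EMOTICON_RE = re.compile("[\U0001F600-\U0001F64F]")

# Made-up sample text that is already a str (not bytes, not a repr of bytes).
text = "hello world \U0001F600 \U0001F64F"

# The same compiled pattern can be reused for findall, sub, search, etc.
print(EMOTICON_RE.findall(text))                 # the two emoticon characters
print(EMOTICON_RE.sub("", text))                 # the text with emoticons removed
print(EMOTICON_RE.search("no emoticons here"))   # None when nothing matches
```

A character class is used here rather than a byte sequence such as \xe2\x80\xa6, because after decoding each emoticon is a single code point.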
Try this. I joined the string from your question with the one in your title to make the final search string:

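import re

# Raw string: every \xNN below is a literal backslash, an 'x', and two hex digits.
k = r"@DerkGently @seanferg85 @Umbertobaggio @EL4JC and he already had Popular support.. most people know this already. A\xe2\x80\xa6 for a string like \x60\xe2\x4b(indicating a emoticon) using regular expression in python"
print(k)
print()
# Match runs of one or more \xNN escapes. The group is non-capturing, so
# findall returns the whole run as a string rather than a tuple of groups.
for run in re.findall(r"(?:\\x[0-9a-fA-F]{2})+", k):
    print(run)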

Output:
@DerkGently @seanferg85 @Umbertobaggio @EL4JC and he already had Popular support.. most people know this already. A\xe2\x80\xa6 for a string like \x60\xe2\x4b(indicating a emoticon) using regular expression in python

\xe2\x80\xa6
\x60\xe2\x4b
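
If you do go down this route, the matched escape text still has to be turned back into characters. Below is a minimal sketch, assuming each run consists of well-formed \xNN pairs and that the bytes are meant to be UTF-8. decode_escapes is a hypothetical helper introduced here for illustration, and the sample string is shortened from the one above.

```python
import re

def decode_escapes(run):
    # Hypothetical helper: strip the literal "\x" markers, read the remaining
    # hex digits as bytes, then decode those bytes as UTF-8 (an assumption).
    # Invalid sequences become U+FFFD instead of raising an exception.
    raw = bytes.fromhex(run.replace("\\x", ""))
    return raw.decode("utf-8", errors="replace")

sample = r"most people know this already. A\xe2\x80\xa6 "
runs = re.findall(r"(?:\\x[0-9a-fA-F]{2})+", sample)

print(runs)
print([decode_escapes(run) for run in runs])
```

The run \x60\xe2\x4b from the title is not valid UTF-8 (0xe2 must be followed by continuation bytes), which is why the sketch decodes with errors="replace".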

The problem here is that your string is the Python representation of a Python bytes object, which is pretty useless.
Most likely, you have a bytes object, like this:
b = b'@DerkGently @seanferg85 @Umbertobaggio @EL4JC and he already had Popular support.. most people know this already. A\xe2\x80\xa6 '

…and then you converted it to a string, like this:
s = str(b)

Don't do that. Instead, decode it:
s = b.decode('utf-8')

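This gives you the actual characters, which you can then match easily, instead of trying to match characters in the string representation of the bytes representation and then laboriously rebuilding the actual characters from the results.
However, it's worth noting that \xe2\x80\xa6 isn't an emoji; it's a horizontal ellipsis, ….
If that's not what you wanted, then your data was already corrupted before this point.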
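
To make the difference concrete, here is a small sketch comparing str(b) with b.decode('utf-8'). The bytes literal is shortened from the one above, and the Emoticons range U+1F600 to U+1F64F is an assumption about what counts as an emoticon.

```python
import re
import unicodedata

b = b'most people know this already. A\xe2\x80\xa6 '

# str(b) yields the repr text, including the leading b' and literal backslashes.
print(str(b)[:2])

# decode() yields the real characters the bytes encode.
s = b.decode('utf-8')
last = s.rstrip()[-1]
print(unicodedata.name(last))

# Matching is now straightforward: look for the character itself.
print(re.findall('\u2026', s))

# There is no character from the Emoticons block in this text.
print(re.findall('[\U0001F600-\U0001F64F]', s))
```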
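Can you edit your code or add the expected result? How did you obtain that string? It looks like the Python representation of a bytes object was copied and pasted, with quotes added around it to make a string. Those \xe2\x80\xa6 are treated as string escapes, and the \x doesn't actually exist as separate characters. For example, len("\xe2\x80\xa6") is 3; there is no backslash-x to find.
Unless he's on Python 2, his string and pattern are both already Unicode, so Unicode mode is already on. And even if he is on Python 2, this probably won't solve his problem.
@abarnert I don't know. I just browsed around and that's what came up. I'm looking forward to seeing the correct answer.
Thanks, this was helpful :)
@Biswadeep-Roy Thanks. But try to accept whichever answer worked for you.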
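
The comment above about string escapes can be checked directly. In this sketch the variable names are illustrative, and the pattern escapes the backslash (r"\\x...") so that it looks for a literal backslash followed by x.

```python
import re

# Non-raw literal: each \xNN is a string escape and becomes ONE character.
pasted = "A\xe2\x80\xa6"
print(len("\xe2\x80\xa6"))                       # 3
print("\\" in pasted)                            # False: no backslash exists
print(re.findall(r"\\x[0-9a-f]{2}", pasted))     # []

# Raw literal: the backslash, the x, and the hex digits are separate characters.
raw = r"A\xe2\x80\xa6"
print(len(raw))                                  # 13
print(re.findall(r"\\x[0-9a-f]{2}", raw))        # three matches
```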